Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-08 Thread Pranith Kumar Karampuri


On 06/06/2014 11:38 AM, Pranith Kumar Karampuri wrote:


On 06/06/2014 11:37 AM, Anand Avati wrote:




On Thu, Jun 5, 2014 at 10:56 PM, Pranith Kumar Karampuri 
mailto:pkara...@redhat.com>> wrote:



On 06/06/2014 10:47 AM, Pranith Kumar Karampuri wrote:


On 06/06/2014 10:43 AM, Anand Avati wrote:




On Thu, Jun 5, 2014 at 9:54 PM, Pranith Kumar Karampuri
mailto:pkara...@redhat.com>> wrote:


On 06/06/2014 10:02 AM, Anand Avati wrote:

On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri
mailto:pkara...@redhat.com>> wrote:



This sounds a bit complicated. I think there is a
much simpler solution:

- First, make update_refkeeper() check for blocked
locks (which I mentioned as "optional" previously)

- Make grant_blocked_locks() double up and do the job
of update_refkeeper() internally.

Something which looks like this:

diff --git a/xlators/features/locks/src/common.c
b/xlators/features/locks/src/common.c
index f6c71c1..38df385 100644
--- a/xlators/features/locks/src/common.c
+++ b/xlators/features/locks/src/common.c
@@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t
*pl_inode)
   if (!list_empty (&dom->entrylk_list))
 is_empty = 0;
+if (!list_empty (&dom->blocked_entrylks))
+  is_empty = 0;
+
   if (!list_empty (&dom->inodelk_list))
 is_empty = 0;
+
+if (!list_empty (&dom->blocked_inodelks))
+  is_empty = 0;
 }
 return is_empty;
@@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t
*this, pl_inode_t *pl_inode)
 struct list_head granted_list;
 posix_lock_t *tmp = NULL;
 posix_lock_t *lock = NULL;
+ inode_t *unref = NULL;
 INIT_LIST_HEAD (&granted_list);
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __grant_blocked_locks (this, pl_inode, &granted_list);
+
+   if (__pl_inode_is_empty (pl_inode) &&
pl_inode->refkeeper) {
+   unref = pl_inode->refkeeper;
+ pl_inode->refkeeper = NULL;
+   }
 }
 pthread_mutex_unlock (&pl_inode->mutex);
@@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t
*this, pl_inode_t *pl_inode)
   GF_FREE (lock);
 }
+ if (unref)
+ inode_unref (unref);
+
 return;
 }


This should make pl_disconnect_cbk() pretty much race
free w.r.t refkpeer. Thoughts?

Lets say C1 is doing pl_inodelk_client_cleanup. After
the second for-loop(All granted and blocked locks are
out of the domain) if an unlock on the final granted
lock on that inode from client C2 completes, refkeeper
would be set to NULL and unrefed leading to zero refs
on that inode i.e. pl_forget will also happen. In 3rd
for-loop pl_inode is already freed and leads to free'd
memory access and will crash.



We also need:

diff --git a/xlators/features/locks/src/inodelk.c 
b/xlators/features/locks/src/inodelk.c
index c76cb7f..2aceb8a 100644
--- a/xlators/features/locks/src/inodelk.c
+++ b/xlators/features/locks/src/inodelk.c
@@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, 
pl_ctx_t *ctx)
  
 dom = get_domain (pl_inode, l->volume);
  
-   grant_blocked_inode_locks (this, pl_inode, dom);

-
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __pl_inodelk_unref (l);
 }
 pthread_mutex_unlock (&pl_inode->mutex);
+
+   grant_blocked_inode_locks (this, pl_inode, dom);
  }
   
   return 0;


Missed this in the last patch.

It still doesn't solve the problem I described earlier. By
the time it executes this third loop refkeeper is already
unreffed when C2 unlocks.



Right, we will need to introduce an "in_cleanup" counter, if
set pl_update_refkeeper() should not unref. Increment the
in_cleanup() in the first lookup, and decrement it in the last
loop, just before calling grant_blocked_locks() (along with the
patches in my last 2 mails)

s/first lookup/first loop/ ?

Consider the following scenario:
There are two granted locks L1, L2 from C1, C2 clients
respectively on same inode.
C1 gets disconnected.
C2 issues a unlock.

This is the se

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Pranith Kumar Karampuri


On 06/06/2014 11:37 AM, Anand Avati wrote:




On Thu, Jun 5, 2014 at 10:56 PM, Pranith Kumar Karampuri 
mailto:pkara...@redhat.com>> wrote:



On 06/06/2014 10:47 AM, Pranith Kumar Karampuri wrote:


On 06/06/2014 10:43 AM, Anand Avati wrote:




On Thu, Jun 5, 2014 at 9:54 PM, Pranith Kumar Karampuri
mailto:pkara...@redhat.com>> wrote:


On 06/06/2014 10:02 AM, Anand Avati wrote:

On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri
mailto:pkara...@redhat.com>> wrote:



This sounds a bit complicated. I think there is a much
simpler solution:

- First, make update_refkeeper() check for blocked
locks (which I mentioned as "optional" previously)

- Make grant_blocked_locks() double up and do the job
of update_refkeeper() internally.

Something which looks like this:

diff --git a/xlators/features/locks/src/common.c
b/xlators/features/locks/src/common.c
index f6c71c1..38df385 100644
--- a/xlators/features/locks/src/common.c
+++ b/xlators/features/locks/src/common.c
@@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t
*pl_inode)
   if (!list_empty (&dom->entrylk_list))
 is_empty = 0;
+if (!list_empty (&dom->blocked_entrylks))
+  is_empty = 0;
+
   if (!list_empty (&dom->inodelk_list))
 is_empty = 0;
+
+if (!list_empty (&dom->blocked_inodelks))
+  is_empty = 0;
 }
 return is_empty;
@@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t
*this, pl_inode_t *pl_inode)
 struct list_head granted_list;
 posix_lock_t *tmp = NULL;
 posix_lock_t *lock = NULL;
+ inode_t *unref = NULL;
 INIT_LIST_HEAD (&granted_list);
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __grant_blocked_locks (this, pl_inode, &granted_list);
+
+   if (__pl_inode_is_empty (pl_inode) &&
pl_inode->refkeeper) {
+   unref = pl_inode->refkeeper;
+ pl_inode->refkeeper = NULL;
+   }
 }
 pthread_mutex_unlock (&pl_inode->mutex);
@@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t
*this, pl_inode_t *pl_inode)
   GF_FREE (lock);
 }
+ if (unref)
+ inode_unref (unref);
+
 return;
 }


This should make pl_disconnect_cbk() pretty much race
free w.r.t refkpeer. Thoughts?

Lets say C1 is doing pl_inodelk_client_cleanup. After
the second for-loop(All granted and blocked locks are
out of the domain) if an unlock on the final granted
lock on that inode from client C2 completes, refkeeper
would be set to NULL and unrefed leading to zero refs
on that inode i.e. pl_forget will also happen. In 3rd
for-loop pl_inode is already freed and leads to free'd
memory access and will crash.



We also need:

diff --git a/xlators/features/locks/src/inodelk.c 
b/xlators/features/locks/src/inodelk.c
index c76cb7f..2aceb8a 100644
--- a/xlators/features/locks/src/inodelk.c
+++ b/xlators/features/locks/src/inodelk.c
@@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, 
pl_ctx_t *ctx)
  
 dom = get_domain (pl_inode, l->volume);
  
-   grant_blocked_inode_locks (this, pl_inode, dom);

-
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __pl_inodelk_unref (l);
 }
 pthread_mutex_unlock (&pl_inode->mutex);
+
+   grant_blocked_inode_locks (this, pl_inode, dom);
  }
   
   return 0;


Missed this in the last patch.

It still doesn't solve the problem I described earlier. By
the time it executes this third loop refkeeper is already
unreffed when C2 unlocks.



Right, we will need to introduce an "in_cleanup" counter, if set
pl_update_refkeeper() should not unref. Increment the
in_cleanup() in the first lookup, and decrement it in the last
loop, just before calling grant_blocked_locks() (along with the
patches in my last 2 mails)

s/first lookup/first loop/ ?

Consider the following scenario:
There are two granted locks L1, L2 from C1, C2 clients
respectively on same inode.
C1 gets disconnected.
C2 issues a unlock.

This is the sequence of steps:
1) C1 executes first loop, increment

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Anand Avati
On Thu, Jun 5, 2014 at 10:56 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
> On 06/06/2014 10:47 AM, Pranith Kumar Karampuri wrote:
>
>
> On 06/06/2014 10:43 AM, Anand Avati wrote:
>
>
>
>
> On Thu, Jun 5, 2014 at 9:54 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>> On 06/06/2014 10:02 AM, Anand Avati wrote:
>>
>>  On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>  This sounds a bit complicated. I think there is a much simpler
>>> solution:
>>>
>>>  - First, make update_refkeeper() check for blocked locks (which I
>>> mentioned as "optional" previously)
>>>
>>>  - Make grant_blocked_locks() double up and do the job of
>>> update_refkeeper() internally.
>>>
>>>  Something which looks like this:
>>>
>>>   diff --git a/xlators/features/locks/src/common.c
>>> b/xlators/features/locks/src/common.c
>>> index f6c71c1..38df385 100644
>>> --- a/xlators/features/locks/src/common.c
>>> +++ b/xlators/features/locks/src/common.c
>>> @@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t *pl_inode)
>>>  if (!list_empty (&dom->entrylk_list))
>>>  is_empty = 0;
>>>
>>> +if (!list_empty (&dom->blocked_entrylks))
>>> +is_empty = 0;
>>> +
>>>  if (!list_empty (&dom->inodelk_list))
>>>  is_empty = 0;
>>> +
>>> +if (!list_empty (&dom->blocked_inodelks))
>>> +is_empty = 0;
>>>  }
>>>
>>>   return is_empty;
>>> @@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t *this, pl_inode_t
>>> *pl_inode)
>>>  struct list_head granted_list;
>>>  posix_lock_t *tmp = NULL;
>>>  posix_lock_t *lock = NULL;
>>> +   inode_t *unref = NULL;
>>>
>>>  INIT_LIST_HEAD (&granted_list);
>>>
>>>  pthread_mutex_lock (&pl_inode->mutex);
>>>  {
>>>  __grant_blocked_locks (this, pl_inode, &granted_list);
>>> +
>>> +   if (__pl_inode_is_empty (pl_inode) &&
>>> pl_inode->refkeeper) {
>>>  +   unref = pl_inode->refkeeper;
>>> +   pl_inode->refkeeper = NULL;
>>> +   }
>>>  }
>>>  pthread_mutex_unlock (&pl_inode->mutex);
>>>
>>> @@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t *this, pl_inode_t
>>> *pl_inode)
>>>  GF_FREE (lock);
>>>  }
>>>
>>> +   if (unref)
>>> +   inode_unref (unref);
>>> +
>>>  return;
>>>  }
>>>
>>>
>>>  This should make pl_disconnect_cbk() pretty much race free w.r.t
>>> refkpeer. Thoughts?
>>>
>>>  Lets say C1 is doing pl_inodelk_client_cleanup. After the second
>>> for-loop(All granted and blocked locks are out of the domain) if an unlock
>>> on the final granted lock on that inode from client C2 completes, refkeeper
>>> would be set to NULL and unrefed leading to zero refs on that inode i.e.
>>> pl_forget will also happen. In 3rd for-loop pl_inode is already freed and
>>> leads to free'd memory access and will crash.
>>>
>>
>>
>>  We also need:
>>
>>  diff --git a/xlators/features/locks/src/inodelk.c 
>> b/xlators/features/locks/src/inodelk.c
>> index c76cb7f..2aceb8a 100644
>> --- a/xlators/features/locks/src/inodelk.c
>> +++ b/xlators/features/locks/src/inodelk.c
>> @@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, pl_ctx_t 
>> *ctx)
>>
>> dom = get_domain (pl_inode, l->volume);
>>
>> -   grant_blocked_inode_locks (this, pl_inode, dom);
>> -
>> pthread_mutex_lock (&pl_inode->mutex);
>> {
>> __pl_inodelk_unref (l);
>> }
>> pthread_mutex_unlock (&pl_inode->mutex);
>> +
>> +   grant_blocked_inode_locks (this, pl_inode, dom);
>>  }
>>
>>
>>  return 0;
>>
>>  Missed this in the last patch.
>>
>>  It still doesn't solve the problem I described earlier. By the time it
>> executes this third loop refkeeper is already unreffed when C2 unlocks.
>>
>
>
>  Right, we will need to introduce an "in_cleanup" counter, if set
> pl_update_refkeeper() should not unref. Increment the in_cleanup() in the
> first lookup, and decrement it in the last loop, just before calling
> grant_blocked_locks() (along with the patches in my last 2 mails)
>
> s/first lookup/first loop/ ?
>
> Consider the following scenario:
> There are two granted locks L1, L2 from C1, C2 clients respectively on
> same inode.
> C1 gets disconnected.
> C2 issues a unlock.
>
> This is the sequence of steps:
> 1) C1 executes first loop, increments in_cleanup to 1
> 2) C2 executes pl_inode_setlk and removed L2 from granted list. It is now
> just before grant_blocked_inode_locks()
> 3) C1 starts 3rd for loop and unrefs L1, decrements in_cleanup to 0
> 4) C2 executes grant_blocked_inode_locks() and decrements the refkeepr,
> sets it to NULL and unwinds. This destroys the inode so pl_

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Pranith Kumar Karampuri


On 06/06/2014 10:47 AM, Pranith Kumar Karampuri wrote:


On 06/06/2014 10:43 AM, Anand Avati wrote:




On Thu, Jun 5, 2014 at 9:54 PM, Pranith Kumar Karampuri 
mailto:pkara...@redhat.com>> wrote:



On 06/06/2014 10:02 AM, Anand Avati wrote:

On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri
mailto:pkara...@redhat.com>> wrote:



This sounds a bit complicated. I think there is a much
simpler solution:

- First, make update_refkeeper() check for blocked locks
(which I mentioned as "optional" previously)

- Make grant_blocked_locks() double up and do the job of
update_refkeeper() internally.

Something which looks like this:

diff --git a/xlators/features/locks/src/common.c
b/xlators/features/locks/src/common.c
index f6c71c1..38df385 100644
--- a/xlators/features/locks/src/common.c
+++ b/xlators/features/locks/src/common.c
@@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t *pl_inode)
 if (!list_empty (&dom->entrylk_list))
   is_empty = 0;
+  if (!list_empty (&dom->blocked_entrylks))
+is_empty = 0;
+
 if (!list_empty (&dom->inodelk_list))
   is_empty = 0;
+
+  if (!list_empty (&dom->blocked_inodelks))
+is_empty = 0;
 }
 return is_empty;
@@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t *this,
pl_inode_t *pl_inode)
 struct list_head granted_list;
 posix_lock_t *tmp = NULL;
 posix_lock_t *lock = NULL;
+   inode_t *unref = NULL;
 INIT_LIST_HEAD (&granted_list);
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __grant_blocked_locks (this, pl_inode, &granted_list);
+
+ if (__pl_inode_is_empty (pl_inode) && pl_inode->refkeeper) {
+ unref = pl_inode->refkeeper;
+ pl_inode->refkeeper = NULL;
+   }
 }
 pthread_mutex_unlock (&pl_inode->mutex);
@@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t *this,
pl_inode_t *pl_inode)
 GF_FREE (lock);
 }
+   if (unref)
+ inode_unref (unref);
+
 return;
 }


This should make pl_disconnect_cbk() pretty much race free
w.r.t refkpeer. Thoughts?

Lets say C1 is doing pl_inodelk_client_cleanup. After the
second for-loop(All granted and blocked locks are out of the
domain) if an unlock on the final granted lock on that inode
from client C2 completes, refkeeper would be set to NULL and
unrefed leading to zero refs on that inode i.e. pl_forget
will also happen. In 3rd for-loop pl_inode is already freed
and leads to free'd memory access and will crash.



We also need:

diff --git a/xlators/features/locks/src/inodelk.c 
b/xlators/features/locks/src/inodelk.c
index c76cb7f..2aceb8a 100644
--- a/xlators/features/locks/src/inodelk.c
+++ b/xlators/features/locks/src/inodelk.c
@@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, pl_ctx_t 
*ctx)
  
 dom = get_domain (pl_inode, l->volume);
  
-   grant_blocked_inode_locks (this, pl_inode, dom);

-
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __pl_inodelk_unref (l);
 }
 pthread_mutex_unlock (&pl_inode->mutex);
+
+   grant_blocked_inode_locks (this, pl_inode, dom);
  }
   
   return 0;


Missed this in the last patch.

It still doesn't solve the problem I described earlier. By the
time it executes this third loop refkeeper is already unreffed
when C2 unlocks.



Right, we will need to introduce an "in_cleanup" counter, if set 
pl_update_refkeeper() should not unref. Increment the in_cleanup() in 
the first lookup, and decrement it in the last loop, just before 
calling grant_blocked_locks() (along with the patches in my last 2 
mails)

s/first lookup/first loop/ ?

Consider the following scenario:
There are two granted locks L1, L2 from C1, C2 clients respectively on 
same inode.

C1 gets disconnected.
C2 issues a unlock.

This is the sequence of steps:
1) C1 executes first loop, increments in_cleanup to 1
2) C2 executes pl_inode_setlk and removed L2 from granted list. It is 
now just before grant_blocked_inode_locks()

3) C1 starts 3rd for loop and unrefs L1, decrements in_cleanup to 0
4) C2 executes grant_blocked_inode_locks() and decrements the refkeepr, 
sets it to NULL and unwinds. This destroys the inode so pl_inode is freed.

5) C1 calls grant_blocked_inode_locks with pl_inode which is free'd

Pranith


Pranith



___
Gluster-devel mailing list
Gluster-devel@glus

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Pranith Kumar Karampuri


On 06/06/2014 10:43 AM, Anand Avati wrote:




On Thu, Jun 5, 2014 at 9:54 PM, Pranith Kumar Karampuri 
mailto:pkara...@redhat.com>> wrote:



On 06/06/2014 10:02 AM, Anand Avati wrote:

On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri
mailto:pkara...@redhat.com>> wrote:



This sounds a bit complicated. I think there is a much
simpler solution:

- First, make update_refkeeper() check for blocked locks
(which I mentioned as "optional" previously)

- Make grant_blocked_locks() double up and do the job of
update_refkeeper() internally.

Something which looks like this:

diff --git a/xlators/features/locks/src/common.c
b/xlators/features/locks/src/common.c
index f6c71c1..38df385 100644
--- a/xlators/features/locks/src/common.c
+++ b/xlators/features/locks/src/common.c
@@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t *pl_inode)
 if (!list_empty (&dom->entrylk_list))
 is_empty = 0;
+if (!list_empty (&dom->blocked_entrylks))
+  is_empty = 0;
+
 if (!list_empty (&dom->inodelk_list))
 is_empty = 0;
+
+if (!list_empty (&dom->blocked_inodelks))
+  is_empty = 0;
 }
 return is_empty;
@@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t *this,
pl_inode_t *pl_inode)
 struct list_head granted_list;
 posix_lock_t *tmp = NULL;
 posix_lock_t *lock = NULL;
+   inode_t *unref = NULL;
 INIT_LIST_HEAD (&granted_list);
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __grant_blocked_locks (this, pl_inode, &granted_list);
+
+   if (__pl_inode_is_empty (pl_inode) &&
pl_inode->refkeeper) {
+   unref = pl_inode->refkeeper;
+ pl_inode->refkeeper = NULL;
+   }
 }
 pthread_mutex_unlock (&pl_inode->mutex);
@@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t *this,
pl_inode_t *pl_inode)
 GF_FREE (lock);
 }
+   if (unref)
+ inode_unref (unref);
+
 return;
 }


This should make pl_disconnect_cbk() pretty much race free
w.r.t refkpeer. Thoughts?

Lets say C1 is doing pl_inodelk_client_cleanup. After the
second for-loop(All granted and blocked locks are out of the
domain) if an unlock on the final granted lock on that inode
from client C2 completes, refkeeper would be set to NULL and
unrefed leading to zero refs on that inode i.e. pl_forget
will also happen. In 3rd for-loop pl_inode is already freed
and leads to free'd memory access and will crash.



We also need:

diff --git a/xlators/features/locks/src/inodelk.c 
b/xlators/features/locks/src/inodelk.c
index c76cb7f..2aceb8a 100644
--- a/xlators/features/locks/src/inodelk.c
+++ b/xlators/features/locks/src/inodelk.c
@@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, pl_ctx_t 
*ctx)
  
 dom = get_domain (pl_inode, l->volume);
  
-   grant_blocked_inode_locks (this, pl_inode, dom);

-
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __pl_inodelk_unref (l);
 }
 pthread_mutex_unlock (&pl_inode->mutex);
+
+   grant_blocked_inode_locks (this, pl_inode, dom);
  }
   
 return 0;


Missed this in the last patch.

It still doesn't solve the problem I described earlier. By the
time it executes this third loop refkeeper is already unreffed
when C2 unlocks.



Right, we will need to introduce an "in_cleanup" counter, if set 
pl_update_refkeeper() should not unref. Increment the in_cleanup() in 
the first lookup, and decrement it in the last loop, just before 
calling grant_blocked_locks() (along with the patches in my last 2 mails)

s/first lookup/first loop/ ?

Pranith

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Anand Avati
On Thu, Jun 5, 2014 at 9:54 PM, Pranith Kumar Karampuri  wrote:

>
> On 06/06/2014 10:02 AM, Anand Avati wrote:
>
> On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>  This sounds a bit complicated. I think there is a much simpler solution:
>>
>>  - First, make update_refkeeper() check for blocked locks (which I
>> mentioned as "optional" previously)
>>
>>  - Make grant_blocked_locks() double up and do the job of
>> update_refkeeper() internally.
>>
>>  Something which looks like this:
>>
>>   diff --git a/xlators/features/locks/src/common.c
>> b/xlators/features/locks/src/common.c
>> index f6c71c1..38df385 100644
>> --- a/xlators/features/locks/src/common.c
>> +++ b/xlators/features/locks/src/common.c
>> @@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t *pl_inode)
>>  if (!list_empty (&dom->entrylk_list))
>>  is_empty = 0;
>>
>> +if (!list_empty (&dom->blocked_entrylks))
>> +is_empty = 0;
>> +
>>  if (!list_empty (&dom->inodelk_list))
>>  is_empty = 0;
>> +
>> +if (!list_empty (&dom->blocked_inodelks))
>> +is_empty = 0;
>>  }
>>
>>   return is_empty;
>> @@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t *this, pl_inode_t
>> *pl_inode)
>>  struct list_head granted_list;
>>  posix_lock_t *tmp = NULL;
>>  posix_lock_t *lock = NULL;
>> +   inode_t *unref = NULL;
>>
>>  INIT_LIST_HEAD (&granted_list);
>>
>>  pthread_mutex_lock (&pl_inode->mutex);
>>  {
>>  __grant_blocked_locks (this, pl_inode, &granted_list);
>> +
>> +   if (__pl_inode_is_empty (pl_inode) &&
>> pl_inode->refkeeper) {
>>  +   unref = pl_inode->refkeeper;
>> +   pl_inode->refkeeper = NULL;
>> +   }
>>  }
>>  pthread_mutex_unlock (&pl_inode->mutex);
>>
>> @@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t *this, pl_inode_t
>> *pl_inode)
>>  GF_FREE (lock);
>>  }
>>
>> +   if (unref)
>> +   inode_unref (unref);
>> +
>>  return;
>>  }
>>
>>
>>  This should make pl_disconnect_cbk() pretty much race free w.r.t
>> refkpeer. Thoughts?
>>
>>  Lets say C1 is doing pl_inodelk_client_cleanup. After the second
>> for-loop(All granted and blocked locks are out of the domain) if an unlock
>> on the final granted lock on that inode from client C2 completes, refkeeper
>> would be set to NULL and unrefed leading to zero refs on that inode i.e.
>> pl_forget will also happen. In 3rd for-loop pl_inode is already freed and
>> leads to free'd memory access and will crash.
>>
>
>
>  We also need:
>
>  diff --git a/xlators/features/locks/src/inodelk.c 
> b/xlators/features/locks/src/inodelk.c
> index c76cb7f..2aceb8a 100644
> --- a/xlators/features/locks/src/inodelk.c
> +++ b/xlators/features/locks/src/inodelk.c
> @@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, pl_ctx_t 
> *ctx)
>
> dom = get_domain (pl_inode, l->volume);
>
> -   grant_blocked_inode_locks (this, pl_inode, dom);
> -
> pthread_mutex_lock (&pl_inode->mutex);
> {
> __pl_inodelk_unref (l);
> }
> pthread_mutex_unlock (&pl_inode->mutex);
> +
> +   grant_blocked_inode_locks (this, pl_inode, dom);
>  }
>
>
>  return 0;
>
>  Missed this in the last patch.
>
> It still doesn't solve the problem I described earlier. By the time it
> executes this third loop refkeeper is already unreffed when C2 unlocks.
>


Right, we will need to introduce an "in_cleanup" counter, if set
pl_update_refkeeper() should not unref. Increment the in_cleanup() in the
first lookup, and decrement it in the last loop, just before calling
grant_blocked_locks() (along with the patches in my last 2 mails)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Pranith Kumar Karampuri


On 06/06/2014 10:02 AM, Anand Avati wrote:




On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri 
mailto:pkara...@redhat.com>> wrote:




This sounds a bit complicated. I think there is a much simpler
solution:

- First, make update_refkeeper() check for blocked locks (which I
mentioned as "optional" previously)

- Make grant_blocked_locks() double up and do the job of
update_refkeeper() internally.

Something which looks like this:

diff --git a/xlators/features/locks/src/common.c
b/xlators/features/locks/src/common.c
index f6c71c1..38df385 100644
--- a/xlators/features/locks/src/common.c
+++ b/xlators/features/locks/src/common.c
@@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t *pl_inode)
 if (!list_empty (&dom->entrylk_list))
 is_empty = 0;
+if (!list_empty (&dom->blocked_entrylks))
+is_empty = 0;
+
 if (!list_empty (&dom->inodelk_list))
 is_empty = 0;
+
+if (!list_empty (&dom->blocked_inodelks))
+is_empty = 0;
 }
 return is_empty;
@@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t *this,
pl_inode_t *pl_inode)
 struct list_head granted_list;
 posix_lock_t *tmp = NULL;
 posix_lock_t *lock = NULL;
+   inode_t *unref = NULL;
 INIT_LIST_HEAD (&granted_list);
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __grant_blocked_locks (this, pl_inode, &granted_list);
+
+   if (__pl_inode_is_empty (pl_inode) &&
pl_inode->refkeeper) {
+   unref = pl_inode->refkeeper;
+ pl_inode->refkeeper = NULL;
+   }
 }
 pthread_mutex_unlock (&pl_inode->mutex);
@@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t *this,
pl_inode_t *pl_inode)
 GF_FREE (lock);
 }
+   if (unref)
+   inode_unref (unref);
+
 return;
 }


This should make pl_disconnect_cbk() pretty much race free w.r.t
refkpeer. Thoughts?

Lets say C1 is doing pl_inodelk_client_cleanup. After the second
for-loop(All granted and blocked locks are out of the domain) if
an unlock on the final granted lock on that inode from client C2
completes, refkeeper would be set to NULL and unrefed leading to
zero refs on that inode i.e. pl_forget will also happen. In 3rd
for-loop pl_inode is already freed and leads to free'd memory
access and will crash.



We also need:

diff --git a/xlators/features/locks/src/inodelk.c 
b/xlators/features/locks/src/inodelk.c
index c76cb7f..2aceb8a 100644
--- a/xlators/features/locks/src/inodelk.c
+++ b/xlators/features/locks/src/inodelk.c
@@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, pl_ctx_t *ctx)
  
 dom = get_domain (pl_inode, l->volume);
  
-   grant_blocked_inode_locks (this, pl_inode, dom);

-
 pthread_mutex_lock (&pl_inode->mutex);
 {
 __pl_inodelk_unref (l);
 }
 pthread_mutex_unlock (&pl_inode->mutex);
+
+   grant_blocked_inode_locks (this, pl_inode, dom);
  }
   
 return 0;


Missed this in the last patch.
It still doesn't solve the problem I described earlier. By the time it 
executes this third loop refkeeper is already unreffed when C2 unlocks.


Pranith

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Anand Avati
On Thu, Jun 5, 2014 at 7:52 PM, Pranith Kumar Karampuri  wrote:

>
>  This sounds a bit complicated. I think there is a much simpler solution:
>
>  - First, make update_refkeeper() check for blocked locks (which I
> mentioned as "optional" previously)
>
>  - Make grant_blocked_locks() double up and do the job of
> update_refkeeper() internally.
>
>  Something which looks like this:
>
>   diff --git a/xlators/features/locks/src/common.c
> b/xlators/features/locks/src/common.c
> index f6c71c1..38df385 100644
> --- a/xlators/features/locks/src/common.c
> +++ b/xlators/features/locks/src/common.c
> @@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t *pl_inode)
>  if (!list_empty (&dom->entrylk_list))
>  is_empty = 0;
>
> +if (!list_empty (&dom->blocked_entrylks))
> +is_empty = 0;
> +
>  if (!list_empty (&dom->inodelk_list))
>  is_empty = 0;
> +
> +if (!list_empty (&dom->blocked_inodelks))
> +is_empty = 0;
>  }
>
>   return is_empty;
> @@ -944,12 +950,18 @@ grant_blocked_locks (xlator_t *this, pl_inode_t
> *pl_inode)
>  struct list_head granted_list;
>  posix_lock_t *tmp = NULL;
>  posix_lock_t *lock = NULL;
> +   inode_t *unref = NULL;
>
>  INIT_LIST_HEAD (&granted_list);
>
>  pthread_mutex_lock (&pl_inode->mutex);
>  {
>  __grant_blocked_locks (this, pl_inode, &granted_list);
> +
> +   if (__pl_inode_is_empty (pl_inode) && pl_inode->refkeeper)
> {
>  +   unref = pl_inode->refkeeper;
> +   pl_inode->refkeeper = NULL;
> +   }
>  }
>  pthread_mutex_unlock (&pl_inode->mutex);
>
> @@ -965,6 +977,9 @@ grant_blocked_locks (xlator_t *this, pl_inode_t
> *pl_inode)
>  GF_FREE (lock);
>  }
>
> +   if (unref)
> +   inode_unref (unref);
> +
>  return;
>  }
>
>
>  This should make pl_disconnect_cbk() pretty much race free w.r.t
> refkpeer. Thoughts?
>
> Lets say C1 is doing pl_inodelk_client_cleanup. After the second
> for-loop(All granted and blocked locks are out of the domain) if an unlock
> on the final granted lock on that inode from client C2 completes, refkeeper
> would be set to NULL and unrefed leading to zero refs on that inode i.e.
> pl_forget will also happen. In 3rd for-loop pl_inode is already freed and
> leads to free'd memory access and will crash.
>


We also need:

diff --git a/xlators/features/locks/src/inodelk.c
b/xlators/features/locks/src/inodelk.c
index c76cb7f..2aceb8a 100644
--- a/xlators/features/locks/src/inodelk.c
+++ b/xlators/features/locks/src/inodelk.c
@@ -494,13 +494,13 @@ pl_inodelk_client_cleanup (xlator_t *this, pl_ctx_t *ctx)

dom = get_domain (pl_inode, l->volume);

-   grant_blocked_inode_locks (this, pl_inode, dom);
-
pthread_mutex_lock (&pl_inode->mutex);
{
__pl_inodelk_unref (l);
}
pthread_mutex_unlock (&pl_inode->mutex);
+
+   grant_blocked_inode_locks (this, pl_inode, dom);
 }


 return 0;

Missed this in the last patch.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Pranith Kumar Karampuri


On 06/06/2014 12:11 AM, Anand Avati wrote:


On Thu, Jun 5, 2014 at 10:46 AM, Krutika Dhananjay 
mailto:kdhan...@redhat.com>> wrote:


To summarize, the real "problems" are:

- Deref of pl_inode->refkeeper as an inode_t in the cleanup
logger. It
is an internal artifact of pl_update_refkeeper() working and
nobody else
is supposed to "touch" it.

For this, the solution I have in mind is to
a. have a placeholder for gfid in pl_inode_t object,
b. remember the gfid of the inode at the time of pl_inode_t
creation in pl_ctx_get(), and
c. print pl_inode->gfid in the log message in
pl_{inodelk,entrylk}_log_cleanup().


OK.

- Missing call to pl_update_refkeeper() in
client_disconnect_cbk(). This
can result in a potential leak of inode ref (not a leak if the
same
inode gets any locking activity by another client in the future).

As far as updates to refkeeper is concerned during DISCONNECT is
concerned, Pranith and I did have discussions at length but could
not find a single safe place in cleanup functions to call
pl_update_refkeeper() that does not lead to illegal memory access,
whether before or after the call to __pl_inodelk_unref() in the
third for loop.

Case #1 explained:
=


list_for_each_entry_safe() {
...
...
pl_update_refkeeper(inode);
pthread_mutex_lock (&pl_inode->mutex);
 __pl_inodelk_unref(l);
pthread_mutex_unlock (&pl_inode->mutex);


This causes the last unref in pl_update_refkeeper() to implicitly
call pl_forget() where pl_inode_t object is destroyed while it is
still needed in terms of having to obtain pl_inode->mutex before
unrefing the lock object.

Case 2 explained:
=

list_for_each_entry_safe() {
...
...
pthread_mutex_lock (&pl_inode->mutex);
 __pl_inodelk_unref(l);
pthread_mutex_unlock (&pl_inode->mutex);
pl_update_refkeeper(inode);


After the first for loop is processed, the domain's lists could
have been emptied. And at this point, there could well come a
second thread that could update refkeeper,
triggering last unref on the inode leading to a call to
pl_forget() (where pl_inode_t is freed), after which the
DISCONNECT thread would be trying to perform illegal
memory access in its call to pl_update_refkeeper() during its turn.

So the solution Pranith and I came up with involves making sure
that the inode_t object is alive for as long as there is atleast
one lock on it:

1. refkeeper will not be used for inodelks and entrylks (but will
continue to be used in fcntl locks).
2. Each pl_inode_t object will also host an inode_t member which
is initialised during a call to pl_inode_get() in
pl_common_{inodelk, entrylks}().
3. Everytime a lock is granted/blocked, pl_inode->inode is ref'd
(in pl_inode_setlk() after successful locking.
4. Everytime a lock is unlocked, pl_inode->inode is unref'd.
5. In disconnect codepath, as part of "released" list processing,
the inodes associated with the client are unref'd in the same loop
right after every lock object is unref'd.

With all this, there is one problem with the lock object that is
found to be in both blocked and granted list in the transient
state, when there's a race between the unlocking thread
and the disconnecting thread. This can be best explained with the
following example:

Consider an inode I on which there's a granted lock G and a
blocked lock B, from clients C1 and C2 respectively. With this
approach, at this point the number of refs taken
by the locks xlator on I would be 2.
C1 unlocks B, at which point I's refcount becomes 1.
Now as part of granting other blocked locks in unlock codepath, B
is put in granted list.
Now B's client C2 sends a DISCONNECT event, which would cause I to
be unref'd again. This being the last unref, pl_forget() is called
on I causing its pl_inode to be destroyed
At this point, the unlocking thread could try to do a mutex lock
on pl_inode->mutex in grant_blocked_{inode,entry}_locks() (whereas
pl_inode is now already destroyed) leading to a crash.


The problem described in the example can be fixed by making sure
grant_blocked_{inode,entry}_locks() refs the inode first thing and
then unrefs it just before returning.
This would fix illegal memory access.


This sounds a bit complicated. I think there is a much simpler solution:

- First, make update_refkeeper() check for blocked locks (which I 
mentioned as "optional" previously)


- Make grant_blocked_locks() double up and do the job of 
update_refkeeper() internally.


Something which looks like this:

diff --git a/xlators/features/locks/src/common.c 
b/xlators/features/locks/src/common.c

index f6c71c1..38df385 10064

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Anand Avati
On Thu, Jun 5, 2014 at 10:46 AM, Krutika Dhananjay 
wrote:

> To summarize, the real "problems" are:
>
> - Deref of pl_inode->refkeeper as an inode_t in the cleanup logger. It
> is an internal artifact of pl_update_refkeeper() working and nobody else
> is supposed to "touch" it.
>
> For this, the solution I have in mind is to
> a. have a placeholder for gfid in pl_inode_t object,
> b. remember the gfid of the inode at the time of pl_inode_t creation in
> pl_ctx_get(), and
> c. print pl_inode->gfid in the log message in
> pl_{inodelk,entrylk}_log_cleanup().
>

OK.

- Missing call to pl_update_refkeeper() in client_disconnect_cbk(). This
> can result in a potential leak of inode ref (not a leak if the same
> inode gets any locking activity by another client in the future).
>
> As far as updates to refkeeper is concerned during DISCONNECT is
> concerned, Pranith and I did have discussions at length but could
> not find a single safe place in cleanup functions to call
> pl_update_refkeeper() that does not lead to illegal memory access,
> whether before or after the call to __pl_inodelk_unref() in the third for
> loop.
>
> Case #1 explained:
> =
>
> 
> list_for_each_entry_safe() {
> ...
> ...
> pl_update_refkeeper(inode);
> pthread_mutex_lock (&pl_inode->mutex);
>  __pl_inodelk_unref(l);
> pthread_mutex_unlock (&pl_inode->mutex);
> 
>
> This causes the last unref in pl_update_refkeeper() to implicitly call
> pl_forget() where pl_inode_t object is destroyed while it is
> still needed in terms of having to obtain pl_inode->mutex before unrefing
> the lock object.
>
> Case 2 explained:
> =
> 
> list_for_each_entry_safe() {
> ...
> ...
> pthread_mutex_lock (&pl_inode->mutex);
>  __pl_inodelk_unref(l);
> pthread_mutex_unlock (&pl_inode->mutex);
> pl_update_refkeeper(inode);
> 
>
> After the first for loop is processed, the domain's lists could have been
> emptied. And at this point, there could well come a second thread that
> could update refkeeper,
> triggering last unref on the inode leading to a call to pl_forget() (where
> pl_inode_t is freed), after which the DISCONNECT thread would be trying to
> perform illegal
> memory access in its call to pl_update_refkeeper() during its turn.
>
> So the solution Pranith and I came up with involves making sure that the
> inode_t object is alive for as long as there is atleast one lock on it:
>
> 1. refkeeper will not be used for inodelks and entrylks (but will continue
> to be used in fcntl locks).
> 2. Each pl_inode_t object will also host an inode_t member which is
> initialised during a call to pl_inode_get() in pl_common_{inodelk,
> entrylks}().
> 3. Everytime a lock is granted/blocked, pl_inode->inode is ref'd (in
> pl_inode_setlk() after successful locking.
> 4. Everytime a lock is unlocked, pl_inode->inode is unref'd.
> 5. In disconnect codepath, as part of "released" list processing, the
> inodes associated with the client are unref'd in the same loop right after
> every lock object is unref'd.
>
> With all this, there is one problem with the lock object that is found to
> be in both blocked and granted list in the transient state, when there's a
> race between the unlocking thread
> and the disconnecting thread. This can be best explained with the
> following example:
> 
> Consider an inode I on which there's a granted lock G and a blocked lock
> B, from clients C1 and C2 respectively. With this approach, at this point
> the number of refs taken
> by the locks xlator on I would be 2.
> C1 unlocks B, at which point I's refcount becomes 1.
> Now as part of granting other blocked locks in unlock codepath, B is put
> in granted list.
> Now B's client C2 sends a DISCONNECT event, which would cause I to be
> unref'd again. This being the last unref, pl_forget() is called on I
> causing its pl_inode to be destroyed
> At this point, the unlocking thread could try to do a mutex lock on
> pl_inode->mutex in grant_blocked_{inode,entry}_locks() (whereas pl_inode is
> now already destroyed) leading to a crash.
> 
>
> The problem described in the example can be fixed by making sure
> grant_blocked_{inode,entry}_locks() refs the inode first thing and then
> unrefs it just before returning.
> This would fix illegal memory access.
>

This sounds a bit complicated. I think there is a much simpler solution:

- First, make update_refkeeper() check for blocked locks (which I mentioned
as "optional" previously)

- Make grant_blocked_locks() double up and do the job of update_refkeeper()
internally.

Something which looks like this:

diff --git a/xlators/features/locks/src/common.c
b/xlators/features/locks/src/common.c
index f6c71c1..38df385 100644
--- a/xlators/features/locks/src/common.c
+++ b/xlators/features/locks/src/common.c
@@ -126,8 +126,14 @@ __pl_inode_is_empty (pl_inode_t *pl_inode)
 if (!list_empty (&dom->entrylk_list))
 is_empty = 0;

+if (!list_empty (&dom->blocked_entrylks))
+

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Krutika Dhananjay
- Original Message -

> From: "Krutika Dhananjay" 
> To: "Anand Avati" 
> Cc: gluster-devel@gluster.org
> Sent: Thursday, June 5, 2014 11:16:33 PM
> Subject: Re: [Gluster-devel] Regarding doing away with refkeeper in locks
> xlator

> - Original Message -

> > From: "Anand Avati" 
> 
> > To: "Krutika Dhananjay" , "Pranith Kumar Karampuri"
> > 
> 
> > Cc: gluster-devel@gluster.org
> 
> > Sent: Thursday, June 5, 2014 12:38:41 PM
> 
> > Subject: Re: [Gluster-devel] Regarding doing away with refkeeper in locks
> > xlator
> 

> > On 6/4/14, 9:43 PM, Krutika Dhananjay wrote:
> 
> > >
> 
> > >
> 
> > > 
> 
> > >
> 
> > > *From: *"Pranith Kumar Karampuri" 
> 
> > > *To: *"Krutika Dhananjay" , "Anand Avati"
> 
> > > 
> 
> > > *Cc: *gluster-devel@gluster.org
> 
> > > *Sent: *Wednesday, June 4, 2014 12:23:59 PM
> 
> > > *Subject: *Re: [Gluster-devel] Regarding doing away with refkeeper
> 
> > > in locks xlator
> 
> > >
> 
> > >
> 
> > > On 06/04/2014 12:02 PM, Pranith Kumar Karampuri wrote:
> 
> > >
> 
> > >
> 
> > > On 06/04/2014 11:37 AM, Krutika Dhananjay wrote:
> 
> > >
> 
> > > Hi,
> 
> > >
> 
> > > Recently there was a crash in locks translator (BZ 1103347,
> 
> > > BZ 1097102) with the following backtrace:
> 
> > > (gdb) bt
> 
> > > #0 uuid_unpack (in=0x8 ,
> 
> > > uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44
> 
> > > #1 0x7feeba9e19d6 in uuid_unparse_x (uu= 
> > > optimized out>, out=0x2350fc0
> 
> > > "081bbc7a-7551-44ac-85c7-aad5e2633db9",
> 
> > > fmt=0x7feebaa08e00
> 
> > > "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at
> 
> > > ../../contrib/uuid/unparse.c:55
> 
> > > #2 0x7feeba9be837 in uuid_utoa (uuid=0x8  
> > > out of bounds>) at common-utils.c:2138
> 
> > > #3 0x7feeb06e8a58 in pl_inodelk_log_cleanup
> 
> > > (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:396
> 
> > > #4 pl_inodelk_client_cleanup (this=0x230d910,
> 
> > > ctx=0x7fee700f0c60) at inodelk.c:428
> 
> > > #5 0x7feeb06ddf3a in pl_client_disconnect_cbk
> 
> > > (this=0x230d910, client=) at posix.c:2550
> 
> > > #6 0x7feeba9fa2dd in gf_client_disconnect
> 
> > > (client=0x27724a0) at client_t.c:368
> 
> > > #7 0x7feeab77ed48 in server_connection_cleanup
> 
> > > (this=0x2316390, client=0x27724a0, flags= 
> > > out>) at server-helpers.c:354
> 
> > > #8 0x7feeab77ae2c in server_rpc_notify (rpc= 
> > > optimized out>, xl=0x2316390, event=,
> 
> > > data=0x2bf51c0) at server.c:527
> 
> > > #9 0x7feeba775155 in rpcsvc_handle_disconnect
> 
> > > (svc=0x2325980, trans=0x2bf51c0) at rpcsvc.c:720
> 
> > > #10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0,
> 
> > > mydata=, event=,
> 
> > > data=0x2bf51c0) at rpcsvc.c:758
> 
> > > #11 0x7feeba778638 in rpc_transport_notify (this= 
> > > optimized out>, event=, data= 
> > > optimized out>) at rpc-transport.c:512
> 
> > > #12 0x7feeb115e971 in socket_event_poll_err (fd= 
> > > optimized out>, idx=, data=0x2bf51c0,
> 
> > > poll_in=, poll_out=0,
> 
> > > poll_err=0) at socket.c:1071
> 
> > > #13 socket_event_handler (fd=,
> 
> > > idx=, data=0x2bf51c0, poll_in= 
> > > optimized out>, poll_out=0, poll_err=0) at socket.c:2240
> 
> > > #14 0x7feeba9fc6a7 in event_dispatch_epoll_handler
> 
> > > (event_pool=0x22e2d00) at event-epoll.c:384
> 
> > > #15 event_dispatch_epoll (event_pool=0x22e2d00) at
> 
> > > event-epoll.c:445
> 
> > > #16 0x00407e93 in main (argc=19,
> 
> > > argv=0x7fffea6c7f88) at glusterfsd.c:2023
> 
> > > (gdb) f 4
> 
> > > #4 pl_inodelk_client_cleanup (this=0x230d910,
> 
> > > ctx=0x7fee700f0c60) at inodelk.c:428
> 
> > > 428 pl_inodelk_log_cleanup (l);
> 
> > > (gdb) p l->pl_inode->refkeeper
> 
> > > $1 = (inode_t *) 0x0
> 
> > > (gdb)
> 
> > >
> 
> > > pl_inode->refkeeper was found to be NULL even when there
> 
> &

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Krutika Dhananjay


- Original Message -


From: "Anand Avati"  
To: "Krutika Dhananjay" , "Pranith Kumar Karampuri" 
 
Cc: gluster-devel@gluster.org 
Sent: Thursday, June 5, 2014 12:38:41 PM 
Subject: Re: [Gluster-devel] Regarding doing away with refkeeper in locks 
xlator 

On 6/4/14, 9:43 PM, Krutika Dhananjay wrote: 
> 
> 
>  
> 
> *From: *"Pranith Kumar Karampuri"  
> *To: *"Krutika Dhananjay" , "Anand Avati" 
>  
> *Cc: *gluster-devel@gluster.org 
> *Sent: *Wednesday, June 4, 2014 12:23:59 PM 
> *Subject: *Re: [Gluster-devel] Regarding doing away with refkeeper 
> in locks xlator 
> 
> 
> On 06/04/2014 12:02 PM, Pranith Kumar Karampuri wrote: 
> 
> 
> On 06/04/2014 11:37 AM, Krutika Dhananjay wrote: 
> 
> Hi, 
> 
> Recently there was a crash in locks translator (BZ 1103347, 
> BZ 1097102) with the following backtrace: 
> (gdb) bt 
> #0 uuid_unpack (in=0x8 , 
> uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44 
> #1 0x7feeba9e19d6 in uuid_unparse_x (uu= optimized out>, out=0x2350fc0 
> "081bbc7a-7551-44ac-85c7-aad5e2633db9", 
> fmt=0x7feebaa08e00 
> "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at 
> ../../contrib/uuid/unparse.c:55 
> #2 0x7feeba9be837 in uuid_utoa (uuid=0x8  out of bounds>) at common-utils.c:2138 
> #3 0x7feeb06e8a58 in pl_inodelk_log_cleanup 
> (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:396 
> #4 pl_inodelk_client_cleanup (this=0x230d910, 
> ctx=0x7fee700f0c60) at inodelk.c:428 
> #5 0x7feeb06ddf3a in pl_client_disconnect_cbk 
> (this=0x230d910, client=) at posix.c:2550 
> #6 0x7feeba9fa2dd in gf_client_disconnect 
> (client=0x27724a0) at client_t.c:368 
> #7 0x7feeab77ed48 in server_connection_cleanup 
> (this=0x2316390, client=0x27724a0, flags= out>) at server-helpers.c:354 
> #8 0x7feeab77ae2c in server_rpc_notify (rpc= optimized out>, xl=0x2316390, event=, 
> data=0x2bf51c0) at server.c:527 
> #9 0x7feeba775155 in rpcsvc_handle_disconnect 
> (svc=0x2325980, trans=0x2bf51c0) at rpcsvc.c:720 
> #10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, 
> mydata=, event=, 
> data=0x2bf51c0) at rpcsvc.c:758 
> #11 0x7feeba778638 in rpc_transport_notify (this= optimized out>, event=, data= optimized out>) at rpc-transport.c:512 
> #12 0x7feeb115e971 in socket_event_poll_err (fd= optimized out>, idx=, data=0x2bf51c0, 
> poll_in=, poll_out=0, 
> poll_err=0) at socket.c:1071 
> #13 socket_event_handler (fd=, 
> idx=, data=0x2bf51c0, poll_in= optimized out>, poll_out=0, poll_err=0) at socket.c:2240 
> #14 0x7feeba9fc6a7 in event_dispatch_epoll_handler 
> (event_pool=0x22e2d00) at event-epoll.c:384 
> #15 event_dispatch_epoll (event_pool=0x22e2d00) at 
> event-epoll.c:445 
> #16 0x00407e93 in main (argc=19, 
> argv=0x7fffea6c7f88) at glusterfsd.c:2023 
> (gdb) f 4 
> #4 pl_inodelk_client_cleanup (this=0x230d910, 
> ctx=0x7fee700f0c60) at inodelk.c:428 
> 428 pl_inodelk_log_cleanup (l); 
> (gdb) p l->pl_inode->refkeeper 
> $1 = (inode_t *) 0x0 
> (gdb) 
> 
> pl_inode->refkeeper was found to be NULL even when there 
> were some blocked inodelks in a certain domain of the inode, 
> which when dereferenced by the epoll thread in the cleanup 
> codepath led to a crash. 
> 
> On inspecting the code (for want of a consistent 
> reproducer), three things were found: 
> 
> 1. The function where the crash happens 
> (pl_inodelk_log_cleanup()), makes an attempt to resolve the 
> inode to path as can be seen below. But the way inode_path() 
> itself 
> works is to first construct the path based on the given 
> inode's ancestry and place it in the buffer provided. And if 
> all else fails, the gfid of the inode is placed in a certain 
> format (""). 
> This eliminates the need for statements from line 4 
> through 7 below, thereby "preventing" dereferencing of 
> pl_inode->refkeeper. 
> Now, although this change prevents the crash 
> altogether, it still does not fix the race that led to 
> pl_inode->refkeeper becoming NULL, and comes at the cost of 
> printing "(null)" in the log message on line 9 every 
> time pl_inode->refkeeper is found to be NULL, rendering the 
> logged messages somewhat useless. 
> 
>  
> 0 pl_inode = lock->pl_inode; 
> 1 
> 2 inode_path (pl_inode->refkeeper, NULL, &path); 
> 3 
> 4 if (path) 
> 5 file = path; 
> 6 else 
> 7 file = uuid_utoa 
> (pl_inode->refkeeper->gfid); 
> 8 
> 9 gf_log (THIS->name, GF_LOG_WARNING, 
> 10 "releasing lock on %s held by &

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Anand Avati

On 6/4/14, 9:43 PM, Krutika Dhananjay wrote:





*From: *"Pranith Kumar Karampuri" 
*To: *"Krutika Dhananjay" , "Anand Avati"

*Cc: *gluster-devel@gluster.org
*Sent: *Wednesday, June 4, 2014 12:23:59 PM
    *Subject: *Re: [Gluster-devel] Regarding doing away with refkeeper
in locks xlator


On 06/04/2014 12:02 PM, Pranith Kumar Karampuri wrote:


On 06/04/2014 11:37 AM, Krutika Dhananjay wrote:

Hi,

Recently there was a crash in locks translator (BZ 1103347,
BZ 1097102) with the following backtrace:
(gdb) bt
#0  uuid_unpack (in=0x8 ,
uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44
#1  0x7feeba9e19d6 in uuid_unparse_x (uu=, out=0x2350fc0
"081bbc7a-7551-44ac-85c7-aad5e2633db9",
 fmt=0x7feebaa08e00
"%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at
../../contrib/uuid/unparse.c:55
#2  0x7feeba9be837 in uuid_utoa (uuid=0x8 ) at common-utils.c:2138
#3  0x7feeb06e8a58 in pl_inodelk_log_cleanup
(this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:396
#4  pl_inodelk_client_cleanup (this=0x230d910,
ctx=0x7fee700f0c60) at inodelk.c:428
#5  0x7feeb06ddf3a in pl_client_disconnect_cbk
(this=0x230d910, client=) at posix.c:2550
#6  0x7feeba9fa2dd in gf_client_disconnect
(client=0x27724a0) at client_t.c:368
#7  0x7feeab77ed48 in server_connection_cleanup
(this=0x2316390, client=0x27724a0, flags=) at server-helpers.c:354
#8  0x7feeab77ae2c in server_rpc_notify (rpc=, xl=0x2316390, event=,
data=0x2bf51c0) at server.c:527
#9  0x7feeba775155 in rpcsvc_handle_disconnect
(svc=0x2325980, trans=0x2bf51c0) at rpcsvc.c:720
#10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0,
mydata=, event=,
data=0x2bf51c0) at rpcsvc.c:758
#11 0x7feeba778638 in rpc_transport_notify (this=, event=, data=) at rpc-transport.c:512
#12 0x7feeb115e971 in socket_event_poll_err (fd=, idx=, data=0x2bf51c0,
poll_in=, poll_out=0,
 poll_err=0) at socket.c:1071
#13 socket_event_handler (fd=,
idx=, data=0x2bf51c0, poll_in=, poll_out=0, poll_err=0) at socket.c:2240
#14 0x7feeba9fc6a7 in event_dispatch_epoll_handler
(event_pool=0x22e2d00) at event-epoll.c:384
#15 event_dispatch_epoll (event_pool=0x22e2d00) at
event-epoll.c:445
#16 0x00407e93 in main (argc=19,
argv=0x7fffea6c7f88) at glusterfsd.c:2023
(gdb) f 4
#4  pl_inodelk_client_cleanup (this=0x230d910,
ctx=0x7fee700f0c60) at inodelk.c:428
428pl_inodelk_log_cleanup (l);
(gdb) p l->pl_inode->refkeeper
$1 = (inode_t *) 0x0
(gdb)

pl_inode->refkeeper was found to be NULL even when there
were some blocked inodelks in a certain domain of the inode,
which when dereferenced by the epoll thread in the cleanup
codepath led to a crash.

On inspecting the code (for want of a consistent
reproducer), three things were found:

1. The function where the crash happens
(pl_inodelk_log_cleanup()), makes an attempt to resolve the
inode to path as can be seen below. But the way inode_path()
itself
 works is to first construct the path based on the given
inode's ancestry and place it in the buffer provided. And if
all else fails, the gfid of the inode is placed in a certain
format ("").
 This eliminates the need for statements from line 4
through 7 below, thereby "preventing" dereferencing of
pl_inode->refkeeper.
 Now, although this change prevents the crash
altogether, it still does not fix the race that led to
pl_inode->refkeeper becoming NULL, and comes at the cost of
 printing "(null)" in the log message on line 9 every
time pl_inode->refkeeper is found to be NULL, rendering the
logged messages somewhat useless.


   0 pl_inode = lock->pl_inode;
   1
   2 inode_path (pl_inode->refkeeper, NULL, &path);
   3
   4 if (path)
   5 file = path;
   6 else
   7 file = uuid_utoa
(pl_ino

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-05 Thread Anand Avati

On 6/3/14, 11:32 PM, Pranith Kumar Karampuri wrote:


On 06/04/2014 11:37 AM, Krutika Dhananjay wrote:

Hi,

Recently there was a crash in locks translator (BZ 1103347, BZ
1097102) with the following backtrace:
(gdb) bt
#0  uuid_unpack (in=0x8 ,
uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44
#1  0x7feeba9e19d6 in uuid_unparse_x (uu=,
out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9",
fmt=0x7feebaa08e00
"%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at
../../contrib/uuid/unparse.c:55
#2  0x7feeba9be837 in uuid_utoa (uuid=0x8 ) at common-utils.c:2138
#3  0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910,
ctx=0x7fee700f0c60) at inodelk.c:396
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at
inodelk.c:428
#5  0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910,
client=) at posix.c:2550
#6  0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0) at
client_t.c:368
#7  0x7feeab77ed48 in server_connection_cleanup (this=0x2316390,
client=0x27724a0, flags=) at server-helpers.c:354
#8  0x7feeab77ae2c in server_rpc_notify (rpc=, xl=0x2316390, event=, data=0x2bf51c0) at
server.c:527
#9  0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980,
trans=0x2bf51c0) at rpcsvc.c:720
#10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0,
mydata=, event=,
data=0x2bf51c0) at rpcsvc.c:758
#11 0x7feeba778638 in rpc_transport_notify (this=, event=, data=) at
rpc-transport.c:512
#12 0x7feeb115e971 in socket_event_poll_err (fd=, idx=, data=0x2bf51c0, poll_in=, poll_out=0,
poll_err=0) at socket.c:1071
#13 socket_event_handler (fd=, idx=, data=0x2bf51c0, poll_in=,
poll_out=0, poll_err=0) at socket.c:2240
#14 0x7feeba9fc6a7 in event_dispatch_epoll_handler
(event_pool=0x22e2d00) at event-epoll.c:384
#15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445
#16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at
glusterfsd.c:2023
(gdb) f 4
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at
inodelk.c:428
428pl_inodelk_log_cleanup (l);
(gdb) p l->pl_inode->refkeeper
$1 = (inode_t *) 0x0
(gdb)

pl_inode->refkeeper was found to be NULL even when there were some
blocked inodelks in a certain domain of the inode,
which when dereferenced by the epoll thread in the cleanup codepath
led to a crash.

On inspecting the code (for want of a consistent reproducer), three
things were found:

1. The function where the crash happens (pl_inodelk_log_cleanup()),
makes an attempt to resolve the inode to path as can be seen below.
But the way inode_path() itself
works is to first construct the path based on the given inode's
ancestry and place it in the buffer provided. And if all else fails,
the gfid of the inode is placed in a certain format ("").
This eliminates the need for statements from line 4 through 7
below, thereby "preventing" dereferencing of pl_inode->refkeeper.
Now, although this change prevents the crash altogether, it still
does not fix the race that led to pl_inode->refkeeper becoming NULL,
and comes at the cost of
printing "(null)" in the log message on line 9 every time
pl_inode->refkeeper is found to be NULL, rendering the logged messages
somewhat useless.


  0 pl_inode = lock->pl_inode;
  1
  2 inode_path (pl_inode->refkeeper, NULL, &path);
  3
  4 if (path)
  5 file = path;
  6 else
  7 file = uuid_utoa (pl_inode->refkeeper->gfid);
8
  9 gf_log (THIS->name, GF_LOG_WARNING,
 10 "releasing lock on %s held by "
 11 "{client=%p, pid=%"PRId64" lk-owner=%s}",
 12 file, lock->client, (uint64_t) lock->client_pid,
 13 lkowner_utoa (&lock->owner));
<\code>

I think this logging code is from the days when gfid handle concept was
not there. So it wasn't returning  in cases the path is
not present in the dentries. I believe the else block can be deleted
safely now.

Pranith


2. There is at least one codepath found that can lead to this crash:
Imagine an inode on which an inodelk operation is attempted by a
client and is successfully granted too.
   Now, between the time the lock was granted and
pl_update_refkeeper() was called by this thread, the client could send
a DISCONNECT event,
   causing cleanup codepath to be executed, where the epoll thread
crashes on dereferencing pl_inode->refkeeper which is STILL NULL at
this point.

   Besides, there are still places in locks xlator where the refkeeper
is NOT updated whenever the lists are modified - for instance in the
cleanup codepath from a DISCONNECT.

3. Also, pl_update_refkeeper() seems to be not taking into account
blocked locks on the inode in the __pl_inode_is_empty() check, when it
should, as there could still be cases
where the granted list could be empty but not the blocked list at
the time of udpating the refkeeper, in which case pl_inode must still
take ref on the inode.


Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-04 Thread Anand Avati

On 6/3/14, 11:07 PM, Krutika Dhananjay wrote:

Hi,

Recently there was a crash in locks translator (BZ 1103347, BZ 1097102)
with the following backtrace:
(gdb) bt
#0  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=0x7fffea6c6a60)
    at ../../contrib/uuid/unpack.c:44
#1  0x7feeba9e19d6 in uuid_unparse_x (uu=<value optimized out>,
    out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9",
    fmt=0x7feebaa08e00 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x")
    at ../../contrib/uuid/unparse.c:55
#2  0x7feeba9be837 in uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>)
    at common-utils.c:2138
#3  0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910,
    ctx=0x7fee700f0c60) at inodelk.c:396
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60)
    at inodelk.c:428
#5  0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910,
    client=<value optimized out>) at posix.c:2550
#6  0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0)
    at client_t.c:368
#7  0x7feeab77ed48 in server_connection_cleanup (this=0x2316390,
    client=0x27724a0, flags=<value optimized out>) at server-helpers.c:354
#8  0x7feeab77ae2c in server_rpc_notify (rpc=<value optimized out>,
    xl=0x2316390, event=<value optimized out>, data=0x2bf51c0) at server.c:527
#9  0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980,
    trans=0x2bf51c0) at rpcsvc.c:720
#10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata=<value optimized out>,
    event=<value optimized out>, data=0x2bf51c0) at rpcsvc.c:758
#11 0x7feeba778638 in rpc_transport_notify (this=<value optimized out>,
    event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:512
#12 0x7feeb115e971 in socket_event_poll_err (fd=<value optimized out>,
    idx=<value optimized out>, data=0x2bf51c0, poll_in=<value optimized out>,
    poll_out=0, poll_err=0) at socket.c:1071
#13 socket_event_handler (fd=<value optimized out>, idx=<value optimized out>,
    data=0x2bf51c0, poll_in=<value optimized out>, poll_out=0,
    poll_err=0) at socket.c:2240
#14 0x7feeba9fc6a7 in event_dispatch_epoll_handler
(event_pool=0x22e2d00) at event-epoll.c:384
#15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445
#16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at
glusterfsd.c:2023
(gdb) f 4
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at
inodelk.c:428
428             pl_inodelk_log_cleanup (l);
(gdb) p l->pl_inode->refkeeper
$1 = (inode_t *) 0x0
(gdb)

pl_inode->refkeeper was found to be NULL even when there were some
blocked inodelks in a certain domain of the inode,
which when dereferenced by the epoll thread in the cleanup codepath led
to a crash.

On inspecting the code (for want of a consistent reproducer), three
things were found:

1. The function where the crash happens (pl_inodelk_log_cleanup())
makes an attempt to resolve the inode to a path, as can be seen below. But
the way inode_path() itself
works is to first construct the path based on the given inode's
ancestry and place it in the buffer provided. And if all else fails, the
gfid of the inode is placed in a certain format ("<gfid:...>").
This eliminates the need for statements from lines 4 through 7
below, thereby "preventing" dereferencing of pl_inode->refkeeper.
 Now, although this change prevents the crash altogether, it still
does not fix the race that led to pl_inode->refkeeper becoming NULL, and
comes at the cost of
 printing "(null)" in the log message on line 9 every time
pl_inode->refkeeper is found to be NULL, rendering the logged messages
somewhat useless.


   0 pl_inode = lock->pl_inode;
   1
   2 inode_path (pl_inode->refkeeper, NULL, &path);
   3
   4 if (path)
   5 file = path;
   6 else
   7 file = uuid_utoa (pl_inode->refkeeper->gfid);
   8
   9 gf_log (THIS->name, GF_LOG_WARNING,
  10 "releasing lock on %s held by "
  11 "{client=%p, pid=%"PRId64" lk-owner=%s}",
  12 file, lock->client, (uint64_t) lock->client_pid,
  13 lkowner_utoa (&lock->owner));
<\code>

2. There is at least one codepath found that can lead to this crash:
 Imagine an inode on which an inodelk operation is attempted by a
client and is successfully granted too.
Now, between the time the lock was granted and pl_update_refkeeper()
was called by this thread, the client could send a DISCONNECT event,
causing cleanup codepath to be executed, where the epoll thread
crashes on dereferencing pl_inode->refkeeper which is STILL NULL at this
point.

Besides, there are still places in locks xlator where the refkeeper
is NOT updated whenever the lists are modified - for instance in the
cleanup codepath from a DISCONNECT.

3. Also, pl_update_refkeeper() seems to be not taking into account
blocked locks on the inode in the __pl_inode_is_empty() check, when it
should, as there could still be cases
 where the granted list could be empty but not the blocked list at
the time of updating the refkeeper, in which case pl_inode must still
take ref on the inode.

Proposed solution to 2/3:

1. Do away with refkeeper in pl_inode_t altogether.
2. Let every lock object (inodelk/entrylk/posix_lock) have an inode_t *
member to act as a placeholder for the associated inode object on which
it is locking.
3. Let each lock object hold a ref on the inode at the time of its creation
and unref the inode during its destruction.

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-04 Thread Krutika Dhananjay
- Original Message -

> From: "Pranith Kumar Karampuri" 
> To: "Krutika Dhananjay" , "Anand Avati"
> 
> Cc: gluster-devel@gluster.org
> Sent: Wednesday, June 4, 2014 12:23:59 PM
> Subject: Re: [Gluster-devel] Regarding doing away with refkeeper in locks
> xlator

> On 06/04/2014 12:02 PM, Pranith Kumar Karampuri wrote:

> > On 06/04/2014 11:37 AM, Krutika Dhananjay wrote:
> 

> > > Hi,
> > 
> 

> > > Recently there was a crash in locks translator (BZ 1103347, BZ 1097102)
> > > with
> > > the following backtrace:
> > 
> 
> > > (gdb) bt
> > 
> 
> > > #0 uuid_unpack (in=0x8 , uu=0x7fffea6c6a60) at
> > > ../../contrib/uuid/unpack.c:44
> > 
> 
> > > #1 0x7feeba9e19d6 in uuid_unparse_x (uu=,
> > > out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9",
> > 
> 
> > > fmt=0x7feebaa08e00 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at
> > > ../../contrib/uuid/unparse.c:55
> > 
> 
> > > #2 0x7feeba9be837 in uuid_utoa (uuid=0x8 )
> > > at
> > > common-utils.c:2138
> > 
> 
> > > #3 0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910,
> > > ctx=0x7fee700f0c60) at inodelk.c:396
> > 
> 
> > > #4 pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at
> > > inodelk.c:428
> > 
> 
> > > #5 0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910,
> > > client=) at posix.c:2550
> > 
> 
> > > #6 0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0) at
> > > client_t.c:368
> > 
> 
> > > #7 0x7feeab77ed48 in server_connection_cleanup (this=0x2316390,
> > > client=0x27724a0, flags=) at server-helpers.c:354
> > 
> 
> > > #8 0x7feeab77ae2c in server_rpc_notify (rpc=,
> > > xl=0x2316390, event=, data=0x2bf51c0) at
> > > server.c:527
> > 
> 
> > > #9 0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980,
> > > trans=0x2bf51c0) at rpcsvc.c:720
> > 
> 
> > > #10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata= > > optimized out>, event=, data=0x2bf51c0) at
> > > rpcsvc.c:758
> > 
> 
> > > #11 0x7feeba778638 in rpc_transport_notify (this= > > out>,
> > > event=, data=) at
> > > rpc-transport.c:512
> > 
> 
> > > #12 0x7feeb115e971 in socket_event_poll_err (fd= > > out>,
> > > idx=, data=0x2bf51c0, poll_in=,
> > > poll_out=0,
> > 
> 
> > > poll_err=0) at socket.c:1071
> > 
> 
> > > #13 socket_event_handler (fd=, idx= > > out>, data=0x2bf51c0, poll_in=, poll_out=0,
> > > poll_err=0)
> > > at socket.c:2240
> > 
> 
> > > #14 0x7feeba9fc6a7 in event_dispatch_epoll_handler
> > > (event_pool=0x22e2d00)
> > > at event-epoll.c:384
> > 
> 
> > > #15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445
> > 
> 
> > > #16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at
> > > glusterfsd.c:2023
> > 
> 
> > > (gdb) f 4
> > 
> 
> > > #4 pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at
> > > inodelk.c:428
> > 
> 
> > > 428 pl_inodelk_log_cleanup (l);
> > 
> 
> > > (gdb) p l->pl_inode->refkeeper
> > 
> 
> > > $1 = (inode_t *) 0x0
> > 
> 
> > > (gdb)
> > 
> 

> > > pl_inode->refkeeper was found to be NULL even when there were some
> > > blocked
> > > inodelks in a certain domain of the inode,
> > 
> 
> > > which when dereferenced by the epoll thread in the cleanup codepath led
> > > to
> > > a
> > > crash.
> > 
> 

> > > On inspecting the code (for want of a consistent reproducer), three
> > > things
> > > were found:
> > 
> 

> > > 1. The function where the crash happens (pl_inodelk_log_cleanup()), makes
> > > an
> > > attempt to resolve the inode to path as can be seen below. But the way
> > > inode_path() itself
> > 
> 
> > > works is to first construct the path based on the given inode's ancestry
> > > and
> > > place it in the buffer provided. And if all else fails, the gfid of the
> > > inode is placed in a certain format ("<gfid:...>").
> > 
> 
> > > This eliminates the need for statements from line 4 through 7 below,
> > > thereby
> 

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-03 Thread Pranith Kumar Karampuri


On 06/04/2014 12:02 PM, Pranith Kumar Karampuri wrote:


On 06/04/2014 11:37 AM, Krutika Dhananjay wrote:

Hi,

Recently there was a crash in locks translator (BZ 1103347, BZ 
1097102) with the following backtrace:

(gdb) bt
#0  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=0x7fffea6c6a60)
    at ../../contrib/uuid/unpack.c:44
#1  0x7feeba9e19d6 in uuid_unparse_x (uu=<value optimized out>,
    out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9",
    fmt=0x7feebaa08e00 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x")
    at ../../contrib/uuid/unparse.c:55
#2  0x7feeba9be837 in uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>)
    at common-utils.c:2138
#3  0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910,
    ctx=0x7fee700f0c60) at inodelk.c:396
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60)
    at inodelk.c:428
#5  0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910,
    client=<value optimized out>) at posix.c:2550
#6  0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0)
    at client_t.c:368
#7  0x7feeab77ed48 in server_connection_cleanup (this=0x2316390,
    client=0x27724a0, flags=<value optimized out>) at server-helpers.c:354
#8  0x7feeab77ae2c in server_rpc_notify (rpc=<value optimized out>,
    xl=0x2316390, event=<value optimized out>, data=0x2bf51c0) at server.c:527
#9  0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980,
    trans=0x2bf51c0) at rpcsvc.c:720
#10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata=<value optimized out>,
    event=<value optimized out>, data=0x2bf51c0) at rpcsvc.c:758
#11 0x7feeba778638 in rpc_transport_notify (this=<value optimized out>,
    event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:512
#12 0x7feeb115e971 in socket_event_poll_err (fd=<value optimized out>,
    idx=<value optimized out>, data=0x2bf51c0, poll_in=<value optimized out>,
    poll_out=0, poll_err=0) at socket.c:1071
#13 socket_event_handler (fd=<value optimized out>, idx=<value optimized out>,
    data=0x2bf51c0, poll_in=<value optimized out>, poll_out=0,
    poll_err=0) at socket.c:2240
#14 0x7feeba9fc6a7 in event_dispatch_epoll_handler 
(event_pool=0x22e2d00) at event-epoll.c:384

#15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445
#16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at 
glusterfsd.c:2023

(gdb) f 4
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at 
inodelk.c:428

428             pl_inodelk_log_cleanup (l);
(gdb) p l->pl_inode->refkeeper
$1 = (inode_t *) 0x0
(gdb)

pl_inode->refkeeper was found to be NULL even when there were some 
blocked inodelks in a certain domain of the inode,
which when dereferenced by the epoll thread in the cleanup codepath 
led to a crash.


On inspecting the code (for want of a consistent reproducer), three 
things were found:


1. The function where the crash happens (pl_inodelk_log_cleanup())
makes an attempt to resolve the inode to a path, as can be seen below.
But the way inode_path() itself
works is to first construct the path based on the given inode's
ancestry and place it in the buffer provided. And if all else fails,
the gfid of the inode is placed in a certain format ("<gfid:...>").
This eliminates the need for statements from lines 4 through 7
below, thereby "preventing" dereferencing of pl_inode->refkeeper.
Now, although this change prevents the crash altogether, it still 
does not fix the race that led to pl_inode->refkeeper becoming NULL, 
and comes at the cost of
printing "(null)" in the log message on line 9 every time 
pl_inode->refkeeper is found to be NULL, rendering the logged 
messages somewhat useless.



  0 pl_inode = lock->pl_inode;
  1
  2 inode_path (pl_inode->refkeeper, NULL, &path);
  3
  4 if (path)
  5 file = path;
  6 else
  7 file = uuid_utoa (pl_inode->refkeeper->gfid);
8
  9 gf_log (THIS->name, GF_LOG_WARNING,
 10 "releasing lock on %s held by "
 11 "{client=%p, pid=%"PRId64" lk-owner=%s}",
 12 file, lock->client, (uint64_t) lock->client_pid,
 13 lkowner_utoa (&lock->owner));
<\code>
I think this logging code is from the days when the gfid handle concept
was not there. So it wasn't returning <gfid:...> in cases where the
path is not present in the dentries. I believe the else block can be
deleted safely now.


Pranith


2. There is at least one codepath found that can lead to this crash:
Imagine an inode on which an inodelk operation is attempted by a 
client and is successfully granted too.
   Now, between the time the lock was granted and 
pl_update_refkeeper() was called by this thread, the client could 
send a DISCONNECT event,
   causing cleanup codepath to be executed, where the epoll thread 
crashes on dereferencing pl_inode->refkeeper which is STILL NULL at 
this point.


   Besides, there are still places in locks xlator where the 
refkeeper is NOT updated whenever the lists are modified - for 
instance in the cleanup codepath from a DISCONNECT.


3. Also, pl_update_refkeeper() seems to be not taking into account 
blocked locks on the inode in the __pl_inode_is_empty() check, when 
it should, as there could still be cases
where the granted list could be empty but not the blocked list at the
time of updating the refkeeper, in which case pl_inode must still take
ref on the inode.

Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-03 Thread Pranith Kumar Karampuri


On 06/04/2014 11:37 AM, Krutika Dhananjay wrote:

Hi,

Recently there was a crash in locks translator (BZ 1103347, BZ 
1097102) with the following backtrace:

(gdb) bt
#0  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=0x7fffea6c6a60)
    at ../../contrib/uuid/unpack.c:44
#1  0x7feeba9e19d6 in uuid_unparse_x (uu=<value optimized out>,
    out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9",
    fmt=0x7feebaa08e00 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x")
    at ../../contrib/uuid/unparse.c:55
#2  0x7feeba9be837 in uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>)
    at common-utils.c:2138
#3  0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910,
    ctx=0x7fee700f0c60) at inodelk.c:396
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60)
    at inodelk.c:428
#5  0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910,
    client=<value optimized out>) at posix.c:2550
#6  0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0)
    at client_t.c:368
#7  0x7feeab77ed48 in server_connection_cleanup (this=0x2316390,
    client=0x27724a0, flags=<value optimized out>) at server-helpers.c:354
#8  0x7feeab77ae2c in server_rpc_notify (rpc=<value optimized out>,
    xl=0x2316390, event=<value optimized out>, data=0x2bf51c0) at server.c:527
#9  0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980,
    trans=0x2bf51c0) at rpcsvc.c:720
#10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata=<value optimized out>,
    event=<value optimized out>, data=0x2bf51c0) at rpcsvc.c:758
#11 0x7feeba778638 in rpc_transport_notify (this=<value optimized out>,
    event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:512
#12 0x7feeb115e971 in socket_event_poll_err (fd=<value optimized out>,
    idx=<value optimized out>, data=0x2bf51c0, poll_in=<value optimized out>,
    poll_out=0, poll_err=0) at socket.c:1071
#13 socket_event_handler (fd=<value optimized out>, idx=<value optimized out>,
    data=0x2bf51c0, poll_in=<value optimized out>, poll_out=0,
    poll_err=0) at socket.c:2240
#14 0x7feeba9fc6a7 in event_dispatch_epoll_handler 
(event_pool=0x22e2d00) at event-epoll.c:384

#15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445
#16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at 
glusterfsd.c:2023

(gdb) f 4
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at 
inodelk.c:428

428             pl_inodelk_log_cleanup (l);
(gdb) p l->pl_inode->refkeeper
$1 = (inode_t *) 0x0
(gdb)

pl_inode->refkeeper was found to be NULL even when there were some 
blocked inodelks in a certain domain of the inode,
which when dereferenced by the epoll thread in the cleanup codepath 
led to a crash.


On inspecting the code (for want of a consistent reproducer), three 
things were found:


1. The function where the crash happens (pl_inodelk_log_cleanup())
makes an attempt to resolve the inode to a path, as can be seen below.
But the way inode_path() itself
works is to first construct the path based on the given inode's
ancestry and place it in the buffer provided. And if all else fails,
the gfid of the inode is placed in a certain format ("<gfid:...>").
This eliminates the need for statements from lines 4 through 7
below, thereby "preventing" dereferencing of pl_inode->refkeeper.
Now, although this change prevents the crash altogether, it still 
does not fix the race that led to pl_inode->refkeeper becoming NULL, 
and comes at the cost of
printing "(null)" in the log message on line 9 every time 
pl_inode->refkeeper is found to be NULL, rendering the logged messages 
somewhat useless.



  0 pl_inode = lock->pl_inode;
  1
  2 inode_path (pl_inode->refkeeper, NULL, &path);
  3
  4 if (path)
  5 file = path;
  6 else
  7 file = uuid_utoa (pl_inode->refkeeper->gfid);
8
  9 gf_log (THIS->name, GF_LOG_WARNING,
 10 "releasing lock on %s held by "
 11 "{client=%p, pid=%"PRId64" lk-owner=%s}",
 12 file, lock->client, (uint64_t) lock->client_pid,
 13 lkowner_utoa (&lock->owner));
<\code>
I think this logging code is from the days when the gfid handle concept was
not there. So it wasn't returning <gfid:...> in cases where the path is
not present in the dentries. I believe the else block can be deleted
safely now.


Pranith


2. There is at least one codepath found that can lead to this crash:
Imagine an inode on which an inodelk operation is attempted by a 
client and is successfully granted too.
   Now, between the time the lock was granted and 
pl_update_refkeeper() was called by this thread, the client could send 
a DISCONNECT event,
   causing cleanup codepath to be executed, where the epoll thread 
crashes on dereferencing pl_inode->refkeeper which is STILL NULL at 
this point.


   Besides, there are still places in locks xlator where the refkeeper 
is NOT updated whenever the lists are modified - for instance in the 
cleanup codepath from a DISCONNECT.


3. Also, pl_update_refkeeper() seems to be not taking into account 
blocked locks on the inode in the __pl_inode_is_empty() check, when it 
should, as there could still be cases
where the granted list could be empty but not the blocked list at 
the time of udpating the refkeeper, in which c

[Gluster-devel] Regarding doing away with refkeeper in locks xlator

2014-06-03 Thread Krutika Dhananjay
Hi, 

Recently there was a crash in locks translator (BZ 1103347, BZ 1097102) with 
the following backtrace: 
(gdb) bt 
#0  uuid_unpack (in=0x8 <Address 0x8 out of bounds>, uu=0x7fffea6c6a60)
    at ../../contrib/uuid/unpack.c:44
#1  0x7feeba9e19d6 in uuid_unparse_x (uu=<value optimized out>,
    out=0x2350fc0 "081bbc7a-7551-44ac-85c7-aad5e2633db9",
    fmt=0x7feebaa08e00 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x")
    at ../../contrib/uuid/unparse.c:55
#2  0x7feeba9be837 in uuid_utoa (uuid=0x8 <Address 0x8 out of bounds>)
    at common-utils.c:2138
#3  0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910,
    ctx=0x7fee700f0c60) at inodelk.c:396
#4  pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60)
    at inodelk.c:428
#5  0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910,
    client=<value optimized out>) at posix.c:2550
#6  0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0)
    at client_t.c:368
#7  0x7feeab77ed48 in server_connection_cleanup (this=0x2316390,
    client=0x27724a0, flags=<value optimized out>) at server-helpers.c:354
#8  0x7feeab77ae2c in server_rpc_notify (rpc=<value optimized out>,
    xl=0x2316390, event=<value optimized out>, data=0x2bf51c0) at server.c:527
#9  0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980,
    trans=0x2bf51c0) at rpcsvc.c:720
#10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata=<value optimized out>,
    event=<value optimized out>, data=0x2bf51c0) at rpcsvc.c:758
#11 0x7feeba778638 in rpc_transport_notify (this=<value optimized out>,
    event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:512
#12 0x7feeb115e971 in socket_event_poll_err (fd=<value optimized out>,
    idx=<value optimized out>, data=0x2bf51c0, poll_in=<value optimized out>,
    poll_out=0, poll_err=0) at socket.c:1071
#13 socket_event_handler (fd=<value optimized out>, idx=<value optimized out>,
    data=0x2bf51c0, poll_in=<value optimized out>, poll_out=0,
    poll_err=0) at socket.c:2240
#14 0x7feeba9fc6a7 in event_dispatch_epoll_handler (event_pool=0x22e2d00) 
at event-epoll.c:384 
#15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445 
#16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at 
glusterfsd.c:2023 
(gdb) f 4 
#4 pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at 
inodelk.c:428 
428 pl_inodelk_log_cleanup (l); 
(gdb) p l->pl_inode->refkeeper 
$1 = (inode_t *) 0x0 
(gdb) 

pl_inode->refkeeper was found to be NULL even when there were some blocked 
inodelks in a certain domain of the inode, 
which when dereferenced by the epoll thread in the cleanup codepath led to a 
crash. 

On inspecting the code (for want of a consistent reproducer), three things were 
found: 

1. The function where the crash happens (pl_inodelk_log_cleanup()) makes an
attempt to resolve the inode to a path, as can be seen below. But the way
inode_path() itself
works is to first construct the path based on the given inode's ancestry and
place it in the buffer provided. And if all else fails, the gfid of the inode
is placed in a certain format ("<gfid:...>").
This eliminates the need for statements from lines 4 through 7 below, thereby
"preventing" dereferencing of pl_inode->refkeeper.
Now, although this change prevents the crash altogether, it still does not fix 
the race that led to pl_inode->refkeeper becoming NULL, and comes at the cost 
of 
printing "(null)" in the log message on line 9 every time pl_inode->refkeeper 
is found to be NULL, rendering the logged messages somewhat useless. 

 
0 pl_inode = lock->pl_inode; 
1 
2 inode_path (pl_inode->refkeeper, NULL, &path); 
3 
4 if (path) 
5 file = path; 
6 else 
7 file = uuid_utoa (pl_inode->refkeeper->gfid); 
8 
9 gf_log (THIS->name, GF_LOG_WARNING, 
10 "releasing lock on %s held by " 
11 "{client=%p, pid=%"PRId64" lk-owner=%s}", 
12 file, lock->client, (uint64_t) lock->client_pid, 
13 lkowner_utoa (&lock->owner)); 
<\code> 

2. There is at least one codepath found that can lead to this crash: 
Imagine an inode on which an inodelk operation is attempted by a client and is 
successfully granted too. 
Now, between the time the lock was granted and pl_update_refkeeper() was called 
by this thread, the client could send a DISCONNECT event, 
causing cleanup codepath to be executed, where the epoll thread crashes on 
dereferencing pl_inode->refkeeper which is STILL NULL at this point. 

Besides, there are still places in locks xlator where the refkeeper is NOT 
updated whenever the lists are modified - for instance in the cleanup codepath 
from a DISCONNECT. 

3. Also, pl_update_refkeeper() seems to be not taking into account blocked 
locks on the inode in the __pl_inode_is_empty() check, when it should, as there 
could still be cases 
where the granted list could be empty but not the blocked list at the time of 
updating the refkeeper, in which case pl_inode must still take ref on the 
inode. 

Proposed solution to 2/3: 

1. Do away with refkeeper in pl_inode_t altogether. 
2. Let every lock object (inodelk/entrylk/posix_lock) have an inode_t * member 
to act as a placeholder for the associated inode object on which it is locking. 
3. Let each lock object hold a ref on the inode at the time of its creation and 
unref the inode during its destruction. 
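
A rough sketch of how items 2 and 3 could look, with illustrative structure
and helper names (the real inodelk/entrylk/posix_lock structures and
allocation helpers in the locks xlator differ; this only shows the
ref/unref pattern being proposed, not actual GlusterFS code):

/* Each lock object pins the inode it locks for its own lifetime,
 * so no shared refkeeper on pl_inode_t is needed. */
typedef struct {
        struct list_head  list;   /* granted/blocked list membership  */
        inode_t          *inode;  /* placeholder for the locked inode */
        /* owner, client, lk-owner, ... */
} lock_obj_t;

static lock_obj_t *
lock_obj_new (inode_t *inode)
{
        /* mem-type picked arbitrarily for the sketch */
        lock_obj_t *lock = GF_CALLOC (1, sizeof (*lock),
                                      gf_common_mt_char);

        if (!lock)
                return NULL;

        INIT_LIST_HEAD (&lock->list);
        lock->inode = inode_ref (inode);   /* ref at creation (item 3) */

        return lock;
}

static void
lock_obj_destroy (lock_obj_t *lock)
{
        list_del_init (&lock->list);
        inode_unref (lock->inode);         /* unref at destruction (item 3) */
        GF_FREE (lock);
}

With the inode pinned by every outstanding lock, the cleanup codepath can
resolve the inode (or at least its gfid) from the lock object itself, even
when no refkeeper exists.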

Please let me know what you think of the above. 

-Krutika 

