Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-21 Thread Jun'ichi Nomura
On 10/19/12 23:53, Vivek Goyal wrote:
> On Thu, Oct 18, 2012 at 02:20:53PM -0700, Tejun Heo wrote:
>> Hey, Vivek.
>>
>> On Thu, Oct 18, 2012 at 09:31:49AM -0400, Vivek Goyal wrote:
>>> Tejun, for the sake of readability, are you fine with keeping the original
>>> check and original patch which I had acked.
>>
>> Can you please send another patch to change that?  It really isn't a
>> related change and I don't wanna mix the two.
> 
> Sure. Jun'ichi, would you like to send that cleanup line in a separate patch? 

OK. I will send that patch.

-- 
Jun'ichi Nomura, NEC Corporation

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-19 Thread Vivek Goyal
On Thu, Oct 18, 2012 at 02:20:53PM -0700, Tejun Heo wrote:
> Hey, Vivek.
> 
> On Thu, Oct 18, 2012 at 09:31:49AM -0400, Vivek Goyal wrote:
> > Tejun, for the sake of readability, are you fine with keeping the original
> > check and original patch which I had acked.
> 
> Can you please send another patch to change that?  It really isn't a
> related change and I don't wanna mix the two.

Sure. Jun'ichi, would you like to send that cleanup line in a separate patch? 

Thanks
Vivek


Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-18 Thread Tejun Heo
Hey, Vivek.

On Thu, Oct 18, 2012 at 09:31:49AM -0400, Vivek Goyal wrote:
> Tejun, for the sake of readability, are you fine with keeping the original
> check and original patch which I had acked.

Can you please send another patch to change that?  It really isn't a
related change and I don't wanna mix the two.

Thanks.

-- 
tejun


Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-18 Thread Vivek Goyal
On Thu, Oct 18, 2012 at 11:56:34AM +0900, Jun'ichi Nomura wrote:

[..]
> >>> if (ent == &q->root_blkg->q_node)
> >>
> >> So ent is not &q->root_blkg->q_node.
> > 
> > If q->root_blkg is NULL, will it not lead to NULL pointer dereference.
> > (q->root_blkg->q_node).
> 
> It's not dereferenced.

Ok. We are taking address of root_blkg->q_node so even if root_blkg=NULL,
address is just offset from null. Little subtle for me. :-)
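
The "address, not dereference" point can be illustrated with a small userspace sketch. The struct and function names below are hypothetical stand-ins, not the kernel's definitions. The sketch uses offsetof() so that it stays within ISO C; the kernel expression &q->root_blkg->q_node relies on the compiler treating it as the same base-plus-offset arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the structures discussed in the thread. */
struct list_node { struct list_node *next, *prev; };

struct group {
    int              refcnt;
    struct list_node q_node;   /* embedded list link, like blkg->q_node */
};

/*
 * Address that "&g->q_node" denotes, computed as base + offset.
 * Nothing behind g is read, so a NULL g simply yields the offset.
 */
static uintptr_t q_node_addr(const struct group *g)
{
    return (uintptr_t)g + offsetof(struct group, q_node);
}

/* Mirrors the comparison in the thread: is ent the q_node of g? */
static int is_q_node_of(const struct list_node *ent, const struct group *g)
{
    assert(ent != NULL);
    return (uintptr_t)ent == q_node_addr(g);
}
```

With g == NULL the computed address is a small offset that no live list entry can equal, so the comparison is just false.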

> 
> >>>     ent = ent->next;
> >>> if (ent == &q->blkg_list)
> >>>     return NULL;
> >>
> >> And we return NULL here.
> >>
> >> Ah, yes. You are correct.
> >> We can do without the above hunk.
> > 
> > I would rather prefer to check for this boundary condition early and
> > return instead of letting it fall through all these conditions and
> > then figure out yes we have no next rl. IMO, code becomes easier to
> > understand if nothing else. Otherwise one needs a step by step 
> > explanation as above to show that case of q->root_blkg is covered.
> 
> I have same opinion as yours that it's good for readability.


Tejun, for the sake of readability, are you fine with keeping the original
check and original patch which I had acked.

Thanks
Vivek


Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-17 Thread Jun'ichi Nomura
On 10/17/12 22:47, Vivek Goyal wrote:
> On Wed, Oct 17, 2012 at 09:02:22AM +0900, Jun'ichi Nomura wrote:
>> On 10/17/12 08:20, Tejun Heo wrote:
>>>>> -if (ent == &q->root_blkg->q_node)
>>>>> +if (q->root_blkg && ent == &q->root_blkg->q_node)
>>>>
>>>> Can we fix it little differently. Little earlier in the code, we check for
>>>> if q->blkg_list is empty, then all the groups are gone, and there are
>>>> no more request lists hence and return NULL.
>>>>
>>>> Current code:
>>>>     if (rl == &q->root_rl) {
>>>>         ent = &q->blkg_list;
>>>>
>>>> Modified code:
>>>>     if (rl == &q->root_rl) {
>>>>         ent = &q->blkg_list;
>>>>         /* There are no more block groups, hence no request lists */
>>>>         if (list_empty(ent))
>>>>             return NULL;
>>>>     }
>>>
>>> Do we need this at all?  q->root_blkg being NULL is completely fine
>>> there and the comparison would work as expected, no?
>>
>> Hmm?
>>
>> If list_empty(ent) and q->root_blkg == NULL,
>>
>>> /* walk to the next list_head, skip root blkcg */
>>> ent = ent->next;
>>
>> ent is &q->blkg_list again.
>>
>>> if (ent == &q->root_blkg->q_node)
>>
>> So ent is not &q->root_blkg->q_node.
> 
> If q->root_blkg is NULL, will it not lead to NULL pointer dereference.
> (q->root_blkg->q_node).

It's not dereferenced.

>>>     ent = ent->next;
>>> if (ent == &q->blkg_list)
>>>     return NULL;
>>
>> And we return NULL here.
>>
>> Ah, yes. You are correct.
>> We can do without the above hunk.
> 
> I would rather prefer to check for this boundary condition early and
> return instead of letting it fall through all these conditions and
> then figure out yes we have no next rl. IMO, code becomes easier to
> understand if nothing else. Otherwise one needs a step by step 
> explanation as above to show that case of q->root_blkg is covered.

I have same opinion as yours that it's good for readability.

-- 
Jun'ichi Nomura, NEC Corporation



Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-17 Thread Vivek Goyal
On Wed, Oct 17, 2012 at 09:02:22AM +0900, Jun'ichi Nomura wrote:
> On 10/17/12 08:20, Tejun Heo wrote:
> >>>> -if (ent == &q->root_blkg->q_node)
> >>>> +if (q->root_blkg && ent == &q->root_blkg->q_node)
> >>>
> >>> Can we fix it little differently. Little earlier in the code, we check for
> >>> if q->blkg_list is empty, then all the groups are gone, and there are
> >>> no more request lists hence and return NULL.
> >>>
> >>> Current code:
> >>>     if (rl == &q->root_rl) {
> >>>         ent = &q->blkg_list;
> >>>
> >>> Modified code:
> >>>     if (rl == &q->root_rl) {
> >>>         ent = &q->blkg_list;
> >>>         /* There are no more block groups, hence no request lists */
> >>>         if (list_empty(ent))
> >>>             return NULL;
> >>>     }
> > 
> > Do we need this at all?  q->root_blkg being NULL is completely fine
> > there and the comparison would work as expected, no?
> 
> Hmm?
> 
> If list_empty(ent) and q->root_blkg == NULL,
> 
> > /* walk to the next list_head, skip root blkcg */
> > ent = ent->next;
> 
> ent is &q->blkg_list again.
>
> > if (ent == &q->root_blkg->q_node)
>
> So ent is not &q->root_blkg->q_node.

If q->root_blkg is NULL, will it not lead to NULL pointer dereference.
(q->root_blkg->q_node).
 
> 
> >     ent = ent->next;
> > if (ent == &q->blkg_list)
> >     return NULL;
> 
> And we return NULL here.
> 
> Ah, yes. You are correct.
> We can do without the above hunk.

I would rather prefer to check for this boundary condition early and
return instead of letting it fall through all these conditions and
then figure out yes we have no next rl. IMO, code becomes easier to
understand if nothing else. Otherwise one needs a step by step 
explanation as above to show that case of q->root_blkg is covered.
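
The two variants being compared can be modelled in userspace as below. This is a simplified sketch covering only the rl == &q->root_rl branch, with a hand-rolled list and hypothetical names (struct queue, first_non_root, the early_exit flag); it is not the kernel function. It shows that the fall-through path and the explicit list_empty() check give the same result, which is why the disagreement is about readability only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_head { struct list_head *next, *prev; };

struct group { int id; struct list_head q_node; };

struct queue {
    struct list_head blkg_list;  /* empty when it points at itself */
    struct group    *root;       /* may be NULL after teardown */
};

static void queue_init(struct queue *q)
{
    q->blkg_list.next = q->blkg_list.prev = &q->blkg_list;
    q->root = NULL;
}

/* Append g at the tail of q's group list. */
static void queue_add(struct queue *q, struct group *g)
{
    assert(g != NULL);
    g->q_node.prev = q->blkg_list.prev;
    g->q_node.next = &q->blkg_list;
    q->blkg_list.prev->next = &g->q_node;
    q->blkg_list.prev = &g->q_node;
}

/* Address of root's q_node, computed without reading through root. */
static uintptr_t root_node_addr(const struct queue *q)
{
    return (uintptr_t)q->root + offsetof(struct group, q_node);
}

/* First non-root group after the list head, or NULL if there is none. */
static struct group *first_non_root(struct queue *q, int early_exit)
{
    struct list_head *ent = &q->blkg_list;

    if (early_exit && ent->next == ent)
        return NULL;                     /* explicit boundary check */

    ent = ent->next;                     /* walk to the next entry */
    if ((uintptr_t)ent == root_node_addr(q))
        ent = ent->next;                 /* skip the root group */
    if (ent == &q->blkg_list)
        return NULL;                     /* wrapped around to the head */
    return (struct group *)((char *)ent - offsetof(struct group, q_node));
}
```

On an empty list with root == NULL, the fall-through path steps to the head itself, fails the root comparison, and returns NULL at the wrap-around check.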

Thanks
Vivek


Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-16 Thread Jun'ichi Nomura
On 10/17/12 08:20, Tejun Heo wrote:
>>>> -  if (ent == &q->root_blkg->q_node)
>>>> +  if (q->root_blkg && ent == &q->root_blkg->q_node)
>>>
>>> Can we fix it little differently. Little earlier in the code, we check for
>>> if q->blkg_list is empty, then all the groups are gone, and there are
>>> no more request lists hence and return NULL.
>>>
>>> Current code:
>>>     if (rl == &q->root_rl) {
>>>         ent = &q->blkg_list;
>>>
>>> Modified code:
>>>     if (rl == &q->root_rl) {
>>>         ent = &q->blkg_list;
>>>         /* There are no more block groups, hence no request lists */
>>>         if (list_empty(ent))
>>>             return NULL;
>>>     }
> 
> Do we need this at all?  q->root_blkg being NULL is completely fine
> there and the comparison would work as expected, no?

Hmm?

If list_empty(ent) and q->root_blkg == NULL,

> /* walk to the next list_head, skip root blkcg */
> ent = ent->next;

ent is &q->blkg_list again.

> if (ent == &q->root_blkg->q_node)

So ent is not &q->root_blkg->q_node.

>     ent = ent->next;
> if (ent == &q->blkg_list)
>     return NULL;

And we return NULL here.

Ah, yes. You are correct.
We can do without the above hunk.

-- 
Jun'ichi Nomura, NEC Corporation



Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-16 Thread Tejun Heo
Hello,

On Thu, Oct 11, 2012 at 10:31:46AM +0900, Jun'ichi Nomura wrote:
> >> -  if (ent == &q->root_blkg->q_node)
> >> +  if (q->root_blkg && ent == &q->root_blkg->q_node)
> > 
> > Can we fix it little differently. Little earlier in the code, we check for
> > if q->blkg_list is empty, then all the groups are gone, and there are
> > no more request lists hence and return NULL.
> > 
> > Current code:
> >     if (rl == &q->root_rl) {
> >         ent = &q->blkg_list;
> >
> > Modified code:
> >     if (rl == &q->root_rl) {
> >         ent = &q->blkg_list;
> >         /* There are no more block groups, hence no request lists */
> >         if (list_empty(ent))
> >             return NULL;
> >     }

Do we need this at all?  q->root_blkg being NULL is completely fine
there and the comparison would work as expected, no?

Thanks.

-- 
tejun


Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-11 Thread Vivek Goyal
On Thu, Oct 11, 2012 at 10:31:46AM +0900, Jun'ichi Nomura wrote:

[..]
> Below is the updated version of the patch.
> 
> ==
> blk_put_rl() does not call blkg_put() for q->root_rl because we
> don't take request list reference on q->root_blkg.
> However, if root_blkg is once attached then detached (freed),
> blk_put_rl() is confused by the bogus pointer in q->root_blkg.
> 
> For example, with !CONFIG_BLK_DEV_THROTTLING && CONFIG_CFQ_GROUP_IOSCHED,
> switching IO scheduler from cfq to deadline will cause system stall
> after the following warning with 3.6:
> 
> > WARNING: at /work/build/linux/block/blk-cgroup.h:250 blk_put_rl+0x4d/0x95()
> > Modules linked in: bridge stp llc sunrpc acpi_cpufreq freq_table mperf 
> > ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> > Pid: 0, comm: swapper/0 Not tainted 3.6.0 #1
> > Call Trace:
> >  <IRQ>  [<810453bd>] warn_slowpath_common+0x85/0x9d
> >  [<810453ef>] warn_slowpath_null+0x1a/0x1c
> >  [<811d5f8d>] blk_put_rl+0x4d/0x95
> >  [<811d614a>] __blk_put_request+0xc3/0xcb
> >  [<811d71a3>] blk_finish_request+0x232/0x23f
> >  [<811d76c3>] ? blk_end_bidi_request+0x34/0x5d
> >  [<811d76d1>] blk_end_bidi_request+0x42/0x5d
> >  [<811d7728>] blk_end_request+0x10/0x12
> >  [<812cdf16>] scsi_io_completion+0x207/0x4d5
> >  [<812c6fcf>] scsi_finish_command+0xfa/0x103
> >  [<812ce2f8>] scsi_softirq_done+0xff/0x108
> >  [<811dcea5>] blk_done_softirq+0x8d/0xa1
> >  [<810915d5>] ? generic_smp_call_function_single_interrupt+0x9f/0xd7
> >  [<8104cf5b>] __do_softirq+0x102/0x213
> >  [<8108a5ec>] ? lock_release_holdtime+0xb6/0xbb
> >  [<8104d2b4>] ? raise_softirq_irqoff+0x9/0x3d
> >  [<81424dfc>] call_softirq+0x1c/0x30
> >  [<81011beb>] do_softirq+0x4b/0xa3
> >  [<8104cdb0>] irq_exit+0x53/0xd5
> >  [<8102d865>] smp_call_function_single_interrupt+0x34/0x36
> >  [<8142486f>] call_function_single_interrupt+0x6f/0x80
> >  <EOI>  [<8101800b>] ? mwait_idle+0x94/0xcd
> >  [<81018002>] ? mwait_idle+0x8b/0xcd
> >  [<81017811>] cpu_idle+0xbb/0x114
> >  [<81401fbd>] rest_init+0xc1/0xc8
> >  [<81401efc>] ? csum_partial_copy_generic+0x16c/0x16c
> >  [<81cdbd3d>] start_kernel+0x3d4/0x3e1
> >  [<81cdb79e>] ? kernel_init+0x1f7/0x1f7
> >  [<81cdb2dd>] x86_64_start_reservations+0xb8/0xbd
> >  [<81cdb3e3>] x86_64_start_kernel+0x101/0x110
> 
> This patch clears q->root_blkg and q->root_rl.blkg when root blkg
> is destroyed.
> __blk_queue_next_rl(), which uses q->root_blkg without check,
> is changed to exit early when all blkg's are destroyed.
> 

Thanks. This patch looks good to me.

Acked-by: Vivek Goyal <vgo...@redhat.com>

Vivek

> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index f3b44a6..a31e678 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -285,6 +285,13 @@ static void blkg_destroy_all(struct request_queue *q)
>   blkg_destroy(blkg);
>   spin_unlock(&blkcg->lock);
>   }
> +
> + /*
> +  * root blkg is destroyed.  Just clear the pointer since
> +  * root_rl does not take reference on root blkg.
> +  */
> + q->root_blkg = NULL;
> + q->root_rl.blkg = NULL;
>  }
>  
>  static void blkg_rcu_free(struct rcu_head *rcu_head)
> @@ -326,6 +333,9 @@ struct request_list *__blk_queue_next_rl(struct 
> request_list *rl,
>*/
>   if (rl == &q->root_rl) {
>   ent = &q->blkg_list;
> + /* There are no more block groups, hence no request lists */
> + if (list_empty(ent))
> + return NULL;
>   } else {
>   blkg = container_of(rl, struct blkcg_gq, rl);
>   ent = &blkg->q_node;


Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-10 Thread Jun'ichi Nomura
Hi Vivek, thank you for comments.

On 10/11/12 00:59, Vivek Goyal wrote:
> I think patch looks reasonable to me. Just that some more description
> would be nice. In fact, I will prefer some code comments too as I
> had to scratch my head for a while to figure out how did we reach here.
> 
> So looks like we deactivated cfq policy (most likely changed IO
> scheduler). That will destroy all the block groups (disconnect blkg
> from list and drop policy reference on group). If there are any pending
> IOs, then group will not be destroyed till IO is completed. (Because
> of cfqq reference on blkg and because of request list reference on
> blkg).
> 
> Now, all request lists take a reference on associated blkg except
> q->root_rl. This means when last IO finished, it must have dropped
> the reference on cfqq which will drop reference on associated cfqg/blkg
> and immediately root blkg will be destroyed. And now we will call
> blk_put_rl() and that will try to access root_rl->blkg which has
> been just freed as last IO completed.

Yes, and for completion of any new IOs, blk_put_rl() is misled.

I'll try to extend the description according to your comments.
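
The failure mode described above, and the effect of clearing the pointers, can be sketched with a toy userspace model. All names here (struct group, rl_put, destroy_all, the is_root flag) are hypothetical and only loosely mirror the kernel structures. The assumption being modelled is that the root request list keeps a pointer to the root group without holding a reference, so the pointer must be cleared when the group is freed.

```c
#include <assert.h>
#include <stdlib.h>

struct group { int refcnt; int is_root; };

struct req_list { struct group *grp; };   /* like request_list.blkg */

struct queue {
    struct group   *root_grp;   /* like q->root_blkg */
    struct req_list root_rl;    /* points at root_grp, holds no reference */
};

static void group_put(struct group *g)
{
    assert(g->refcnt > 0);
    if (--g->refcnt == 0)
        free(g);
}

/*
 * Drop the reference a request list holds on its group. The root list
 * holds none, but deciding that reads rl->grp->is_root, which would be
 * a use-after-free if rl->grp were left dangling.
 * Returns 1 if a reference was dropped, 0 otherwise.
 */
static int rl_put(struct req_list *rl)
{
    if (rl->grp && !rl->grp->is_root) {
        group_put(rl->grp);
        return 1;
    }
    return 0;
}

/* Teardown: drop the queue's own reference, then clear both pointers. */
static void destroy_all(struct queue *q)
{
    if (q->root_grp)
        group_put(q->root_grp);
    q->root_grp = NULL;
    q->root_rl.grp = NULL;
}
```

After destroy_all(), rl_put() on the root list sees a NULL pointer and returns without touching freed memory.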

> 
> So problem here is that we don't take request list reference on
> root blkg and that creates all these corner cases.
> 
> So clearing q->root_blkg and q->root_rl.blkg during policy activation
> makes sense. That means that from queue and request list point of view
> root blkg is gone and you can't get to it. (It might still be around for
> some more time due to pending IOs though).
> 
> Some minor comments below.
> 
>>
>> Signed-off-by: Jun'ichi Nomura 
>>
>> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
>> index f3b44a6..5015764 100644
>> --- a/block/blk-cgroup.c
>> +++ b/block/blk-cgroup.c
>> @@ -285,6 +285,9 @@ static void blkg_destroy_all(struct request_queue *q)
>>  blkg_destroy(blkg);
>>  spin_unlock(&blkcg->lock);
>>  }
>> +
>> +q->root_blkg = NULL;
>> +q->root_rl.blkg = NULL;
> 
> I think some of the above description about we not taking root_rl
> reference on root group can go here so that next time I don't have
> to scratch my head for a long time.

I put the following comment:
  /*
   * root blkg is destroyed.  Just clear the pointer since
   * root_rl does not take reference on root blkg.
   */

> 
>>  }
>>  
>>  static void blkg_rcu_free(struct rcu_head *rcu_head)
>> @@ -333,7 +336,7 @@ struct request_list *__blk_queue_next_rl(struct 
>> request_list *rl,
>>  
>>  /* walk to the next list_head, skip root blkcg */
>>  ent = ent->next;
>> -if (ent == &q->root_blkg->q_node)
>> +if (q->root_blkg && ent == &q->root_blkg->q_node)
> 
> Can we fix it little differently. Little earlier in the code, we check for
> if q->blkg_list is empty, then all the groups are gone, and there are
> no more request lists hence and return NULL.
> 
> Current code:
>     if (rl == &q->root_rl) {
>         ent = &q->blkg_list;
>
> Modified code:
>     if (rl == &q->root_rl) {
>         ent = &q->blkg_list;
>         /* There are no more block groups, hence no request lists */
>         if (list_empty(ent))
>             return NULL;
>     }

OK. I changed that.

Below is the updated version of the patch.

==
blk_put_rl() does not call blkg_put() for q->root_rl because we
don't take request list reference on q->root_blkg.
However, if root_blkg is once attached then detached (freed),
blk_put_rl() is confused by the bogus pointer in q->root_blkg.

For example, with !CONFIG_BLK_DEV_THROTTLING && CONFIG_CFQ_GROUP_IOSCHED,
switching IO scheduler from cfq to deadline will cause system stall
after the following warning with 3.6:

> WARNING: at /work/build/linux/block/blk-cgroup.h:250 blk_put_rl+0x4d/0x95()
> Modules linked in: bridge stp llc sunrpc acpi_cpufreq freq_table mperf 
> ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> Pid: 0, comm: swapper/0 Not tainted 3.6.0 #1
> Call Trace:
>[] warn_slowpath_common+0x85/0x9d
>  [] warn_slowpath_null+0x1a/0x1c
>  [] blk_put_rl+0x4d/0x95
>  [] __blk_put_request+0xc3/0xcb
>  [] blk_finish_request+0x232/0x23f
>  [] ? blk_end_bidi_request+0x34/0x5d
>  [] blk_end_bidi_request+0x42/0x5d
>  [] blk_end_request+0x10/0x12
>  [] scsi_io_completion+0x207/0x4d5
>  [] scsi_finish_command+0xfa/0x103
>  [] scsi_softirq_done+0xff/0x108
>  [] blk_done_softirq+0x8d/0xa1
>  [] ? generic_smp_call_function_single_interrupt+0x9f/0xd7
>  [] __do_softirq+0x102/0x213
>  [] ? lock_release_holdtime+0xb6/0xbb
>  [] ? raise_softirq_irqoff+0x9/0x3d
>  [] call_softirq+0x1c/0x30
>  [] do_softirq+0x4b/0xa3
>  [] irq_exit+0x53/0xd5
>  [] smp_call_function_single_interrupt+0x34/0x36
>  [] call_function_single_interrupt+0x6f/0x80
>  <EOI>  [] ? mwait_idle+0x94/0xcd
>  [] ? mwait_idle+0x8b/0xcd
>  [] cpu_idle+0xbb/0x114
>  [] rest_init+0xc1/0xc8
>  [] ? csum_partial_copy_generic+0x16c/0x16c
>  [] 

Re: [PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-10 Thread Vivek Goyal
On Wed, Oct 10, 2012 at 02:11:03PM +0900, Jun'ichi Nomura wrote:
> I got system stall after the following warning with 3.6:
> 
> > WARNING: at /work/build/linux/block/blk-cgroup.h:250 blk_put_rl+0x4d/0x95()
> > Modules linked in: bridge stp llc sunrpc acpi_cpufreq freq_table mperf 
> > ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> > Pid: 0, comm: swapper/0 Not tainted 3.6.0 #1
> > Call Trace:
> >  <IRQ>  [] warn_slowpath_common+0x85/0x9d
> >  [] warn_slowpath_null+0x1a/0x1c
> >  [] blk_put_rl+0x4d/0x95
> >  [] __blk_put_request+0xc3/0xcb
> >  [] blk_finish_request+0x232/0x23f
> >  [] ? blk_end_bidi_request+0x34/0x5d
> >  [] blk_end_bidi_request+0x42/0x5d
> >  [] blk_end_request+0x10/0x12
> >  [] scsi_io_completion+0x207/0x4d5
> >  [] scsi_finish_command+0xfa/0x103
> >  [] scsi_softirq_done+0xff/0x108
> >  [] blk_done_softirq+0x8d/0xa1
> >  [] ? generic_smp_call_function_single_interrupt+0x9f/0xd7
> >  [] __do_softirq+0x102/0x213
> >  [] ? lock_release_holdtime+0xb6/0xbb
> >  [] ? raise_softirq_irqoff+0x9/0x3d
> >  [] call_softirq+0x1c/0x30
> >  [] do_softirq+0x4b/0xa3
> >  [] irq_exit+0x53/0xd5
> >  [] smp_call_function_single_interrupt+0x34/0x36
> >  [] call_function_single_interrupt+0x6f/0x80
> >  <EOI>  [] ? mwait_idle+0x94/0xcd
> >  [] ? mwait_idle+0x8b/0xcd
> >  [] cpu_idle+0xbb/0x114
> >  [] rest_init+0xc1/0xc8
> >  [] ? csum_partial_copy_generic+0x16c/0x16c
> >  [] start_kernel+0x3d4/0x3e1
> >  [] ? kernel_init+0x1f7/0x1f7
> >  [] x86_64_start_reservations+0xb8/0xbd
> >  [] x86_64_start_kernel+0x101/0x110
> 
> blk_put_rl() does this:
>  if (rl->blkg && rl->blkg->blkcg != &blkcg_root)
>  blkg_put(rl->blkg);
> but if rl is q->root_rl, rl->blkg might be a bogus pointer
> because blkcg_deactivate_policy() does not clear q->root_rl.blkg
> after blkg_destroy_all().
> 
> Attached patch works for me.

The patch looks reasonable to me, but some more description would be
nice. In fact, I would prefer some code comments too, as I had to
scratch my head for a while to figure out how we got here.

So it looks like we deactivated the cfq policy (most likely by changing
the IO scheduler). That destroys all the block groups (disconnects each
blkg from the list and drops the policy reference on the group). If there
are any pending IOs, the group will not be freed until the IO completes
(because of the cfqq reference on the blkg and the request list reference
on the blkg).

Now, all request lists take a reference on the associated blkg except
q->root_rl. This means that when the last IO finished, it must have
dropped the reference on the cfqq, which drops the reference on the
associated cfqg/blkg, and the root blkg is destroyed immediately. We then
call blk_put_rl(), which tries to access root_rl->blkg, which has just
been freed as the last IO completed.

So the problem here is that we don't take a request list reference on
the root blkg, and that creates all these corner cases.

So clearing q->root_blkg and q->root_rl.blkg during policy deactivation
makes sense. It means that from the queue and request list point of view
the root blkg is gone and you can't get to it. (It might still be around
for some more time due to pending IOs, though.)
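
The sequence above can be sketched in userspace C. This is only a model under stated assumptions, not the kernel code: the struct layouts, the `is_root` flag (standing in for the `blkcg != &blkcg_root` test), `nr_freed`, `destroy_all()` and `put_rl()` are hypothetical names. It shows why clearing both pointers at destroy time keeps a later `put_rl()` on the root request list from touching freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures (not the real layout). */
struct blkg { int refcnt; int is_root; };
struct request_list { struct blkg *blkg; };
struct queue { struct blkg *root_blkg; struct request_list root_rl; };

static int nr_freed; /* how many model groups have been freed */

/* Drop one reference; free the group when the last one goes away. */
static void blkg_put(struct blkg *blkg)
{
	assert(blkg->refcnt > 0);
	if (--blkg->refcnt == 0) {
		free(blkg);
		nr_freed++;
	}
}

/*
 * Model of blkg_destroy_all() with the patch applied: the queue and the
 * root request list hold no reference of their own, so their pointers
 * are cleared before the policy reference is dropped.
 */
static void destroy_all(struct queue *q)
{
	struct blkg *root = q->root_blkg;

	q->root_blkg = NULL;
	q->root_rl.blkg = NULL;
	if (root)
		blkg_put(root);
}

/* Model of blk_put_rl(): returns 1 if a reference was dropped. */
static int put_rl(struct request_list *rl)
{
	if (rl->blkg && !rl->blkg->is_root) {
		blkg_put(rl->blkg);
		return 1;
	}
	return 0;
}
```

With a root group that starts with two references (policy plus one pending IO), `destroy_all()` leaves one reference, the IO completion frees the group, and a following `put_rl(&q.root_rl)` sees NULL and returns 0. Without the two assignments in `destroy_all()`, that last call would dereference the freed group.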

Some minor comments below.

> 
> Signed-off-by: Jun'ichi Nomura <j-nom...@ce.jp.nec.com>
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index f3b44a6..5015764 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -285,6 +285,9 @@ static void blkg_destroy_all(struct request_queue *q)
>   blkg_destroy(blkg);
>   spin_unlock(&blkcg->lock);
>   }
> +
> + q->root_blkg = NULL;
> + q->root_rl.blkg = NULL;

I think some of the description above, about us not taking a root_rl
reference on the root group, can go here as a comment so that next time
I don't have to scratch my head for so long.

>  }
>  
>  static void blkg_rcu_free(struct rcu_head *rcu_head)
> @@ -333,7 +336,7 @@ struct request_list *__blk_queue_next_rl(struct request_list *rl,
>  
>   /* walk to the next list_head, skip root blkcg */
>   ent = ent->next;
> - if (ent == &q->root_blkg->q_node)
> + if (q->root_blkg && ent == &q->root_blkg->q_node)

Can we fix it a little differently? A little earlier in the code, we could
check whether q->blkg_list is empty. If it is, all the groups are gone and
there are no more request lists, so we return NULL.

Current code:
	if (rl == &q->root_rl) {
		ent = &q->blkg_list;

Modified code:
	if (rl == &q->root_rl) {
		ent = &q->blkg_list;
		/* There are no more block groups, hence no request lists */
		if (list_empty(ent))
			return NULL;
	}
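
As a rough illustration of the suggestion above, here is a self-contained userspace sketch of the walk with the early `list_empty()` check. It is a simplified model, not `__blk_queue_next_rl()` itself: the list helpers are re-implemented locally, `next_group()` is a hypothetical name, and it iterates over groups rather than request lists (passing `cur == NULL` stands for "starting from the root request list").

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical stand-ins for the kernel structures. */
struct blkg { struct list_head q_node; int id; };
struct queue { struct list_head blkg_list; struct blkg *root_blkg; };

/* Return the next non-root group after cur, or NULL when the walk is done. */
static struct blkg *next_group(struct queue *q, struct blkg *cur)
{
	struct list_head *ent;

	if (cur == NULL) {
		ent = &q->blkg_list;
		/* There are no more block groups, hence nothing to walk. */
		if (list_empty(ent))
			return NULL;
	} else {
		ent = &cur->q_node;
	}

	/* Walk to the next entry; the list is non-empty, so root_blkg is set. */
	ent = ent->next;
	if (ent == &q->root_blkg->q_node)
		ent = ent->next;
	if (ent == &q->blkg_list)
		return NULL;
	return (struct blkg *)((char *)ent - offsetof(struct blkg, q_node));
}
```

The point of the early return is that `q->root_blkg` is only looked at when the list still has entries, so a cleared root pointer is never used in the comparison.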

Thanks
Vivek


[PATCH] Fix use-after-free of q->root_blkg and q->root_rl.blkg

2012-10-09 Thread Jun'ichi Nomura
I got a system stall after the following warning with 3.6:

> WARNING: at /work/build/linux/block/blk-cgroup.h:250 blk_put_rl+0x4d/0x95()
> Modules linked in: bridge stp llc sunrpc acpi_cpufreq freq_table mperf 
> ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> Pid: 0, comm: swapper/0 Not tainted 3.6.0 #1
> Call Trace:
>  <IRQ>  [] warn_slowpath_common+0x85/0x9d
>  [] warn_slowpath_null+0x1a/0x1c
>  [] blk_put_rl+0x4d/0x95
>  [] __blk_put_request+0xc3/0xcb
>  [] blk_finish_request+0x232/0x23f
>  [] ? blk_end_bidi_request+0x34/0x5d
>  [] blk_end_bidi_request+0x42/0x5d
>  [] blk_end_request+0x10/0x12
>  [] scsi_io_completion+0x207/0x4d5
>  [] scsi_finish_command+0xfa/0x103
>  [] scsi_softirq_done+0xff/0x108
>  [] blk_done_softirq+0x8d/0xa1
>  [] ? generic_smp_call_function_single_interrupt+0x9f/0xd7
>  [] __do_softirq+0x102/0x213
>  [] ? lock_release_holdtime+0xb6/0xbb
>  [] ? raise_softirq_irqoff+0x9/0x3d
>  [] call_softirq+0x1c/0x30
>  [] do_softirq+0x4b/0xa3
>  [] irq_exit+0x53/0xd5
>  [] smp_call_function_single_interrupt+0x34/0x36
>  [] call_function_single_interrupt+0x6f/0x80
>  <EOI>  [] ? mwait_idle+0x94/0xcd
>  [] ? mwait_idle+0x8b/0xcd
>  [] cpu_idle+0xbb/0x114
>  [] rest_init+0xc1/0xc8
>  [] ? csum_partial_copy_generic+0x16c/0x16c
>  [] start_kernel+0x3d4/0x3e1
>  [] ? kernel_init+0x1f7/0x1f7
>  [] x86_64_start_reservations+0xb8/0xbd
>  [] x86_64_start_kernel+0x101/0x110

blk_put_rl() does this:
 if (rl->blkg && rl->blkg->blkcg != &blkcg_root)
 blkg_put(rl->blkg);
but if rl is q->root_rl, rl->blkg might be a bogus pointer
because blkcg_deactivate_policy() does not clear q->root_rl.blkg
after blkg_destroy_all().
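
For illustration only, the asymmetry behind that check can be modelled in a few lines of userspace C. Everything here is an assumption made for the sketch: `blkcg_root_model` stands in for `blkcg_root`, `rl_get()` and `rl_put()` are hypothetical names, and a plain integer replaces the real reference count. Non-root request lists take and drop a reference on their group, while the root request list does neither, so nothing keeps the group behind `root_rl.blkg` alive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct blkcg { int id; };
static struct blkcg blkcg_root_model; /* models blkcg_root */

struct blkg { struct blkcg *blkcg; int refcnt; };
struct request_list { struct blkg *blkg; };

/* Take a request-list reference, except on the root group. */
static void rl_get(struct request_list *rl)
{
	if (rl->blkg && rl->blkg->blkcg != &blkcg_root_model)
		rl->blkg->refcnt++;
}

/* Same shape as the check quoted above: the root group is skipped. */
static void rl_put(struct request_list *rl)
{
	if (rl->blkg && rl->blkg->blkcg != &blkcg_root_model) {
		assert(rl->blkg->refcnt > 0);
		rl->blkg->refcnt--;
	}
}
```

Note that `rl_put()` has to read `rl->blkg->blkcg` to decide whether the group is the root one. That read is safe only while the pointer is either valid or NULL, which is why a stale `root_rl.blkg` is a problem.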

Attached patch works for me.

Signed-off-by: Jun'ichi Nomura <j-nom...@ce.jp.nec.com>

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index f3b44a6..5015764 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -285,6 +285,9 @@ static void blkg_destroy_all(struct request_queue *q)
 		blkg_destroy(blkg);
 		spin_unlock(&blkcg->lock);
 	}
+
+	q->root_blkg = NULL;
+	q->root_rl.blkg = NULL;
 }
 
 static void blkg_rcu_free(struct rcu_head *rcu_head)
@@ -333,7 +336,7 @@ struct request_list *__blk_queue_next_rl(struct request_list *rl,
 
 	/* walk to the next list_head, skip root blkcg */
 	ent = ent->next;
-	if (ent == &q->root_blkg->q_node)
+	if (q->root_blkg && ent == &q->root_blkg->q_node)
 		ent = ent->next;
 	if (ent == &q->blkg_list)
 		return NULL;

