Re: [PATCH 1/1] block: CFQ refcounting fix

2005-08-31 Thread Jens Axboe
On Wed, Aug 31 2005, Brian King wrote:
> diff -puN drivers/block/cfq-iosched.c~cfq_refcnt_fix drivers/block/cfq-iosched.c
> --- linux-2.6/drivers/block/cfq-iosched.c~cfq_refcnt_fix  2005-08-30 17:26:55.0 -0500
> +++ linux-2.6-bjking1/drivers/block/cfq-iosched.c 2005-08-31 08:48:30.0 -0500
> @@ -2260,8 +2260,6 @@ static void cfq_put_cfqd(struct cfq_data
>   if (!atomic_dec_and_test(&cfqd->ref))
>           return;
>  
> - blk_put_queue(q);
> -
>   cfq_shutdown_timer_wq(cfqd);
>   q->elevator->elevator_data = NULL;
>  
> @@ -2318,7 +2316,6 @@ static int cfq_init_queue(request_queue_
>   e->elevator_data = cfqd;
>  
>   cfqd->queue = q;
> - atomic_inc(&q->refcnt);
>  
>   cfqd->max_queued = q->nr_requests / 4;
>   q->nr_batching = cfq_queued;
> _

That looks better. I'll add this to my outgoing queue, thanks!

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] block: CFQ refcounting fix

2005-08-31 Thread Brian King
Jens Axboe wrote:
> On Wed, Aug 31 2005, Brian King wrote:
> 
>>Jens Axboe wrote:
>>
>>>On Tue, Aug 30 2005, [EMAIL PROTECTED] wrote:
>>>
>>>
>>>>I ran across a memory leak related to the cfq scheduler. The cfq
>>>>init function increments the refcnt of the associated request_queue.
>>>>This refcount gets decremented in cfq's exit function. Since blk_cleanup_queue
>>>>only calls the elevator exit function when its refcnt goes to zero, the
>>>>request_q never gets cleaned up. It didn't look like other io schedulers were
>>>>incrementing this refcnt, so I removed the refcnt increment and it fixed the
>>>>memory leak for me.
>>>>
>>>>To reproduce the problem, simply use cfq and use the scsi_host scan sysfs
>>>>attribute to scan "- - -" repeatedly on a scsi host and watch the memory
>>>>vanish.
>>>
>>>
>>>Yeah, that actually looks like a dangling reference. I assume you tested
>>>this properly?
>>
>>Yes. I applied the patch, booted my system (which was crashing on
>>bootup before due to out of memory errors due to the leak) ran the
>>scan a few times and verified /proc/meminfo didn't continually
>>decrease like without it, and rebooted again.  If there is anything
>>else you would like me to do, I would be happy to do so.
> 
> 
> I think you need to remove the blk_put_queue() in cfq_put_cfqd() as
> well, otherwise I don't see how this can work without looking at freed
> memory. I'll audit the other paths as well.

Good catch. Here is an updated patch. 


-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center

I ran across a memory leak related to the cfq scheduler. The cfq
init function increments the refcnt of the associated request_queue.
This refcount gets decremented in cfq's exit function. Since blk_cleanup_queue
only calls the elevator exit function when its refcnt goes to zero, the
request_q never gets cleaned up. It didn't look like other io schedulers were
incrementing this refcnt, so I removed the refcnt increment and it fixed the
memory leak for me.

To reproduce the problem, simply use cfq and use the scsi_host scan sysfs
attribute to scan "- - -" repeatedly on a scsi host and watch the memory
vanish.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
---

 linux-2.6-bjking1/drivers/block/cfq-iosched.c |3 ---
 1 files changed, 3 deletions(-)

diff -puN drivers/block/cfq-iosched.c~cfq_refcnt_fix drivers/block/cfq-iosched.c
--- linux-2.6/drivers/block/cfq-iosched.c~cfq_refcnt_fix  2005-08-30 17:26:55.0 -0500
+++ linux-2.6-bjking1/drivers/block/cfq-iosched.c   2005-08-31 08:48:30.0 -0500
@@ -2260,8 +2260,6 @@ static void cfq_put_cfqd(struct cfq_data
    if (!atomic_dec_and_test(&cfqd->ref))
        return;
 
-   blk_put_queue(q);
-
    cfq_shutdown_timer_wq(cfqd);
    q->elevator->elevator_data = NULL;
 
@@ -2318,7 +2316,6 @@ static int cfq_init_queue(request_queue_
    e->elevator_data = cfqd;
 
    cfqd->queue = q;
-   atomic_inc(&q->refcnt);
 
    cfqd->max_queued = q->nr_requests / 4;
    q->nr_batching = cfq_queued;
_
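
For readers following the refcount argument, here is a minimal user-space sketch of the three states discussed in this thread: the original code (cfq takes and drops a queue reference), the first patch (increment removed, blk_put_queue() still called), and the updated patch (both removed). It is only a toy model under the assumption stated in the patch description, namely that teardown (and with it the elevator exit hook) runs only once the queue refcnt reaches zero. The names toy_teardown and toy_report are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Toy model of the queue reference count during teardown (not kernel code).
 * sched_gets: the scheduler init path takes a reference (the atomic_inc).
 * sched_puts: the scheduler exit path drops one (the blk_put_queue call).
 * Returns the final count: 0 is balanced, >0 means the queue is never
 * released, <0 means a put was issued without a matching get.
 */
int toy_teardown(int sched_gets, int sched_puts)
{
    int refcnt = 1;                 /* the queue owner's reference */

    assert(sched_gets == 0 || sched_gets == 1);
    assert(sched_puts == 0 || sched_puts == 1);

    if (sched_gets)
        refcnt++;

    /* Owner asks for cleanup: teardown proceeds only if the count hits zero. */
    if (--refcnt != 0)
        return refcnt;              /* exit hook never runs, nothing is freed */

    /* The scheduler exit hook runs as part of teardown. */
    if (sched_puts)
        refcnt--;

    return refcnt;
}

/* Print the final count for each of the three variants. */
void toy_report(void)
{
    printf("original code: %d\n", toy_teardown(1, 1));
    printf("first patch:   %d\n", toy_teardown(0, 1));
    printf("updated patch: %d\n", toy_teardown(0, 0));
}
```

In this model the original code ends with one reference still held (the leak), the first patch ends below zero (a put with no matching get, which is the imbalance Jens pointed out), and only the updated patch ends balanced.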


Re: [PATCH 1/1] block: CFQ refcounting fix

2005-08-31 Thread Jens Axboe
On Wed, Aug 31 2005, Brian King wrote:
> Jens Axboe wrote:
> > On Tue, Aug 30 2005, [EMAIL PROTECTED] wrote:
> > 
> >>I ran across a memory leak related to the cfq scheduler. The cfq
> >>init function increments the refcnt of the associated request_queue.
> >>This refcount gets decremented in cfq's exit function. Since blk_cleanup_queue
> >>only calls the elevator exit function when its refcnt goes to zero, the
> >>request_q never gets cleaned up. It didn't look like other io schedulers were
> >>incrementing this refcnt, so I removed the refcnt increment and it fixed the
> >>memory leak for me.
> >>
> >>To reproduce the problem, simply use cfq and use the scsi_host scan sysfs
> >>attribute to scan "- - -" repeatedly on a scsi host and watch the memory
> >>vanish.
> > 
> > 
> > Yeah, that actually looks like a dangling reference. I assume you tested
> > this properly?
> 
> Yes. I applied the patch, booted my system (which was crashing on
> bootup before due to out of memory errors due to the leak) ran the
> scan a few times and verified /proc/meminfo didn't continually
> decrease like without it, and rebooted again.  If there is anything
> else you would like me to do, I would be happy to do so.

I think you need to remove the blk_put_queue() in cfq_put_cfqd() as
well, otherwise I don't see how this can work without looking at freed
memory. I'll audit the other paths as well.

-- 
Jens Axboe



Re: [PATCH 1/1] block: CFQ refcounting fix

2005-08-31 Thread Brian King
Jens Axboe wrote:
> On Tue, Aug 30 2005, [EMAIL PROTECTED] wrote:
> 
>>I ran across a memory leak related to the cfq scheduler. The cfq
>>init function increments the refcnt of the associated request_queue.
>>This refcount gets decremented in cfq's exit function. Since blk_cleanup_queue
>>only calls the elevator exit function when its refcnt goes to zero, the
>>request_q never gets cleaned up. It didn't look like other io schedulers were
>>incrementing this refcnt, so I removed the refcnt increment and it fixed the
>>memory leak for me.
>>
>>To reproduce the problem, simply use cfq and use the scsi_host scan sysfs
>>attribute to scan "- - -" repeatedly on a scsi host and watch the memory
>>vanish.
> 
> 
> Yeah, that actually looks like a dangling reference. I assume you tested
> this properly?

Yes. I applied the patch, booted my system (which was crashing on bootup before
due to out of memory errors due to the leak) ran the scan a few times and verified
/proc/meminfo didn't continually decrease like without it, and rebooted again.
If there is anything else you would like me to do, I would be happy to do so.

Thanks

Brian


-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center


Re: [PATCH 1/1] block: CFQ refcounting fix

2005-08-31 Thread Jens Axboe
On Tue, Aug 30 2005, [EMAIL PROTECTED] wrote:
> 
> I ran across a memory leak related to the cfq scheduler. The cfq
> init function increments the refcnt of the associated request_queue.
> This refcount gets decremented in cfq's exit function. Since blk_cleanup_queue
> only calls the elevator exit function when its refcnt goes to zero, the
> request_q never gets cleaned up. It didn't look like other io schedulers were
> incrementing this refcnt, so I removed the refcnt increment and it fixed the
> memory leak for me.
> 
> To reproduce the problem, simply use cfq and use the scsi_host scan sysfs
> attribute to scan "- - -" repeatedly on a scsi host and watch the memory
> vanish.

Yeah, that actually looks like a dangling reference. I assume you tested
this properly?

-- 
Jens Axboe



[PATCH 1/1] block: CFQ refcounting fix

2005-08-30 Thread brking

I ran across a memory leak related to the cfq scheduler. The cfq
init function increments the refcnt of the associated request_queue.
This refcount gets decremented in cfq's exit function. Since blk_cleanup_queue
only calls the elevator exit function when its refcnt goes to zero, the
request_q never gets cleaned up. It didn't look like other io schedulers were
incrementing this refcnt, so I removed the refcnt increment and it fixed the
memory leak for me.

To reproduce the problem, simply use cfq and use the scsi_host scan sysfs
attribute to scan "- - -" repeatedly on a scsi host and watch the memory
vanish.

Signed-off-by: Brian King <[EMAIL PROTECTED]>
---

 linux-2.6-bjking1/drivers/block/cfq-iosched.c |1 -
 1 files changed, 1 deletion(-)

diff -puN drivers/block/cfq-iosched.c~cfq_refcnt_fix drivers/block/cfq-iosched.c
--- linux-2.6/drivers/block/cfq-iosched.c~cfq_refcnt_fix  2005-08-30 17:26:55.0 -0500
+++ linux-2.6-bjking1/drivers/block/cfq-iosched.c   2005-08-30 17:26:55.0 -0500
@@ -2318,7 +2318,6 @@ static int cfq_init_queue(request_queue_
    e->elevator_data = cfqd;
 
    cfqd->queue = q;
-   atomic_inc(&q->refcnt);
 
    cfqd->max_queued = q->nr_requests / 4;
    q->nr_batching = cfq_queued;
_
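
The reproduction steps in this message (write "- - -" to a scsi_host scan attribute repeatedly and watch memory) could be scripted roughly as below. This is a sketch, not something posted in the thread: the helper names are hypothetical, the choice of the MemFree field of /proc/meminfo as the value to watch is an assumption, and the scan path (for example /sys/class/scsi_host/host0/scan, where the host number is an assumption) has to be supplied by the caller. Writing to the attribute normally requires root.

```c
#include <assert.h>
#include <stdio.h>

/* Write "- - -" to a scsi_host scan attribute; returns 0 on success. */
int trigger_scan(const char *scan_path)
{
    FILE *f = fopen(scan_path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs("- - -\n", f) >= 0;
    if (fclose(f) != 0)             /* buffered write errors surface here */
        ok = 0;
    return ok ? 0 : -1;
}

/* Parse one meminfo line; returns 1 and stores the kB value if it is MemFree. */
int parse_memfree_kb(const char *line, long *kb)
{
    return sscanf(line, "MemFree: %ld kB", kb) == 1;
}

/* Read MemFree from a meminfo-style file; returns -1 if it is not found. */
long read_memfree_kb(const char *meminfo_path)
{
    char line[256];
    long kb = -1;
    FILE *f = fopen(meminfo_path, "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f))
        if (parse_memfree_kb(line, &kb))
            break;
    fclose(f);
    return kb;
}

/* Trigger `rounds` scans, printing MemFree after each one. */
int watch_scans(const char *scan_path, int rounds)
{
    for (int i = 0; i < rounds; i++) {
        if (trigger_scan(scan_path) != 0)
            return -1;
        printf("scan %d: MemFree %ld kB\n", i + 1,
               read_memfree_kb("/proc/meminfo"));
    }
    return 0;
}
```

On a kernel with the leak, the printed value would be expected to keep dropping across rounds; with the fix applied it should level off, allowing for normal fluctuation from other activity on the system.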

