Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-08 Thread Bart Van Assche
On 08/08/12 03:42, Chanho Min wrote: > Thank you for the explanation. It look correct. Let's check one more thing. > What If __scsi_remove_device doesn't release device? : reference count > is more than 2. > So We lost starved_list but device is exist. Is there any issue about this? As far as I

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-08 Thread Bart Van Assche
On 08/08/12 03:42, Chanho Min wrote: Thank you for the explanation. It look correct. Let's check one more thing. What If __scsi_remove_device doesn't release device? : reference count is more than 2. So We lost starved_list but device is exist. Is there any issue about this? As far as I can

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Chanho Min
> I'm afraid that without the second half of that patch the following race > is still possible: > - sdev reference count drops to zero while scsi_run_queue() is in > progress and while that sdev is on the starved_list of its SCSI host; > scsi_device_dev_release_usercontext() call is scheduled

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Mike Christie
On 08/06/2012 12:56 PM, Bart Van Assche wrote: > On 08/04/12 22:36, Mike Christie wrote: >> On 08/04/2012 03:18 PM, Bart Van Assche wrote: >>> On 08/04/12 16:46, Mike Christie wrote: I think we have to have scsi-ml do a get_device when a sdev is added to the starved entry and then do a

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Bart Van Assche
On 08/07/12 08:53, Chanho Min wrote: > In addition, Is it ironic that we are careful to use put_device at > scsi_request_fn?. If we trigger the ->remove(), > It occur a oops. What about the removal of unlock/lock as patch bellow? > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Bart Van Assche
On 08/07/12 08:53, Chanho Min wrote: > On Tue, Aug 7, 2012 at 2:56 AM, Bart Van Assche wrote: >> Indeed. How about the patch below ? Scsi devices are removed from >> starved_list after blk_cleanup_queue() and before put_device(). That >> guarantees that inside scsi_run_queue() get_device() under

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Chanho Min
On Tue, Aug 7, 2012 at 2:56 AM, Bart Van Assche wrote: > Indeed. How about the patch below ? Scsi devices are removed from > starved_list after blk_cleanup_queue() and before put_device(). That > guarantees that inside scsi_run_queue() get_device() under host lock > will succeed. Thanks, IMHO,

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Chanho Min
On Tue, Aug 7, 2012 at 2:56 AM, Bart Van Assche bvanass...@acm.org wrote: Indeed. How about the patch below ? Scsi devices are removed from starved_list after blk_cleanup_queue() and before put_device(). That guarantees that inside scsi_run_queue() get_device() under host lock will succeed.

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Bart Van Assche
On 08/07/12 08:53, Chanho Min wrote: On Tue, Aug 7, 2012 at 2:56 AM, Bart Van Assche bvanass...@acm.org wrote: Indeed. How about the patch below ? Scsi devices are removed from starved_list after blk_cleanup_queue() and before put_device(). That guarantees that inside scsi_run_queue()

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Bart Van Assche
On 08/07/12 08:53, Chanho Min wrote: In addition, Is it ironic that we are careful to use put_device at scsi_request_fn?. If we trigger the -remove(), It occur a oops. What about the removal of unlock/lock as patch bellow? diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Mike Christie
On 08/06/2012 12:56 PM, Bart Van Assche wrote: On 08/04/12 22:36, Mike Christie wrote: On 08/04/2012 03:18 PM, Bart Van Assche wrote: On 08/04/12 16:46, Mike Christie wrote: I think we have to have scsi-ml do a get_device when a sdev is added to the starved entry and then do a put_device when

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-07 Thread Chanho Min
I'm afraid that without the second half of that patch the following race is still possible: - sdev reference count drops to zero while scsi_run_queue() is in progress and while that sdev is on the starved_list of its SCSI host; scsi_device_dev_release_usercontext() call is scheduled but

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-06 Thread Bart Van Assche
On 08/04/12 22:36, Mike Christie wrote: > On 08/04/2012 03:18 PM, Bart Van Assche wrote: >> On 08/04/12 16:46, Mike Christie wrote: >>> I think we have to have scsi-ml do a get_device when a sdev is added to >>> the starved entry and then do a put_device when it is removed (must do >>> these under

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-06 Thread Bart Van Assche
On 08/04/12 22:36, Mike Christie wrote: On 08/04/2012 03:18 PM, Bart Van Assche wrote: On 08/04/12 16:46, Mike Christie wrote: I think we have to have scsi-ml do a get_device when a sdev is added to the starved entry and then do a put_device when it is removed (must do these under the host

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-04 Thread Mike Christie
On 08/04/2012 03:18 PM, Bart Van Assche wrote: > On 08/04/12 16:46, Mike Christie wrote: >> I think we have to have scsi-ml do a get_device when a sdev is added to >> the starved entry and then do a put_device when it is removed (must do >> these under the host lock for the starved entry case

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-04 Thread Bart Van Assche
On 08/04/12 16:46, Mike Christie wrote: > I think we have to have scsi-ml do a get_device when a sdev is added to > the starved entry and then do a put_device when it is removed (must do > these under the host lock for the starved entry case too). I am not sure > if that is just a

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-04 Thread Mike Christie
On 08/04/2012 04:01 AM, Bart Van Assche wrote: > On 08/02/12 08:41, Chanho Min wrote: >> This patch is to fix a oops from a torn down device. When >> scsi_run_queue process starved queues, scsi_request_fn can race with >> scsi_remove_device. In this case, rarely, scsi_request_fn release the >>

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-04 Thread Bart Van Assche
On 08/02/12 08:41, Chanho Min wrote: > This patch is to fix a oops from a torn down device. When > scsi_run_queue process starved queues, scsi_request_fn can race with > scsi_remove_device. In this case, rarely, scsi_request_fn release the > last reference and set sdev->request_queue to NULL. It

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-04 Thread Bart Van Assche
On 08/02/12 08:41, Chanho Min wrote: This patch is to fix a oops from a torn down device. When scsi_run_queue process starved queues, scsi_request_fn can race with scsi_remove_device. In this case, rarely, scsi_request_fn release the last reference and set sdev-request_queue to NULL. It result

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-04 Thread Mike Christie
On 08/04/2012 04:01 AM, Bart Van Assche wrote: On 08/02/12 08:41, Chanho Min wrote: This patch is to fix a oops from a torn down device. When scsi_run_queue process starved queues, scsi_request_fn can race with scsi_remove_device. In this case, rarely, scsi_request_fn release the last

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-04 Thread Bart Van Assche
On 08/04/12 16:46, Mike Christie wrote: I think we have to have scsi-ml do a get_device when a sdev is added to the starved entry and then do a put_device when it is removed (must do these under the host lock for the starved entry case too). I am not sure if that is just a hack/papering-over

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-04 Thread Mike Christie
On 08/04/2012 03:18 PM, Bart Van Assche wrote: On 08/04/12 16:46, Mike Christie wrote: I think we have to have scsi-ml do a get_device when a sdev is added to the starved entry and then do a put_device when it is removed (must do these under the host lock for the starved entry case too). I am

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread Mike Christie
On 08/02/2012 04:34 AM, James Bottomley wrote: > On Thu, 2012-08-02 at 18:28 +0900, Chanho Min wrote: >> On Thu, Aug 2, 2012 at 5:57 PM, James Bottomley >> wrote: >>> On Thu, 2012-08-02 at 17:41 +0900, Chanho Min wrote: This patch is to fix a oops from a torn down device. When

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread Chanho Min
> Does it occur with that patch applied? I'm trying to reproduce it with that patch. but, It is unlikely to be fixed. because scsi_run_queue is invoked from scsi_requeue_run_queue, not scsi_requeue_command. > If it does, the likely fix would be to take a copy of the queue ... but > I'd like to

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread James Bottomley
On Thu, 2012-08-02 at 18:28 +0900, Chanho Min wrote: > On Thu, Aug 2, 2012 at 5:57 PM, James Bottomley > wrote: > > On Thu, 2012-08-02 at 17:41 +0900, Chanho Min wrote: > >> This patch is to fix a oops from a torn down device. When > >> scsi_run_queue process starved queues, scsi_request_fn can

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread Chanho Min
On Thu, Aug 2, 2012 at 5:57 PM, James Bottomley wrote: > On Thu, 2012-08-02 at 17:41 +0900, Chanho Min wrote: >> This patch is to fix a oops from a torn down device. When >> scsi_run_queue process starved queues, scsi_request_fn can race with >> scsi_remove_device. In this case, rarely,

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread James Bottomley
On Thu, 2012-08-02 at 17:41 +0900, Chanho Min wrote: > This patch is to fix a oops from a torn down device. When > scsi_run_queue process starved queues, scsi_request_fn can race with > scsi_remove_device. In this case, rarely, scsi_request_fn release the > last reference and set

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread James Bottomley
On Thu, 2012-08-02 at 17:41 +0900, Chanho Min wrote: This patch is to fix a oops from a torn down device. When scsi_run_queue process starved queues, scsi_request_fn can race with scsi_remove_device. In this case, rarely, scsi_request_fn release the last reference and set sdev-request_queue to

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread Chanho Min
On Thu, Aug 2, 2012 at 5:57 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Thu, 2012-08-02 at 17:41 +0900, Chanho Min wrote: This patch is to fix a oops from a torn down device. When scsi_run_queue process starved queues, scsi_request_fn can race with scsi_remove_device.

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread James Bottomley
On Thu, 2012-08-02 at 18:28 +0900, Chanho Min wrote: On Thu, Aug 2, 2012 at 5:57 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Thu, 2012-08-02 at 17:41 +0900, Chanho Min wrote: This patch is to fix a oops from a torn down device. When scsi_run_queue process starved

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread Chanho Min
Does it occur with that patch applied? I'm trying to reproduce it with that patch. but, It is unlikely to be fixed. because scsi_run_queue is invoked from scsi_requeue_run_queue, not scsi_requeue_command. If it does, the likely fix would be to take a copy of the queue ... but I'd like to

Re: [PATCH] fix NULL-pointer dereference on scsi_run_queue

2012-08-02 Thread Mike Christie
On 08/02/2012 04:34 AM, James Bottomley wrote: On Thu, 2012-08-02 at 18:28 +0900, Chanho Min wrote: On Thu, Aug 2, 2012 at 5:57 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Thu, 2012-08-02 at 17:41 +0900, Chanho Min wrote: This patch is to fix a oops from a torn down