Re: WARNING in block layer triggered in 3.17-rc3

2014-09-09 Thread Tejun Heo
Hello, Alan.

On Tue, Sep 09, 2014 at 11:08:22AM -0400, Alan Stern wrote:
> Sorry, the meaning wasn't clear.  I meant that the existing code 
> doesn't work.  The patch seems to work okay (except that I can't use 
> queue_flag_test_and_set because the queue isn't locked at that point).  
> I'll submit the patch shortly.

Given it's fully synchronized, the following should work fine and
probably is less misleading than using atomic test_and_set.

if (!blk_queue_init_done) {
queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
blk_queue_bypass_end(q);
}

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING in block layer triggered in 3.17-rc3

2014-09-09 Thread Alan Stern
On Tue, 9 Sep 2014, Tejun Heo wrote:

> Hello,
> 
> On Mon, Sep 08, 2014 at 01:42:44PM -0400, Alan Stern wrote:
> > > This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> > > be unset on blk_unregister_queue() to match the teardown; it's only
> > > accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> > > lot of queue stuff down.
> > 
> > It's not clear what the operative assumptions are.  The comment in
> > blk_register_queue() implies that bypass is active only because it was
> > set up that way when the queue was created.  The fact that
> > blk_unregister_queue() doesn't call blk_queue_bypass_start() seems to
> > support this view -- although it could also be a simple oversight.
> > 
> > Hopefully Tejun can clear this iup.
> 
> Maintaining the initial bypass till queue registration is an
> optimization because shutting down a fully functional queue is a
> costly operation and there are drivers which set and destroy queues
> repeatedly while probing, so, yeah, it's really a special case for
> when the queue is being registered for the first time.
> 
> > > Your hack seems to indicate that this doesn't work on the add->del->add
> > > transtion of a gendisk.
> > 
> > Indeed, it does not work.
> 
> As such, the change you suggested makes perfect sense to me.  Why
> wouldn't it work?

Sorry, the meaning wasn't clear.  I meant that the existing code 
doesn't work.  The patch seems to work okay (except that I can't use 
queue_flag_test_and_set because the queue isn't locked at that point).  
I'll submit the patch shortly.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING in block layer triggered in 3.17-rc3

2014-09-08 Thread Tejun Heo
Hello,

On Mon, Sep 08, 2014 at 01:42:44PM -0400, Alan Stern wrote:
> > This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> > be unset on blk_unregister_queue() to match the teardown; it's only
> > accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> > lot of queue stuff down.
> 
> It's not clear what the operative assumptions are.  The comment in
> blk_register_queue() implies that bypass is active only because it was
> set up that way when the queue was created.  The fact that
> blk_unregister_queue() doesn't call blk_queue_bypass_start() seems to
> support this view -- although it could also be a simple oversight.
> 
> Hopefully Tejun can clear this iup.

Maintaining the initial bypass till queue registration is an
optimization because shutting down a fully functional queue is a
costly operation and there are drivers which set and destroy queues
repeatedly while probing, so, yeah, it's really a special case for
when the queue is being registered for the first time.

> > Your hack seems to indicate that this doesn't work on the add->del->add
> > transtion of a gendisk.
> 
> Indeed, it does not work.

As such, the change you suggested makes perfect sense to me.  Why
wouldn't it work?

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING in block layer triggered in 3.17-rc3

2014-09-08 Thread Alan Stern
On Mon, 8 Sep 2014, James Bottomley wrote:

> > Jens and James, it appears the problem is in blk_register_queue().  The 
> > code does this:
> > 
> > /*
> >  * Initialization must be complete by now.  Finish the initial
> >  * bypass from queue allocation.
> >  */
> > queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
> > blk_queue_bypass_end(q);
> > 
> > This doesn't work well if the queue is unregistered later and then
> > registered again -- which is what happens when the sd driver is unbound
> > from a device and then bound again.  It looks like the code should be:
> > 
> > if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
> > blk_queue_bypass_end(q);
> > 
> > Do you agree?  If so, I'll send in patch.
> 
> This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> be unset on blk_unregister_queue() to match the teardown; it's only
> accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> lot of queue stuff down.

It's not clear what the operative assumptions are.  The comment in
blk_register_queue() implies that bypass is active only because it was
set up that way when the queue was created.  The fact that
blk_unregister_queue() doesn't call blk_queue_bypass_start() seems to
support this view -- although it could also be a simple oversight.

Hopefully Tejun can clear this iup.

>  However, the problem looks to be the mismatch
> in assumptions.  The way SCSI binding works, the queue belongs to the
> underlying device so we always assumed we could add and remove upper
> drivers ... there's even a case for this if you don't want a disk but
> want to attach sg instead.  However, it's not the common use case.
> 
> The block model now seems to tie a lot of queue set up and teardown to
> add and remove of the gendisk which is counter to these assumptions.  As
> long as we can go from del->add without calling the ->release function
> on the queue, everything works.  Most of the operations seem
> symmetrical, so perhaps this is only the bypass doing too much.
> 
> The ideal is that disk teardown only does as much as disk setup, so the
> mid layer can still use the underlying queue on the device.
> 
> This bypass code is not very well documented.  However, your problem
> seems to be caused by this change:
> 
> commit 776687bce42bb22cce48b5da950e48ebbb9a948f
> Author: Tejun Heo 
> Date:   Tue Jul 1 10:29:17 2014 -0600
> 
> block, blk-mq: draining can't be skipped even if bypass_depth was
> non-zero
> 
> Your hack seems to indicate that this doesn't work on the add->del->add
> transtion of a gendisk.

Indeed, it does not work.

Tejun, for more details about the failure see the initial message in 
this thread:

http://marc.info/?l=linux-kernel&m=140993911413862&w=2

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING in block layer triggered in 3.17-rc3

2014-09-08 Thread Shirish Pargaonkar
see this issue/warning in 3.12 also.

On Mon, Sep 8, 2014 at 10:51 AM, James Bottomley
 wrote:
> On Mon, 2014-09-08 at 10:51 -0400, Alan Stern wrote:
>> On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:
>>
>> > I think the problem is, when a gendisk is detached, its request queue
>> > is not put in bypass mode
>> > cause when it is re-attached, code tries to put it out of bypass mode,
>> > hence the warning.
>> >
>> > So either of these should work, I have not tested it, just coded it up.
>>
>> I'm pretty sure that both of your solutions are wrong.
>>
>> Jens and James, it appears the problem is in blk_register_queue().  The
>> code does this:
>>
>>   /*
>>* Initialization must be complete by now.  Finish the initial
>>* bypass from queue allocation.
>>*/
>>   queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
>>   blk_queue_bypass_end(q);
>>
>> This doesn't work well if the queue is unregistered later and then
>> registered again -- which is what happens when the sd driver is unbound
>> from a device and then bound again.  It looks like the code should be:
>>
>>   if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
>>   blk_queue_bypass_end(q);
>>
>> Do you agree?  If so, I'll send in patch.
>
> This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> be unset on blk_unregister_queue() to match the teardown; it's only
> accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> lot of queue stuff down.  However, the problem looks to be the mismatch
> in assumptions.  The way SCSI binding works, the queue belongs to the
> underlying device so we always assumed we could add and remove upper
> drivers ... there's even a case for this if you don't want a disk but
> want to attach sg instead.  However, it's not the common use case.
>
> The block model now seems to tie a lot of queue set up and teardown to
> add and remove of the gendisk which is counter to these assumptions.  As
> long as we can go from del->add without calling the ->release function
> on the queue, everything works.  Most of the operations seem
> symmetrical, so perhaps this is only the bypass doing too much.
>
> The ideal is that disk teardown only does as much as disk setup, so the
> mid layer can still use the underlying queue on the device.
>
> This bypass code is not very well documented.  However, your problem
> seems to be caused by this change:
>
> commit 776687bce42bb22cce48b5da950e48ebbb9a948f
> Author: Tejun Heo 
> Date:   Tue Jul 1 10:29:17 2014 -0600
>
> block, blk-mq: draining can't be skipped even if bypass_depth was
> non-zero
>
> Your hack seems to indicate that this doesn't work on the add->del->add
> transtion of a gendisk.
>
> James
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING in block layer triggered in 3.17-rc3

2014-09-08 Thread James Bottomley
On Mon, 2014-09-08 at 10:51 -0400, Alan Stern wrote:
> On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:
> 
> > I think the problem is, when a gendisk is detached, its request queue
> > is not put in bypass mode
> > cause when it is re-attached, code tries to put it out of bypass mode,
> > hence the warning.
> > 
> > So either of these should work, I have not tested it, just coded it up.
> 
> I'm pretty sure that both of your solutions are wrong.
> 
> Jens and James, it appears the problem is in blk_register_queue().  The 
> code does this:
> 
>   /*
>* Initialization must be complete by now.  Finish the initial
>* bypass from queue allocation.
>*/
>   queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
>   blk_queue_bypass_end(q);
> 
> This doesn't work well if the queue is unregistered later and then
> registered again -- which is what happens when the sd driver is unbound
> from a device and then bound again.  It looks like the code should be:
> 
>   if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
>   blk_queue_bypass_end(q);
> 
> Do you agree?  If so, I'll send in patch.

This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
be unset on blk_unregister_queue() to match the teardown; it's only
accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
lot of queue stuff down.  However, the problem looks to be the mismatch
in assumptions.  The way SCSI binding works, the queue belongs to the
underlying device so we always assumed we could add and remove upper
drivers ... there's even a case for this if you don't want a disk but
want to attach sg instead.  However, it's not the common use case.

The block model now seems to tie a lot of queue set up and teardown to
add and remove of the gendisk which is counter to these assumptions.  As
long as we can go from del->add without calling the ->release function
on the queue, everything works.  Most of the operations seem
symmetrical, so perhaps this is only the bypass doing too much.

The ideal is that disk teardown only does as much as disk setup, so the
mid layer can still use the underlying queue on the device.

This bypass code is not very well documented.  However, your problem
seems to be caused by this change:

commit 776687bce42bb22cce48b5da950e48ebbb9a948f
Author: Tejun Heo 
Date:   Tue Jul 1 10:29:17 2014 -0600

block, blk-mq: draining can't be skipped even if bypass_depth was
non-zero

Your hack seems to indicate that this doesn't work on the add->del->add
transtion of a gendisk.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING in block layer triggered in 3.17-rc3

2014-09-08 Thread Shirish Pargaonkar
So should a request queue be in bypass mode when the device is being detached
and queue is being unregistereed because requests can get queued up?


On Mon, Sep 8, 2014 at 9:51 AM, Alan Stern  wrote:
> On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:
>
>> I think the problem is, when a gendisk is detached, its request queue
>> is not put in bypass mode
>> cause when it is re-attached, code tries to put it out of bypass mode,
>> hence the warning.
>>
>> So either of these should work, I have not tested it, just coded it up.
>
> I'm pretty sure that both of your solutions are wrong.
>
> Jens and James, it appears the problem is in blk_register_queue().  The
> code does this:
>
> /*
>  * Initialization must be complete by now.  Finish the initial
>  * bypass from queue allocation.
>  */
> queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
> blk_queue_bypass_end(q);
>
> This doesn't work well if the queue is unregistered later and then
> registered again -- which is what happens when the sd driver is unbound
> from a device and then bound again.  It looks like the code should be:
>
> if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
> blk_queue_bypass_end(q);
>
> Do you agree?  If so, I'll send in patch.
>
> Alan Stern
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING in block layer triggered in 3.17-rc3

2014-09-08 Thread Alan Stern
On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:

> I think the problem is, when a gendisk is detached, its request queue
> is not put in bypass mode
> cause when it is re-attached, code tries to put it out of bypass mode,
> hence the warning.
> 
> So either of these should work, I have not tested it, just coded it up.

I'm pretty sure that both of your solutions are wrong.

Jens and James, it appears the problem is in blk_register_queue().  The 
code does this:

/*
 * Initialization must be complete by now.  Finish the initial
 * bypass from queue allocation.
 */
queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
blk_queue_bypass_end(q);

This doesn't work well if the queue is unregistered later and then
registered again -- which is what happens when the sd driver is unbound
from a device and then bound again.  It looks like the code should be:

if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
blk_queue_bypass_end(q);

Do you agree?  If so, I'll send in patch.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: WARNING in block layer triggered in 3.17-rc3

2014-09-07 Thread Shirish Pargaonkar
I think the problem is, when a gendisk is detached, its request queue
is not put in bypass mode
cause when it is re-attached, code tries to put it out of bypass mode,
hence the warning.

So either of these should work, I have not tested it, just coded it up.

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 4db5abf..f94c7ba 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -600,6 +600,8 @@ void blk_unregister_queue(struct gendisk *disk)
  if (q->request_fn)
  elv_unregister_queue(q);

+ blk_queue_bypss_start(q);
+
  kobject_uevent(&q->kobj, KOBJ_REMOVE);
  kobject_del(&q->kobj);
  blk_trace_remove_sysfs(disk_to_dev(disk));





diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2c2041c..1c7ef55f 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3054,15 +3054,26 @@ static int sd_probe(struct device *dev)
 static int sd_remove(struct device *dev)
 {
  struct scsi_disk *sdkp;
+ struct request_queue *q;
+ struct scsi_device *sdp;
  dev_t devt;

  sdkp = dev_get_drvdata(dev);
+ q = sdkp->disk->queue;
  devt = disk_devt(sdkp->disk);
  scsi_autopm_get_device(sdkp->device);

  async_synchronize_full_domain(&scsi_sd_pm_domain);
  async_synchronize_full_domain(&scsi_sd_probe_domain);
  device_del(&sdkp->dev);
+
+ if (q) {
+ spin_lock_irq(q->queue_lock);
+ q->bypass_depth++;
+ queue_flag_set(QUEUE_FLAG_BYPASS, q);
+ spin_unlock_irq(q->queue_lock);
+ }
+
  del_gendisk(sdkp->disk);
  sd_shutdown(dev);



On Fri, Sep 5, 2014 at 12:45 PM, Alan Stern  wrote:
> James and Jens:
>
> I got a WARNING when unbinding the sd driver from a USB flash drive and
> then binding it back again.  Here's where the flash drive gets probed
> initially:
>
> [  143.300886] usb-storage 4-8:1.0: usb_probe_interface
> [  143.300911] usb-storage 4-8:1.0: usb_probe_interface - got id
> [  143.300930] usb-storage 4-8:1.0: USB Mass Storage device detected
> [  143.318239] scsi host0: usb-storage 4-8:1.0
> [  143.359979] scsi 0:0:0:0: Direct-Access Ut165USB2FlashStorage 0.00 
> PQ: 0 ANSI: 2
> [  143.376366] usbcore: registered new interface driver usb-storage
> [  143.468464] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 
> GB/3.76 GiB)
> [  143.481725] sd 0:0:0:0: [sda] Write Protect is off
> [  143.485712] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
> [  143.487064] sd 0:0:0:0: [sda] Asking for cache data failed
> [  143.498428] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [  143.656797]  sda: sda1
> [  143.676922] sd 0:0:0:0: [sda] Attached SCSI removable disk
>
> Then I did
>
> echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/unbind
>
> followed by
>
> echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/bind
>
> which resulted in:
>
> [  165.079557] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 
> GB/3.76 GiB)
> [  165.093510] sd 0:0:0:0: [sda] Write Protect is off
> [  165.104388] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
> [  165.105632] sd 0:0:0:0: [sda] Asking for cache data failed
> [  165.115136] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [  165.142950]  sda: sda1
> [  165.156480] [ cut here ]
> [  165.159912] WARNING: CPU: 0 PID: 29 at block/blk-core.c:473 
> blk_queue_bypass_end+0x4d/0x62()
> [  165.160030] Modules linked in: sd_mod usb_storage scsi_mod hid_generic 
> usbhid hid pcspkr evdev i915 cfbfillrect cfbimgblt i2c_algo_bit cfbcopyarea 
> video fbcon backlight bitblit softcursor font ehci_pci drm_kms_helper 
> uhci_hcd ehci_hcd ohci_pci ohci_hcd drm i2ccore usbcore e100 mii usb_common 
> fb fbdev fan processor button thermal_sys
> [  165.160030] CPU: 0 PID: 29 Comm: kworker/u4:1 Not tainted 
> 3.17.0-rc3AS-dirty #12
> [  165.160030] Hardware name: Hewlett-Packard HP dx2000 MT (EE004AA)/08FCh, 
> BIOS 1.1711/24/2005
> [  165.160030] Workqueue: events_unbound async_run_entry_fn
> [  165.160030]  c105d2ef  ea569e10 c12771c0  ea569e28 
> c102c5fb c114907d
> [  165.160030]  ecc18000 ea619c20 ecc18000 ea569e38 c102c676 0009 
>  ea569e44
> [  165.160030]  c114907d ea619c20 ea569e5c c114b97d ea569e5c ea619c20 
> ed74e400 ea619c2c
> [  165.160030] Call Trace:
> [  165.160030]  [] ? console_unlock+0x37e/0x3ab
> [  165.160030]  [] dump_stack+0x49/0x73
> [  165.160030]  [] warn_slowpath_common+0x5c/0x73
> [  165.160030]  [] ? blk_queue_bypass_end+0x4d/0x62
> [  165.160030]  [] warn_slowpath_null+0xf/0x13
> [  165.160030]  [] blk_queue_bypass_end+0x4d/0x62
> [  165.160030]  [] blk_register_queue+0x8f/0xc4
> [  165.160030]  [] add_disk+0x2bc/0x3a8
> [  165.160030]  [] sd_probe_async+0xf5/0x17b [sd_mod]
> [  165.160030]  [] async_run_entry_fn+0x59/0xf9
> [  165.160030]  [] process_one_work+0x187/0x2ac
> [  165.160030]  [] ? process_one_work+0x129/0x2ac
> [  165.160030]  [] worker_thread+0x1b1/0x26b
> [  165.160030]  [] ? process_scheduled_works+0x21/0x21
> [  165.160030]  [] kthread+0x82/0x87
> [  165.160030]  [] ? _raw_spin_unlock_irq+0x22/0x3f
> [  165.160030]  [] ? radix_tree_tag_set+0x3f/0xa5
> [  165.16

Re: WARNING in block layer triggered in 3.17-rc3

2014-09-07 Thread Ming Lei
On Sat, Sep 6, 2014 at 1:45 AM, Alan Stern  wrote:
> James and Jens:
>
> I got a WARNING when unbinding the sd driver from a USB flash drive and
> then binding it back again.  Here's where the flash drive gets probed
> initially:
>
> [  143.300886] usb-storage 4-8:1.0: usb_probe_interface
> [  143.300911] usb-storage 4-8:1.0: usb_probe_interface - got id
> [  143.300930] usb-storage 4-8:1.0: USB Mass Storage device detected
> [  143.318239] scsi host0: usb-storage 4-8:1.0
> [  143.359979] scsi 0:0:0:0: Direct-Access Ut165USB2FlashStorage 0.00 
> PQ: 0 ANSI: 2
> [  143.376366] usbcore: registered new interface driver usb-storage
> [  143.468464] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 
> GB/3.76 GiB)
> [  143.481725] sd 0:0:0:0: [sda] Write Protect is off
> [  143.485712] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
> [  143.487064] sd 0:0:0:0: [sda] Asking for cache data failed
> [  143.498428] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [  143.656797]  sda: sda1
> [  143.676922] sd 0:0:0:0: [sda] Attached SCSI removable disk
>
> Then I did
>
> echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/unbind
>
> followed by
>
> echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/bind

I can reproduce it on virtio-blk too with scsi-mq, and can't duplicate
it on non-scsi devices, so looks there are refcounts leak in scsi subsystem?


Thanks,
-- 
Ming Lei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


WARNING in block layer triggered in 3.17-rc3

2014-09-05 Thread Alan Stern
James and Jens:

I got a WARNING when unbinding the sd driver from a USB flash drive and
then binding it back again.  Here's where the flash drive gets probed 
initially:

[  143.300886] usb-storage 4-8:1.0: usb_probe_interface
[  143.300911] usb-storage 4-8:1.0: usb_probe_interface - got id
[  143.300930] usb-storage 4-8:1.0: USB Mass Storage device detected
[  143.318239] scsi host0: usb-storage 4-8:1.0
[  143.359979] scsi 0:0:0:0: Direct-Access Ut165USB2FlashStorage 0.00 
PQ: 0 ANSI: 2
[  143.376366] usbcore: registered new interface driver usb-storage
[  143.468464] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 GB/3.76 
GiB)
[  143.481725] sd 0:0:0:0: [sda] Write Protect is off
[  143.485712] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[  143.487064] sd 0:0:0:0: [sda] Asking for cache data failed
[  143.498428] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  143.656797]  sda: sda1
[  143.676922] sd 0:0:0:0: [sda] Attached SCSI removable disk

Then I did

echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/unbind

followed by

echo 0:0:0:0 >/sys/bus/scsi/drivers/sd_mod/bind

which resulted in:

[  165.079557] sd 0:0:0:0: [sda] 7892040 512-byte logical blocks: (4.04 GB/3.76 
GiB)
[  165.093510] sd 0:0:0:0: [sda] Write Protect is off
[  165.104388] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[  165.105632] sd 0:0:0:0: [sda] Asking for cache data failed
[  165.115136] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  165.142950]  sda: sda1
[  165.156480] [ cut here ]
[  165.159912] WARNING: CPU: 0 PID: 29 at block/blk-core.c:473 
blk_queue_bypass_end+0x4d/0x62()
[  165.160030] Modules linked in: sd_mod usb_storage scsi_mod hid_generic 
usbhid hid pcspkr evdev i915 cfbfillrect cfbimgblt i2c_algo_bit cfbcopyarea 
video fbcon backlight bitblit softcursor font ehci_pci drm_kms_helper uhci_hcd 
ehci_hcd ohci_pci ohci_hcd drm i2ccore usbcore e100 mii usb_common fb fbdev fan 
processor button thermal_sys
[  165.160030] CPU: 0 PID: 29 Comm: kworker/u4:1 Not tainted 3.17.0-rc3AS-dirty 
#12
[  165.160030] Hardware name: Hewlett-Packard HP dx2000 MT (EE004AA)/08FCh, 
BIOS 1.1711/24/2005
[  165.160030] Workqueue: events_unbound async_run_entry_fn
[  165.160030]  c105d2ef  ea569e10 c12771c0  ea569e28 c102c5fb 
c114907d
[  165.160030]  ecc18000 ea619c20 ecc18000 ea569e38 c102c676 0009  
ea569e44
[  165.160030]  c114907d ea619c20 ea569e5c c114b97d ea569e5c ea619c20 ed74e400 
ea619c2c
[  165.160030] Call Trace:
[  165.160030]  [] ? console_unlock+0x37e/0x3ab
[  165.160030]  [] dump_stack+0x49/0x73
[  165.160030]  [] warn_slowpath_common+0x5c/0x73
[  165.160030]  [] ? blk_queue_bypass_end+0x4d/0x62
[  165.160030]  [] warn_slowpath_null+0xf/0x13
[  165.160030]  [] blk_queue_bypass_end+0x4d/0x62
[  165.160030]  [] blk_register_queue+0x8f/0xc4
[  165.160030]  [] add_disk+0x2bc/0x3a8
[  165.160030]  [] sd_probe_async+0xf5/0x17b [sd_mod]
[  165.160030]  [] async_run_entry_fn+0x59/0xf9
[  165.160030]  [] process_one_work+0x187/0x2ac
[  165.160030]  [] ? process_one_work+0x129/0x2ac
[  165.160030]  [] worker_thread+0x1b1/0x26b
[  165.160030]  [] ? process_scheduled_works+0x21/0x21
[  165.160030]  [] kthread+0x82/0x87
[  165.160030]  [] ? _raw_spin_unlock_irq+0x22/0x3f
[  165.160030]  [] ? radix_tree_tag_set+0x3f/0xa5
[  165.160030]  [] ret_from_kernel_thread+0x21/0x30
[  165.160030]  [] ? __kthread_parkme+0x50/0x50
[  165.160030] ---[ end trace 31df765b6ea80892 ]---
[  165.308986] sd 0:0:0:0: [sda] Attached SCSI removable disk

I don't know what's going on here, but it looks like something doesn't 
get cleaned up properly during the unbind operation.

This was in vanilla 3.17-rc3.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/