Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-16 Thread Tejun Heo
Hello, Alan. On Tue, Jan 15, 2013 at 11:01:15PM -0500, Alan Stern wrote: The current domain implementation is somewhere in between. It's not a completely simplistic system and at the same time not developed enough to do properly stacked flushing. I like your idea of chronological

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-16 Thread Alan Stern
On Wed, 16 Jan 2013, Tejun Heo wrote: Hello, Alan. On Tue, Jan 15, 2013 at 11:01:15PM -0500, Alan Stern wrote: The current domain implementation is somewhere in between. It's not a completely simplistic system and at the same time not developed enough to do properly stacked flushing.

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-16 Thread Tejun Heo
Hello, Alan. On Wed, Jan 16, 2013 at 12:01:53PM -0500, Alan Stern wrote: The problem here is that "flush everything which comes before me" is used to order async jobs. e.g. after async jobs probe the hardware they order themselves by flushing before registering them, so unless I don't

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-16 Thread Alan Stern
On Wed, 16 Jan 2013, Tejun Heo wrote: Hello, Alan. On Wed, Jan 16, 2013 at 12:01:53PM -0500, Alan Stern wrote: The problem here is that "flush everything which comes before me" is used to order async jobs. e.g. after async jobs probe the hardware they order themselves by flushing

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
[ Added Tejun to the discussion, since he's the async go-to-guy ] On Mon, Jan 14, 2013 at 10:23 PM, Ming Lei ming@canonical.com wrote: But I have another idea to address the problem, and let module code call async_synchronize_full() only if the module requires that explicitly, so how

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 9:36 AM, Linus Torvalds torva...@linux-foundation.org wrote: This kind of "let's randomly encourage people to write subtly buggy code that has magical timing dependencies, so that the developer won't likely even see it because he has fast disks etc" code is totally

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Alan Stern
On Tue, 15 Jan 2013, Linus Torvalds wrote: Tejun, comments? You can see the whole thread on lkml, but the basic problem is that the module loading doing the unconditional async_synchronize_full() has caused problems, because we have - load module A - module A does per-controller async

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Tejun Heo
Hello, Linus. On Tue, Jan 15, 2013 at 09:36:57AM -0800, Linus Torvalds wrote: Tejun, comments? You can see the whole thread on lkml, but the basic problem is that the module loading doing the unconditional async_synchronize_full() has caused problems, because we have - load module A -

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Tejun Heo
Hello, Alan. On Tue, Jan 15, 2013 at 01:20:58PM -0500, Alan Stern wrote: It may not be so easy. When the SCSI async thread probes the new disk, it has to do I/O. So it needs to use a scheduler. But maybe it could use a built-in trivial scheduler until the proper one is loaded. Then the

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 10:32 AM, Tejun Heo t...@kernel.org wrote: I think the root problem here, apart from request_module() from block - which is a bit nasty but making that part completely async would too be quite nasty albeit in a different way - is that async_synchronize_full() is way

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Tejun Heo
Hello, Linus. Will continue on another reply but this one is relevant so... On Tue, Jan 15, 2013 at 10:18:45AM -0800, Linus Torvalds wrote: Tejun, is there a good way for code to see "I'm running in async context"? Then we could do something like Almost. With a bit of modification we can ask

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Tejun Heo
cc'ing Arjan. Arjan, the original thread can be read from http://thread.gmane.org/gmane.linux.kernel/1420814 Hello, again. On Tue, Jan 15, 2013 at 12:18:01PM -0800, Linus Torvalds wrote: I think that is a good solution if it works, but look out: we need to synchronize across *all* domains,

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Arjan van de Ven
For now, I'm gonna implement simple "I'm not gonna wait for myself" self-deadlock avoidance. If this needs any more sophistication, I think we better reimplement it so that we can explicitly match up and track who's gonna wait for what instead of throwing everything into a single cookie space

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Tejun Heo
Hello, Arjan. On Tue, Jan 15, 2013 at 04:25:54PM -0800, Arjan van de Ven wrote: async fundamentally had the concept of a monotonic increasing number, and that you could always wait for "everyone before me". then people (like me) wanted exceptions to what "everyone" means ;-( I'm ok with going

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 3:50 PM, Tejun Heo t...@kernel.org wrote: For now, I'm gonna implement simple "I'm not gonna wait for myself" self-deadlock avoidance. You can't really do that. Or rather, it won't *help*. The thing is, the module loading in particular is not necessarily happening in the

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 4:36 PM, Linus Torvalds torva...@linux-foundation.org wrote: There's a reason I asked for a warning for this. Or the "let's flag the current thread if it ever started anything asynchronous". Because it's complicated. Btw, the sequence counter (that is *not* taking

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Tejun Heo
On Tue, Jan 15, 2013 at 04:36:34PM -0800, Linus Torvalds wrote: The thing is, the module loading in particular is not necessarily happening in the same context as what *started* the module loading. A module loader will request the module from user space, and then later user space - through

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Ming Lei
On Wed, Jan 16, 2013 at 1:36 AM, Linus Torvalds torva...@linux-foundation.org wrote: Because it's not just sd.c that uses async_schedule(), and would need the async synchronize. It's floppy.c, it's generic scsi scanning (so scsi tapes etc), and it's libata-core.c. As discussed previously,

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Alan Stern
On Tue, 15 Jan 2013, Tejun Heo wrote: Hello, Arjan. On Tue, Jan 15, 2013 at 04:25:54PM -0800, Arjan van de Ven wrote: async fundamentally had the concept of a monotonic increasing number, and that you could always wait for "everyone before me". then people (like me) wanted exceptions to

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Oliver Neukum
On Monday 14 January 2013 11:47:57 Ming Lei wrote: [ 181.175323] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 181.183624] modprobe D c04f1920 0 2462 2461 0x [ 181.183685] [c04f1920] (__schedule+0x5fc/0x6d4) from [c005eba4]

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Ming Lei
On Mon, Jan 14, 2013 at 4:22 PM, Oliver Neukum oli...@neukum.org wrote: OK, your trace is totally different. If your hangs are related, as is likely, my explanation goes out of the window. If I run 'shutdown' after unplugging usb storage device, another hang trace same with Alex's can be

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Alex Riesen
On Mon, Jan 14, 2013 at 3:39 AM, Alan Stern st...@rowland.harvard.edu wrote: On Sun, 13 Jan 2013, Oliver Neukum wrote: This is not a USB problem. You need to involve the SCSI people. khubd just stops working because disconnects are processed in its context and the removal deadlocks. The why

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Linus Torvalds
On Sun, Jan 13, 2013 at 11:15 PM, Ming Lei ming@canonical.com wrote: The deadlock problem is caused by calling request_module() inside async function of do_scan_async(), and it was introduced by Linus's below commit: commit d6de2c80e9d758d2e36c21699117db6178c0f517 Author: Linus Torvalds

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Alan Stern
On Mon, 14 Jan 2013, Linus Torvalds wrote: - from view of driver, introducing async_synchronize_full() after do_one_initcall() inside do_init_module() is like a sync probe for drivers built as module, and cause this kind of deadlock easily. So could we revert the commit and fix the

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Linus Torvalds
On Mon, Jan 14, 2013 at 10:04 AM, Alan Stern st...@rowland.harvard.edu wrote: How about skipping that call if the current thread is one of the async helpers? Is it possible to detect when that happens? Or maybe such a check should go inside async_synchronize_full() itself. Do we have some

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Ming Lei
On Tue, Jan 15, 2013 at 1:30 AM, Linus Torvalds torva...@linux-foundation.org wrote: On Sun, Jan 13, 2013 at 11:15 PM, Ming Lei ming@canonical.com wrote: The deadlock problem is caused by calling request_module() inside async function of do_scan_async(), and it was introduced by Linus's

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Ming Lei
On Tue, Jan 15, 2013 at 9:53 AM, Ming Lei ming@canonical.com wrote: I will try to figure out one patch to address the scsi block async probe issue first, and see if it can fix the problem by moving add_disk() into sd_probe() and calling async_synchronize_full_domain(scsi_sd_probe_domain)

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-13 Thread Alex Riesen
On Sat, Jan 12, 2013 at 11:52 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sat, 12 Jan 2013, Alex Riesen wrote: Now, who would be interested to handle this kind of misconfiguration ... So the whole thing was a false alarm? Yes, almost. What about khubd hanging when machine is shutdown?

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-13 Thread Alan Stern
On Sun, 13 Jan 2013, Alex Riesen wrote: On Sat, Jan 12, 2013 at 11:52 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sat, 12 Jan 2013, Alex Riesen wrote: Now, who would be interested to handle this kind of misconfiguration ... So the whole thing was a false alarm? Yes, almost.

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-13 Thread Alex Riesen
On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sun, 13 Jan 2013, Alex Riesen wrote: Yes, almost. What about khubd hanging when machine is shutdown? What about it? I have trouble understanding all the descriptions you have provided so far, because you talk

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-13 Thread Oliver Neukum
On Sunday 13 January 2013 18:42:49 Alex Riesen wrote: On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sun, 13 Jan 2013, Alex Riesen wrote: Yes, almost. What about khubd hanging when machine is shutdown? What about it? I have trouble understanding all the

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-13 Thread Alan Stern
On Sun, 13 Jan 2013, Oliver Neukum wrote: On Sunday 13 January 2013 18:42:49 Alex Riesen wrote: On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sun, 13 Jan 2013, Alex Riesen wrote: Yes, almost. What about khubd hanging when machine is shutdown?

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-13 Thread Ming Lei
On Mon, Jan 14, 2013 at 1:42 AM, Alex Riesen raa.l...@gmail.com wrote: 1. Compile a kernel with deadline elevator as module 2. Boot into it, make sure the elevator is selected (I used elevator=deadline in the kernel command line) 3. Insert a FAT formatted mass storage device in an USB2 port

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-12 Thread Lan Tianyu
On 2013-01-12 15:48:59, Alex Riesen wrote: On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen raa.l...@gmail.com wrote: Hi, the USB stick (a Cruzer Titanium 2GB) was not recognized at any of the USB ports of this system (a System76 lemu4 laptop, XHCI device) after it was removed. If I attempt to

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-12 Thread Alan Stern
On Sat, 12 Jan 2013, Alex Riesen wrote: On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen raa.l...@gmail.com wrote: Hi, the USB stick (a Cruzer Titanium 2GB) was not recognized at any of the USB ports of this system (a System76 lemu4 laptop, XHCI device) after it was removed. If I

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-12 Thread Alex Riesen
On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sat, 12 Jan 2013, Alex Riesen wrote: One more detail: I usually use the noop elevator. That time it was the deadline. And I just reproduced it easily with deadline. I doubt the elevator has anything to do with

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-12 Thread Alex Riesen
On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sat, 12 Jan 2013, Alex Riesen wrote: On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen raa.l...@gmail.com wrote: the USB stick (an Cruzer Titanium 2GB) was not recognized at any of the USB ports of this system (an

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-12 Thread Alex Riesen
On Sat, Jan 12, 2013 at 8:39 PM, Alex Riesen raa.l...@gmail.com wrote: On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sat, 12 Jan 2013, Alex Riesen wrote: One more detail: I usually use the noop elevator. That time it was the deadline. And I just reproduced it

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-12 Thread Alan Stern
On Sat, 12 Jan 2013, Alex Riesen wrote: On Sat, Jan 12, 2013 at 8:39 PM, Alex Riesen raa.l...@gmail.com wrote: On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern st...@rowland.harvard.edu wrote: On Sat, 12 Jan 2013, Alex Riesen wrote: One more detail: I usually use the noop elevator. That time

Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-11 Thread Alex Riesen
On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen raa.l...@gmail.com wrote: Hi, the USB stick (a Cruzer Titanium 2GB) was not recognized at any of the USB ports of this system (a System76 lemu4 laptop, XHCI device) after it was removed. If I attempt to insert it again in any of the ports (one