Re: [PATCH] net: pegasus: Remove deprecated create_singlethread_workqueue
On Tue, Aug 30, 2016 at 10:02:47PM +0530, Bhaktipriya Shridhar wrote: > The workqueue "pegasus_workqueue" queues a single work item per pegasus > instance and hence it doesn't require execution ordering. Hence, > alloc_workqueue has been used to replace the deprecated > create_singlethread_workqueue instance. > > The WQ_MEM_RECLAIM flag has been set to ensure forward progress under > memory pressure since it's a network driver. > > Since there are fixed number of work items, explicit concurrency > limit is unnecessary here. > > Signed-off-by: Bhaktipriya Shridhar <bhaktipriy...@gmail.com> Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] whci: Remove deprecated create_singlethread_workqueue
On Sat, Aug 13, 2016 at 09:20:40PM +0530, Bhaktipriya Shridhar wrote: > alloc_ordered_workqueue replaces the deprecated > create_singlethread_workqueue. > > The workqueue "workqueue" has multiple workitems which may require > ordering. Hence, a dedicated ordered workqueue has been used. > Since the workqueue is not being used on a memory reclaim path, > WQ_MEM_RECLAIM has not been set. > > Signed-off-by: Bhaktipriya Shridhar <bhaktipriy...@gmail.com> Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] usb: host: u132-hcd: Remove deprecated create_singlethread_workqueue
Hello, On Tue, Aug 02, 2016 at 03:29:48PM +0200, Oliver Neukum wrote: > Apparently if that is the requirement USB will have to define its own > set of flags to use in such contexts. But still the calls as currently > executed work. Taking away WQ_MEM_RECLAIM would create danger of > introducing a regression. The issue with __GFP_DIRECT_RECLAIM already > exists and can be fixed. Alright, let's go with WQ_MEM_RECLAIM then. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] usb: host: u132-hcd: Remove deprecated create_singlethread_workqueue
Hello, On Mon, Aug 01, 2016 at 03:50:36PM +0200, Michal Hocko wrote: > > All that would do is deferring the deadlock, right? I'm not sure it > > makes a lot of sense to protect an IO path against memory pressure > > half-way. It either can be depended during memory reclaim or it > > can't. > > Completely agreed! If the rescuer thread can block on a memory > allocation be it GFP_NOIO or others it is basically useless. ... > > Can MM people please chime in? The question is about USB stoage > > devices and memory reclaim. USB doesn't guarantee forward progress > > under memory pressure but tries a best-effort attempt with GFP_NOIO > > and ATOMIC. Is this the right thing to do? > > If any real IO depends on those devices then this is not sufficient and > they need some form of guarantee for progress (aka mempool). Oliver, Alan, what do you think? If USB itself can't operate without allocating memory during transactions, whatever USB storage drivers are doing isn't all that meaningful. Can we proceed with the workqueue patches? Also, it could be that the only thing GFP_NOIO and GFP_ATOMIC are doing is increasing the chance of IO failures under memory pressure. Maybe it'd be a good idea to reconsider the approach? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: lvstest: Remove deprecated create_singlethread_workqueue
On Thu, Jul 28, 2016 at 02:15:11PM +0530, Bhaktipriya Shridhar wrote: > The workqueue has a single work item(>rh_work) and hence > doesn't require ordering. Also, it is not being used on a memory > reclaim path. Hence, the singlethreaded workqueue has been replaced > with the use of system_wq. > > System workqueues have been able to handle high level of concurrency > for a long time now and hence it's not required to have a singlethreaded > workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue > created with create_singlethread_workqueue(), system_wq allows multiple > work items to overlap executions even on the same CPU; however, a > per-cpu workqueue doesn't have any CPU locality or global ordering > guarantee unless the target CPU is explicitly specified and thus the > increase of local concurrency shouldn't make any difference. > > The work item has been flushed in lvs_rh_disconnect() to ensure that > there are no pending tasks while disconnecting the driver. > > Signed-off-by: Bhaktipriya Shridhar <bhaktipriy...@gmail.com> Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: dwc2: Remove deprecated create_singlethread_workqueue
On Thu, Jul 28, 2016 at 01:57:29PM +0530, Bhaktipriya Shridhar wrote: > alloc_ordered_workqueue replaces the deprecated > create_singlethread_workqueue. > > There are multiple work items on the work queue, which require > ordering. Hence, an ordered workqueue has been used. > > The workqueue "wq_otg" is not being used on a memory reclaim path. > Hence, WQ_MEM_RECLAIM has not been set. > > Signed-off-by: Bhaktipriya Shridhar <bhaktipriy...@gmail.com> Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] usb: host: u132-hcd: Remove deprecated create_singlethread_workqueue
Hello, Alan. On Wed, Jul 27, 2016 at 04:45:19PM -0400, Alan Stern wrote: > > Hmm... That doesn't really make them dependable during memory reclaim. > > True. But it does mean that they can't cause a deadlock by waiting > indefinitely for some other memory to be paged out to the very device > they are on the access pathway for. > > > What happens when those allocations fail? > > The same thing that happens when any allocation fails -- the original > I/O request fails with -ENOMEM or the equivalent. In the case of > usb-storage, this is likely to trigger error recovery, which will need > to allocate memory of its own... A bad situation to get into. All that would do is deferring the deadlock, right? I'm not sure it makes a lot of sense to protect an IO path against memory pressure half-way. It either can be depended during memory reclaim or it can't. The use of GFP_NOIO / ATOMIC is probably increases the chance of IO errors under moderate memory pressure too when there are dependable memory backing devices (and there better be) which can push things forward if called upon. Can MM people please chime in? The question is about USB stoage devices and memory reclaim. USB doesn't guarantee forward progress under memory pressure but tries a best-effort attempt with GFP_NOIO and ATOMIC. Is this the right thing to do? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] usb: host: u132-hcd: Remove deprecated create_singlethread_workqueue
Hello, Alan. On Wed, Jul 27, 2016 at 02:54:51PM -0400, Alan Stern wrote: > > Hmm... I didn't know the whole USB stack could operate without > > allocating memory. Does usb stack have mempools and stuff all the way > > through? > > No -- the USB stack does need to allocate memory in order to operate. > But it is careful to use GFP_NOIO or GFP_ATOMIC for allocations that > might be on the block-device path. Hmm... That doesn't really make them dependable during memory reclaim. What happens when those allocations fail? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] usb: host: u132-hcd: Remove deprecated create_singlethread_workqueue
Hello, Oliver. On Wed, Jul 27, 2016 at 11:29:56AM +0200, Oliver Neukum wrote: > On Wed, 2016-07-27 at 14:50 +0530, Bhaktipriya Shridhar wrote: > > The workqueue "workqueue" has multiple workitems which may require > > ordering. Hence, a dedicated ordered workqueue has been used. > > Since the workqueue is not being used on a memory reclaim path, > > WQ_MEM_RECLAIM has not been set. > > That is incorrect. The work queue is used by the HCD to handle > TDs, which are parts of basic IO. The HCD in turn is used by > usb-storage and uas, which are block drivers and those are obviously > used on the memory reclaim path. Hmm... I didn't know the whole USB stack could operate without allocating memory. Does usb stack have mempools and stuff all the way through? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] USB: appledisplay: Remove deprecated create_singlethread_workqueue
On Tue, Jul 26, 2016 at 10:19:18PM +0530, Bhaktipriya Shridhar wrote: > The workqueue "wq" is involved in controlling the brightness of an > Apple Cinema Display over USB. > > It has a single work item(>work) per appledisplay and hence > doesn't require ordering. Also, it is not being used on a memory > reclaim path. > > Hence, the singlethreaded workqueue has been replaced with the use of > system_wq. > > System workqueues have been able to handle high level of concurrency > for a long time now and hence it's not required to have a singlethreaded > workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue > created with create_singlethread_workqueue(), system_wq allows multiple > work items to overlap executions even on the same CPU; however, a > per-cpu workqueue doesn't have any CPU locality or global ordering > guarantee unless the target CPU is explicitly specified and thus the > increase of local concurrency shouldn't make any difference. > > The work item is self-requeueing and needs to wait for the in-flight > work item to finish before proceeding with destruction. > Hence, it has been sync cancelled in appledisplay_disconnect(). > This also ensures that there are no pending tasks while disconnecting > the driver. > > Signed-off-by: Bhaktipriya Shridhar <bhaktipriy...@gmail.com> Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] USB: appledisplay: Remove deprecated create_singlethread_workqueue
On Tue, Jul 26, 2016 at 11:11:18AM +0530, Bhaktipriya Shridhar wrote: > The workqueue "wq" is involved in controlling the brightness of an > Apple Cinema Display over USB. > > It has a single work item(>work) per appledisplay and hence > doesn't require ordering. Also, it is not being used on a memory > reclaim path. > > Hence, the singlethreaded workqueue has been replaced with the use of > system_wq. > > System workqueues have been able to handle high level of concurrency > for a long time now and hence it's not required to have a singlethreaded > workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue > created with create_singlethread_workqueue(), system_wq allows multiple > work items to overlap executions even on the same CPU; however, a > per-cpu workqueue doesn't have any CPU locality or global ordering > guarantee unless the target CPU is explicitly specified and thus the > increase of local concurrency shouldn't make any difference. > > The work item is self-requeueing and needs to wait for the in-flight > work item to finish before proceeding with destruction. > Hence, it has been sync cancelled in appledisplay_disconnect(). > This also ensures that there are no pending tasks while disconnecting the > driver. > > Signed-off-by: Bhaktipriya Shridhar <bhaktipriy...@gmail.com> ... > @@ -159,7 +158,7 @@ static int appledisplay_bl_update_status(struct > backlight_device *bd) > pdata->msgdata, 2, > ACD_USB_TIMEOUT); > mutex_unlock(>sysfslock); > - > + ??? Other than that, Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: ftdi-elan: Remove deprecated create_singlethread_workqueue
On Tue, Jul 26, 2016 at 10:47:20AM +0530, Bhaktipriya Shridhar wrote: > The status workqueue is involved in initializing the Uxxx and polling > the Uxxx until a supported PCMCIA CardBus device is detected. > It then starts the command and respond workqueues and then loads the > module that handles the device, after which it just polls the Uxxx > looking for card ejects. > > The command and respond workqueues are involved in implementing a command > sequencer for communicating with the firmware on the other side of > the FTDI chip in the Uxxx. > > These workqueues have only a single work item each and hence they do not > require ordering. Also, none of the above workqueues are being used on a > memory recliam path. Hence, the singlethreaded workqueues have been > replaced with the use of system_wq. > > System workqueues have been able to handle high level of concurrency > for a long time now and hence it's not required to have a singlethreaded > workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue > created with create_singlethread_workqueue(), system_wq allows multiple > work items to overlap executions even on the same CPU; however, a > per-cpu workqueue doesn't have any CPU locality or global ordering > guarantee unless the target CPU is explicitly specified and thus the > increase of local concurrency shouldn't make any difference. > > The work items have been sync cancelled because they are self-requeueing > and need to wait for the in-flight work item to finish before proceeding > with destruction. Hence, they have been sync cancelled in > ftdi_status_cancel_work(), ftdi_command_cancel_work() and > ftdi_response_cancel_work(). These functions are called in > ftdi_elan_exit() to ensure that there are no pending work items while > disconnecting the driver. > > Signed-off-by: Bhaktipriya Shridhar <bhaktipriy...@gmail.com> Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
Hello, Jan, Alan. On Tue, Mar 15, 2016 at 10:25:43AM +0100, Jan Kara wrote: > > The kernel does suspend device drivers; that is, it invokes their > > suspend callbacks. But it doesn't "freeze" them in any sense. Once a > > driver has been suspended, it assumes it won't receive any I/O requests > > until it has been resumed. Therefore the kernel first has to prevent > > all the upper layers from generating such requests and/or sending them > > to the low-level drivers. > > OK, so Tejun and you should talk together because you both seem to want > something else... If I understand it right, Tejun wants suspended devices > to just queue requests that have been submitted after these devices were > suspended and complete them once they are resumed... Yeah, I suppose that's why we have the code base we do now. I don't think freezing kernel threads is the right mechanism to plug IO devices during suspend. It's way too error-prone and causes a dependency nightmare as it acts essentially as a system-wide lock. More complex drivers already plug themselves which are necessary no matter what as upper layers or some kthreads aren't the only sources of commands to devices. We can plug at block layer for IOs coming down from higher layers. We can even provide a mechanism to plug certain kthreads if necessary but they should be contained in the driver - e.g. the suspend callback specifically blocking certain specific kthreads - instead of the vague "the system is generally stopped now and it seems to work most of the time" that we're doing now. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
Hello, Jan. On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote: > > Ugh... that's nasty. I wonder whether the right thing to do is making > > writeback workers non-freezable. IOs are supposed to be blocked from > > lower layer anyway. Jan, what do you think? > > Well no, at least currently IO is not blocked in lower layers AFAIK - for > that you'd need to freeze block devices & filesystems and there are issues At least libata does and I think SCSI does too, but yeah, there probably are drivers which depend on block layer blocking IOs, which btw is a pretty fragile way to go about as upper layers might not be the only source of activities. > with that (Jiri Kosina was the last one which was trying to make this work > AFAIR). And I think you need to stop writeback (and generally any IO) to be > generated so that it doesn't interact in a strange way with device drivers > being frozen. So IMO until suspend freezes filesystems & devices properly > you have to freeze writeback workqueue. I still think the right thing to do is plugging that block layer or low level drivers. It's like we're trying to plug multiple sources when we can plug the point where they come together anyway. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
Hello, (cc'ing Jan) On Fri, Feb 26, 2016 at 02:19:20PM +0800, Peter Chen wrote: > On Thu, Feb 25, 2016 at 05:01:12PM -0500, Tejun Heo wrote: > > Hello, Peter. > > > > On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote: > > > > You might want to complain to the block-layer people about this. I > > > > don't know if anything can be done to fix it. > > > > > > > > Or maybe flush_work and flush_delayed_work can be changed to avoid > > > > blocking if the workqueue is frozen. Tejun? > > > > > > > > > > I have a patch to show the root cause of this issue. > > > > > > http://www.spinics.net/lists/linux-usb/msg136815.html > > > > I don't get it. Why would it deadlock? Shouldn't things get rolling > > once the workqueues are thawed? > > The workqueue writeback can't be thawed due to driver's resume > (dpm_complete) is lock nested, and can't be finished. Ugh... that's nasty. I wonder whether the right thing to do is making writeback workers non-freezable. IOs are supposed to be blocked from lower layer anyway. Jan, what do you think? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Freezable workqueue blocks non-freezable workqueue during the system resume process
Hello, Peter. On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote: > > You might want to complain to the block-layer people about this. I > > don't know if anything can be done to fix it. > > > > Or maybe flush_work and flush_delayed_work can be changed to avoid > > blocking if the workqueue is frozen. Tejun? > > > > I have a patch to show the root cause of this issue. > > http://www.spinics.net/lists/linux-usb/msg136815.html I don't get it. Why would it deadlock? Shouldn't things get rolling once the workqueues are thawed? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/23] usb-gadget: use per-attribute show and store methods
On Wed, Sep 30, 2015 at 11:32:19AM -0500, Felipe Balbi wrote: > On Wed, Sep 30, 2015 at 12:20:46PM -0400, Tejun Heo wrote: > > On Wed, Sep 30, 2015 at 11:19:25AM -0500, Felipe Balbi wrote: > > > On Mon, Sep 28, 2015 at 03:35:14PM +0200, Christoph Hellwig wrote: > > > > On Sun, Sep 27, 2015 at 10:50:53AM -0500, Felipe Balbi wrote: > > > > > this (and the other helper below) could be macros just fine. > > > > > > > > They could, but they shouldn't. Inlines are always preferable over > > > > function-like macros. > > > > > > says who ? And why ? > > > > Documentation/CodingStyle > > container_of() is type-safe, what is an inline function bringing as benefit ? It's a general preference. Because there's enough benefit to going with inline functions and there's extra benefit to be gained from having consistent style of code and documentation, as a general rule, we prefer inline functions over macros. If you have specific technical arguments why macro is better, sure; otherwise, follow the conventions for consistency if for nothing else. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/23] usb-gadget: use per-attribute show and store methods
On Wed, Sep 30, 2015 at 11:19:25AM -0500, Felipe Balbi wrote: > On Mon, Sep 28, 2015 at 03:35:14PM +0200, Christoph Hellwig wrote: > > On Sun, Sep 27, 2015 at 10:50:53AM -0500, Felipe Balbi wrote: > > > this (and the other helper below) could be macros just fine. > > > > They could, but they shouldn't. Inlines are always preferable over > > function-like macros. > > says who ? And why ? Documentation/CodingStyle -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/23] usb-gadget: use per-attribute show and store methods
On Wed, Sep 30, 2015 at 11:43:01AM -0500, Felipe Balbi wrote: > Seems like there are *ton* of uses of container_of() wrapped within a simple > macro. What convention are you talking about, again ? The convention of using inline functions over macros when possible. We don't do that all the time but that's where we wanna be in general. > And again, what benefit is an inline function bringing in this specific > case ? As for a technical reason, we know the macro definition will be > copied Verbatim into the caller body. GCC might decide to not inline > those helpers (unlikely, but could). That's really grasping at straws. Let's not go there. For simple stuff like container_of(), a more valid reason would be "it's shorter and sweeter and AFAICS doesn't have any known downsides of using macros" but this ultimately is a bike-shedding problem. You asked where and why we said we prefer inline functions so that's the answer (aside from individual technical advantages). It's not an absolute rule but we're better off if we try to keep things consistent if possible. As for this specific case, I don't know. I might do the macro too just because it's less typing and I don't really care but at the same time I'd just switch to inline if somebody points it out. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/23] configfs: add show and store methods to struct configfs_attribute
Hello, Christoph. On Fri, Sep 25, 2015 at 06:49:38AM -0700, Christoph Hellwig wrote: > Add methods to struct configfs_attribute to directly show and store > attributes without adding boilerplate code to every user. In addition > to the methods this also adds 3 helper macros to define read/write, > read-only and write-only attributes with a single line of code. This looks great to me. It doesn't look like the whole attribute thing played out as hoped. It'd be great if we can move away from it. Thanks a lot for the work! -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv3 1/4] phy: phy-core: Make GENERIC_PHY an invisible option
On Fri, May 29, 2015 at 06:07:18PM +0530, Kishon Vijay Abraham I wrote: Tejun, Maxime, Sylwester, Kyungmin On Thursday 23 April 2015 04:34 AM, Arun Ramamurthy wrote: Most of the phy providers use select to enable GENERIC_PHY. Since select is only recommended when the config is not visible, GENERIC_PHY is changed an invisible option. To maintain consistency, all phy providers are changed to select GENERIC_PHY and all non-phy drivers use depends on when the phy framework is explicity required. USB_MUSB_OMAP2PLUS has a cyclic dependency, so it is left as select. Signed-off-by: Arun Ramamurthy arun.ramamur...@broadcom.com Need your ACK for this patch. For ATA part, Acked-by: Tejun Heo t...@kernel.org Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 24/32] usb: use %*pb[l] to print bitmaps including cpumasks and nodemasks
printk and friends can now formap bitmaps using '%*pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. * drivers/uwb/drp.c::uwb_drp_handle_alien_drp() was formatting mas.bm into a buffer but never used it. Removed. This patch is dependent on the following two patches. lib/vsprintf: implement bitmap printing through '%*pb[l]' cpumask, nodemask: implement cpumask/nodemask_pr_args() Please wait till the forementioned patches are merged to mainline before applying to subsystem trees. Signed-off-by: Tejun Heo t...@kernel.org Cc: Andrew Morton a...@linux-foundation.org Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: linux-usb@vger.kernel.org --- drivers/usb/host/whci/debug.c | 7 ++- drivers/usb/wusbcore/reservation.c | 5 ++--- drivers/usb/wusbcore/wa-rpipe.c| 5 ++--- drivers/usb/wusbcore/wusbhc.c | 7 ++- drivers/uwb/drp.c | 2 -- drivers/uwb/uwb-debug.c| 16 +--- 6 files changed, 13 insertions(+), 29 deletions(-) diff --git a/drivers/usb/host/whci/debug.c b/drivers/usb/host/whci/debug.c index ba61dae..774b89d 100644 --- a/drivers/usb/host/whci/debug.c +++ b/drivers/usb/host/whci/debug.c @@ -86,17 +86,14 @@ static void qset_print(struct seq_file *s, struct whc_qset *qset) static int di_print(struct seq_file *s, void *p) { struct whc *whc = s-private; - char buf[72]; int d; for (d = 0; d whc-n_devices; d++) { struct di_buf_entry *di = whc-di_buf[d]; - bitmap_scnprintf(buf, sizeof(buf), -(unsigned long *)di-availability_info, UWB_NUM_MAS); - seq_printf(s, DI[%d]\n, d); - seq_printf(s, availability: %s\n, buf); + seq_printf(s, availability: %*pb\n, + UWB_NUM_MAS, (unsigned long *)di-availability_info); seq_printf(s, %c%c key idx: %d dev addr: %d\n, (di-addr_sec_info WHC_DI_SECURE) ? 'S' : ' ', (di-addr_sec_info WHC_DI_DISABLE) ? 'D' : ' ', diff --git a/drivers/usb/wusbcore/reservation.c b/drivers/usb/wusbcore/reservation.c index d5efd0f..7b1b2e2 100644 --- a/drivers/usb/wusbcore/reservation.c +++ b/drivers/usb/wusbcore/reservation.c @@ -49,14 +49,13 @@ static void wusbhc_rsv_complete_cb(struct uwb_rsv *rsv) struct wusbhc *wusbhc = rsv-pal_priv; struct device *dev = wusbhc-dev; struct uwb_mas_bm mas; - char buf[72]; dev_dbg(dev, %s: state = %d\n, __func__, rsv-state); switch (rsv-state) { case UWB_RSV_STATE_O_ESTABLISHED: uwb_rsv_get_usable_mas(rsv, mas); - bitmap_scnprintf(buf, sizeof(buf), mas.bm, UWB_NUM_MAS); - dev_dbg(dev, established reservation: %s\n, buf); + dev_dbg(dev, established reservation: %*pb\n, + UWB_NUM_MAS, mas.bm); wusbhc_bwa_set(wusbhc, rsv-stream, mas); break; case UWB_RSV_STATE_NONE: diff --git a/drivers/usb/wusbcore/wa-rpipe.c b/drivers/usb/wusbcore/wa-rpipe.c index a80c5d2..c7ecdbe 100644 --- a/drivers/usb/wusbcore/wa-rpipe.c +++ b/drivers/usb/wusbcore/wa-rpipe.c @@ -496,10 +496,9 @@ void wa_rpipes_destroy(struct wahc *wa) struct device *dev = wa-usb_iface-dev; if (!bitmap_empty(wa-rpipe_bm, wa-rpipes)) { - char buf[256]; WARN_ON(1); - bitmap_scnprintf(buf, sizeof(buf), wa-rpipe_bm, wa-rpipes); - dev_err(dev, BUG: pipes not released on exit: %s\n, buf); + dev_err(dev, BUG: pipes not released on exit: %*pb\n, + wa-rpipes, wa-rpipe_bm); } kfree(wa-rpipe_bm); } diff --git a/drivers/usb/wusbcore/wusbhc.c b/drivers/usb/wusbcore/wusbhc.c index 3e1ba51..94f401a 100644 --- a/drivers/usb/wusbcore/wusbhc.c +++ b/drivers/usb/wusbcore/wusbhc.c @@ -496,11 +496,8 @@ static void __exit wusbcore_exit(void) { clear_bit(0, wusb_cluster_id_table); if (!bitmap_empty(wusb_cluster_id_table, CLUSTER_IDS)) { - char buf[256]; - bitmap_scnprintf(buf, sizeof(buf), wusb_cluster_id_table, -CLUSTER_IDS); - printk(KERN_ERR BUG: WUSB Cluster IDs not released - on exit: %s\n, buf); + printk(KERN_ERR BUG: WUSB Cluster IDs not released on exit: %*pb\n, + CLUSTER_IDS, wusb_cluster_id_table); WARN_ON(1); } usb_unregister_notify(wusb_usb_notifier); diff --git a/drivers/uwb/drp.c b/drivers/uwb/drp.c index 05b7bd7..8fc1b78 100644 --- a/drivers/uwb/drp.c +++ b/drivers/uwb/drp.c @@ -619,11 +619,9 @@ static void uwb_drp_handle_alien_drp(struct uwb_rc *rc, struct uwb_ie_drp
Re: [PATCH v4 00/11] scsi: fix module reference mismatch for scsi host
Hello, Akinobu. On Tue, Jan 20, 2015 at 11:57:37PM +0900, Akinobu Mita wrote: The reason I didn't move sht from the core driver to the LLDDs for fixing ufs and ums-* in the first place is to avoid exporting many symbols for callbacks in sht. But I realized that we can do it without that many exported symbols by creating a single function that returns a kmemdup()ed sht with a few change including -module. Hmmm, libata already exports most of the necessary symbols. libahci or platform drivers might have to export more but that shouldn't be much. For libata, pushing sht's to the leaf drivers would make far more sense as sht's already get inherited and modified along the hierarchy. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 00/11] scsi: fix module reference mismatch for scsi host
On Mon, Jan 19, 2015 at 12:05:58AM +0900, Akinobu Mita wrote: While accessing a scsi_device, the use count of the underlying LLDD module is incremented. The module reference is retrieved through .module field of struct scsi_host_template. This mapping between scsi_device and underlying LLDD module works well except some drivers which consist with the core driver and the actual LLDDs and scsi_host_template is defined in the core driver. In these cases, the actual LLDDs can be unloaded even if the scsi_device is being accessed. This patch series fixes the module reference mismatch problem for ufs, usb-storage, esp_scsi, ahci_platform, and pata_platform drivers by moving owner module reference field from struct scsi_host_template to struct Scsi_Host and allowing the LLDDs to set their correct module reference. Hmmm, can't we just move sht definitions to actual LLDDs? libata can do that easily. sht, as its name implies, is the template for creating the scsi_hosts of a given type. We're now just moving module ownership from sht definition site to whatever callsite the actual instance is being created which can also be wrapped in a separate layer requiring explicit propagation. Why not just propagate sht's directly? What's the difference? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 4/4] usb: hub: rename khubd to hub_wq in documentation and comments
On Thu, Sep 18, 2014 at 06:15:02PM +0200, Petr Mládek wrote: The easiest solution would be to allocate the work queue with the flag WQ_UNBOUND and max_active = 1. It will force serialization of all work items. Please use alloc_ordered_workqueue() for that purpose. WQ_UNBOUND + max_active == 1 no longer guarantees single thread of execution. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 4/4] usb: hub: rename khubd to hub_wq in documentation and comments
Hello, Alan, Petr. On Wed, Sep 17, 2014 at 01:36:26PM -0400, Alan Stern wrote: - /* If khubd ever becomes multithreaded, this will need a lock */ + /* If hub_wq ever becomes multithreaded, this will need a lock */ if (udev-wusb) { devnum = udev-portnum + 1; BUG_ON(test_bit(devnum, bus-devmap.devicemap)); You probably didn't notice when changing this comment. But in fact, workqueues _are_ multithreaded. Therefore you need to add a lock to this routine. Haven't read the code but if this function is called from a single work_struct, workqueue guarantees that there's only single thread of execution at any given time. A work item is never executed concurrently no matter what. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] usb: hub: convert khubd into workqueue
On Fri, Sep 12, 2014 at 02:21:05PM +0200, Petr Mladek wrote: There is no need to have separate kthread for handling USB hub events. It is more elegant to use the workqueue framework. The workqueue is allocated as unbound, cpu intensive, and freezable. I'd just go with WQ_FREEZABLE. As a general rule of thumb, please only use attributes which are strictly necessary. If the default behavior turns out to be not so good for typical wq usages (and AFAICS khubd seems fairly typical), we should be changing the default behavior. Each usage deviating complicates the long-term prospect. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
On Mon, Apr 28, 2014 at 08:39:47AM +0800, Li Zhong wrote: Yes, maybe try to get the module reference is not bad before writing to driver attributes, as it doesn't make much sense to really call the callbacks for the driver attribute if the driver is being unload. Please don't do that spuriously. Active protection is the primary mechanism for that sort of protection and adding spurious things just make them confusing. And after we get the reference, it is safe for us to break the active. But if we don't have such real cases(lockdep warnings), we actually don't need to break it. Yeah, for cases where active protection should be broken, other measures should be taken to prevent the underlying data structure / code from going away. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
On Thu, Apr 24, 2014 at 04:29:15PM +0800, Li Zhong wrote: On Wed, 2014-04-23 at 10:19 -0400, Tejun Heo wrote: cc'ing Li Zhong who's working on a simliar issue in the following thread and quoting whole body. http://thread.gmane.org/gmane.linux.kernel/1680706 Li, this is another variation of the same problem. Maybe this can be covered by your work too? It seems to me that it is about write something to driver attribute, and driver unloading. If so, maybe it's not easy to reuse the help functions created for device attribute, and device removing. But I guess the idea to break the active protection could still be applied here: Maybe we could try_module_get() here (like the other option suggested by Johan?), and break active protection if we could get the module, something like below? I don't get why try_module_get() matters here. We can't call into -store if the object at hand is already destroyed and the underlying module can't go away if the target device is still alive. try_module_get() doesn't actually protect the object. Why does that matter? This is self removal, right? Can you please take a look at kernfs_remove_self()? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] USB: serial: fix sysfs-attribute removal deadlock
cc'ing Li Zhong who's working on a simliar issue in the following thread and quoting whole body. http://thread.gmane.org/gmane.linux.kernel/1680706 Li, this is another variation of the same problem. Maybe this can be covered by your work too? Thanks. On Wed, Apr 23, 2014 at 11:32:19AM +0200, Johan Hovold wrote: Fix driver new_id sysfs-attribute removal deadlock by making sure to not hold any locks that the attribute operations grab when removing the attribute. Specifically, usb_serial_deregister holds the table mutex when deregistering the driver, which includes removing the new_id attribute. This can lead to a deadlock as writing to new_id increments the attribute's active count before trying to grab the same mutex in usb_serial_probe. The deadlock can easily be triggered by inserting a sleep in usb_serial_deregister and writing the id of an unbound device to new_id during module unload. As the table mutex (in this case) is used to prevent subdriver unload during probe, it should be sufficient to only hold the lock while manipulating the usb-serial driver list during deregister. A racing probe will then either fail to find a matching subdriver or fail to get the corresponding module reference. Since v3.15-rc1 this also triggers the following lockdep warning: == [ INFO: possible circular locking dependency detected ] 3.15.0-rc2 #123 Tainted: GW --- modprobe/190 is trying to acquire lock: (s_active#4){.+}, at: [c0167aa0] kernfs_remove_by_name_ns+0x4c/0x94 but task is already holding lock: (table_lock){+.+.+.}, at: [bf004d84] usb_serial_deregister+0x3c/0x78 [usbserial] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: - #1 (table_lock){+.+.+.}: [c0075f84] __lock_acquire+0x1694/0x1ce4 [c0076de8] lock_acquire+0xb4/0x154 [c03af3cc] _raw_spin_lock+0x4c/0x5c [c02bbc24] usb_store_new_id+0x14c/0x1ac [bf007eb4] new_id_store+0x68/0x70 [usbserial] [c025f568] drv_attr_store+0x30/0x3c [c01690e0] sysfs_kf_write+0x5c/0x60 [c01682c0] kernfs_fop_write+0xd4/0x194 [c010881c] vfs_write+0xbc/0x198 [c0108e4c] SyS_write+0x4c/0xa0 [c000f880] ret_fast_syscall+0x0/0x48 - #0 (s_active#4){.+}: [c03a7a28] print_circular_bug+0x68/0x2f8 [c0076218] __lock_acquire+0x1928/0x1ce4 [c0076de8] lock_acquire+0xb4/0x154 [c0166b70] __kernfs_remove+0x254/0x310 [c0167aa0] kernfs_remove_by_name_ns+0x4c/0x94 [c0169fb8] remove_files.isra.1+0x48/0x84 [c016a2fc] sysfs_remove_group+0x58/0xac [c016a414] sysfs_remove_groups+0x34/0x44 [c02623b8] driver_remove_groups+0x1c/0x20 [c0260e9c] bus_remove_driver+0x3c/0xe4 [c026235c] driver_unregister+0x38/0x58 [bf007fb4] usb_serial_bus_deregister+0x84/0x88 [usbserial] [bf004db4] usb_serial_deregister+0x6c/0x78 [usbserial] [bf005330] usb_serial_deregister_drivers+0x2c/0x4c [usbserial] [bf016618] usb_serial_module_exit+0x14/0x1c [sierra] [c009d6cc] SyS_delete_module+0x184/0x210 [c000f880] ret_fast_syscall+0x0/0x48 other info that might help us debug this: Possible unsafe locking scenario: CPU0CPU1 lock(table_lock); lock(s_active#4); lock(table_lock); lock(s_active#4); *** DEADLOCK *** 1 lock held by modprobe/190: #0: (table_lock){+.+.+.}, at: [bf004d84] usb_serial_deregister+0x3c/0x78 [usbserial] stack backtrace: CPU: 0 PID: 190 Comm: modprobe Tainted: GW 3.15.0-rc2 #123 [c0015e10] (unwind_backtrace) from [c0013728] (show_stack+0x20/0x24) [c0013728] (show_stack) from [c03a9a54] (dump_stack+0x24/0x28) [c03a9a54] (dump_stack) from [c03a7cac] (print_circular_bug+0x2ec/0x2f8) [c03a7cac] (print_circular_bug) from [c0076218] (__lock_acquire+0x1928/0x1ce4) [c0076218] (__lock_acquire) from [c0076de8] (lock_acquire+0xb4/0x154) [c0076de8] (lock_acquire) from [c0166b70] (__kernfs_remove+0x254/0x310) [c0166b70] (__kernfs_remove) from [c0167aa0] (kernfs_remove_by_name_ns+0x4c/0x94) [c0167aa0] (kernfs_remove_by_name_ns) from [c0169fb8] (remove_files.isra.1+0x48/0x84) [c0169fb8] (remove_files.isra.1) from [c016a2fc] (sysfs_remove_group+0x58/0xac) [c016a2fc] (sysfs_remove_group) from [c016a414] (sysfs_remove_groups+0x34/0x44) [c016a414] (sysfs_remove_groups) from [c02623b8] (driver_remove_groups+0x1c/0x20) [c02623b8] (driver_remove_groups) from [c0260e9c] (bus_remove_driver+0x3c/0xe4) [c0260e9c] (bus_remove_driver) from [c026235c] (driver_unregister+0x38/0x58) [c026235c] (driver_unregister) from [bf007fb4] (usb_serial_bus_deregister+0x84/0x88 [usbserial]) [bf007fb4]
Re: [PATCH libata/for-3.15-fixes] libata: drop COMPILE_TEST from AHCI_XGENE
Hello, On Thu, Apr 03, 2014 at 04:32:24PM +0200, Bartlomiej Zolnierkiewicz wrote: So, apparently, this isn't enough as this would allow enabling PHY_XGENE regardless of HAS_IOMEM or OF. From kconfig-language.txt, PHY_XGENE has the following dependencies: depends on HAS_IOMEM OF (ARM64 || COMPILE_TEST) So it should be OK. ... Please note that ARM64 implies that HAS_IOMEM and OF are _always_ selected. This is the reason why removing || COMPILE_TEST from AHCI_XGENE dependencies is sufficient to fix the issue (though I have to admit that relying indirectly on ARM64 selects is a bit hacky). This being okay is so not obvious. Let's either float all dependencies to the higher level drivers or just use depends on. I really dislike mixing the two and playing games with the subtle combinations. For now, I'm making it all depends on. Please feel free to improve on it as necessary. Hmmm, the alternative idea is to remove PHY_XGENE reference from AHCI_XGENE altogether and add: default y if AHCI_XGENE to PHY_XGENE instead (which seems to be not as hacky as solution with select PHY_XGENE and also more user-friendly than depends on PHY_XGENE one). The config dependencies should be indicative the actual dependency. AHCI_XGENE is useless without PHY_XGENE; then, the config dependency *must* represent that. That's the priority. It is true that users probably won't care about PHY_XGENE or any PHY drivers probably, but, again, let's please not mix depends and selects in contrived manner to hit the just right mixture of the two. Let's please do it systematically. Float all actual deps to the surface to the options that the users configuring care about and propagate down to phy drivers using select. Our config system *really* doesn't work with mixed depends and selects. I mean, think about the suggested solution. It works only because there's no actual linkage dependency between the two, ARM64 implies HAS_IOMEM OF, AND we omitted COMPILE_TEST for a completely obscure reason. I don't think that's acceptable. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH libata/for-3.15-fixes] libata: drop COMPILE_TEST from AHCI_XGENE
On Wed, Apr 02, 2014 at 11:53:57AM -0400, Tejun Heo wrote: config AHCI_XGENE tristate APM X-Gene 6.0Gbps AHCI SATA host controller support - depends on ARM64 || COMPILE_TEST + depends on ARM64 select PHY_XGENE So, an alternative would be doing select PHY_XGENE if ARM64 butI don't think it'd be worthwhile to go above and beyond for COMPILE_TEST and it might cause other linkage issues down the road. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH libata/for-3.15-fixes] libata: drop COMPILE_TEST from AHCI_XGENE
On Wed, Apr 02, 2014 at 11:53:57AM -0400, Tejun Heo wrote: Applied the following patch to libata/for-3.15-fixes. Thanks. --- 8 --- From 9c23f2cf7f6e107e85eef57fdf3049a93b6e157c Mon Sep 17 00:00:00 2001 From: Tejun Heo t...@kernel.org Date: Wed, 2 Apr 2014 11:47:04 -0400 AHCI_XGENE is only applicable on ARM64 but it can also be enabled for compile testing; however, AHCI_XGENE selects PHY_XGENE which has other arch specific dependencies. This leads to the following warning when enabling it on other archs for compile testing. warning: (AHCI_XGENE) selects PHY_XGENE which has unmet direct dependencies (HAS_IOMEM OF (ARM64 || COMPILE_TEST)) Let's drop COMPILE_TEST from AHCI_XGENE. Signed-off-by: Tejun Heo t...@kernel.org Reported-by: Linus Torvalds torva...@linux-foundation.org Cc: Loc Ho l...@apm.com Cc: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com So, apparently, this isn't enough as this would allow enabling PHY_XGENE regardless of HAS_IOMEM or OF. From kconfig-language.txt, Note: select should be used with care. select will force a symbol to a value without visiting the dependencies. By abusing select you are able to select a symbol FOO even if FOO depends on BAR that is not set. In general use select only for non-visible symbols (no prompts anywhere) and for symbols with no dependencies. That will limit the usefulness but on the other hand avoid the illegal configurations all over. We can add all the necessary dependencies to AHCI_XGENE but I think the the right thing to do is turning it into a proper dependency. Will prep another patch. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH libata/for-3.15-fixes] libata: make AHCI_XGENE depend on PHY_XGENE
AHCI_XGENE is only applicable on ARM64 but it can also be enabled for compile testing; however, AHCI_XGENE selects PHY_XGENE which has other arch specific dependencies. This leads to the following warning when enabling it on other archs for compile testing. warning: (AHCI_XGENE) selects PHY_XGENE which has unmet direct dependencies (HAS_IOMEM OF (ARM64 || COMPILE_TEST)) Selecting a config option which itself has dependencies can easily lead to broken configurations. For now, let's just make AHCI_XGENE depend on PHY_XGENE which has all the necessary dependencies already. Signed-off-by: Tejun Heo t...@kernel.org Reported-by: Linus Torvalds torva...@linux-foundation.org Cc: Loc Ho l...@apm.com Cc: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com Cc: Kishon Vijay Abraham I kis...@ti.com --- drivers/ata/Kconfig |3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig index 20e03a7..2e4da3b 100644 --- a/drivers/ata/Kconfig +++ b/drivers/ata/Kconfig @@ -134,8 +134,7 @@ config AHCI_SUNXI config AHCI_XGENE tristate APM X-Gene 6.0Gbps AHCI SATA host controller support - depends on ARM64 || COMPILE_TEST - select PHY_XGENE + depends on PHY_XGENE help This option enables support for APM X-Gene SoC SATA host controller. -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 5/9] usb: don't use PREPARE_DELAYED_WORK
Hello, If this is actually safe, let's do it from the get-go. Thanks! --- 8 --- PREPARE_[DELAYED_]WORK() are being phased out. They have few users and a nasty surprise in terms of reentrancy guarantee as workqueue considers work items to be different if they don't have the same work function. usb_hub-init_work is multiplexed with multiple work functions; however, the work item is never queued while in-flight, so we can simply use INIT_DELAYED_WORK() before each queueing. It would probably be best to route this with other related updates through the workqueue tree. Lightly tested. v2: Greg and Alan confirm that the work item is never queued while in-flight. Simply use INIT_DELAYED_WORK(). Signed-off-by: Tejun Heo t...@kernel.org Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Alan Stern st...@rowland.harvard.edu Cc: linux-usb@vger.kernel.org --- drivers/usb/core/hub.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -1040,7 +1040,7 @@ static void hub_activate(struct usb_hub */ if (type == HUB_INIT) { delay = hub_power_on(hub, false); - PREPARE_DELAYED_WORK(hub-init_work, hub_init_func2); + INIT_DELAYED_WORK(hub-init_work, hub_init_func2); schedule_delayed_work(hub-init_work, msecs_to_jiffies(delay)); @@ -1194,7 +1194,7 @@ static void hub_activate(struct usb_hub /* Don't do a long sleep inside a workqueue routine */ if (type == HUB_INIT2) { - PREPARE_DELAYED_WORK(hub-init_work, hub_init_func3); + INIT_DELAYED_WORK(hub-init_work, hub_init_func3); schedule_delayed_work(hub-init_work, msecs_to_jiffies(delay)); return; /* Continues at init3: below */ -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 5/9] usb: don't use PREPARE_DELAYED_WORK
Hey, Alan. On Sat, Feb 22, 2014 at 06:03:04PM -0500, Alan Stern wrote: then a single init work could be queued to the system_unbound_wq which doesn't care about running times. This sort of thing sounds like the best approach. Tejun, do you want to rewrite the patch, getting rid of the hub_init_func3 and HUB_INIT3 business entirely? Or would you like me to do it? I'll doing the minimal patch in this series and then following up with the update is probably better. I can put the patches in a separate branch so that it can be easily pulled. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/9] usb: don't use PREPARE_DELAYED_WORK
On Fri, Feb 21, 2014 at 10:06:05AM -0500, Alan Stern wrote: I think it should be fine to use INIT_DELAYED_WORK(), but Alan would know best. Alan? That's right; INIT_DELAYED_WORK() should be fine. Provided there's no problem doing it from within the previous work routine. Yeah, that's completely fine. Once a work item starts execution, workqueue doesn't care about it at all - even freeing and recycling it is fine. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/9] usb: don't use PREPARE_DELAYED_WORK
PREPARE_[DELAYED_]WORK() are being phased out. They have few users and a nasty surprise in terms of reentrancy guarantee as workqueue considers work items to be different if they don't have the same work function. usb_hub-init_work is multiplexed with multiple work functions. Introduce hub_init_workfn() which invokes usb_hub-init_workfn and always use it as the work function and update the users to set the -init_workfn field instead of overriding the work function using PREPARE_DELAYED_WORK(). It looks like that the work items are never queued while in-flight, so simply using INIT_DELAYED_WORK() before each queueing could be enough. This patch performs equivalent conversion just in case but we probably wanna clean it up later if that's the case. It would probably be best to route this with other related updates through the workqueue tree. Lightly tested. Signed-off-by: Tejun Heo t...@kernel.org Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: linux-usb@vger.kernel.org --- drivers/usb/core/hub.c | 13 ++--- drivers/usb/core/hub.h | 1 + 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index 64ea219..2bc61c0 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -1040,7 +1040,7 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type) */ if (type == HUB_INIT) { delay = hub_power_on(hub, false); - PREPARE_DELAYED_WORK(hub-init_work, hub_init_func2); + hub-init_workfn = hub_init_func2; schedule_delayed_work(hub-init_work, msecs_to_jiffies(delay)); @@ -1194,7 +1194,7 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type) /* Don't do a long sleep inside a workqueue routine */ if (type == HUB_INIT2) { - PREPARE_DELAYED_WORK(hub-init_work, hub_init_func3); + hub-init_workfn = hub_init_func3; schedule_delayed_work(hub-init_work, msecs_to_jiffies(delay)); return; /* Continues at init3: below */ @@ -1634,6 +1634,13 @@ static void hub_disconnect(struct usb_interface *intf) kref_put(hub-kref, hub_release); } +static void hub_init_workfn(struct work_struct *work) +{ + struct usb_hub *hub = container_of(to_delayed_work(work), + struct usb_hub, init_work); + hub-init_workfn(work); +} + static int hub_probe(struct usb_interface *intf, const struct usb_device_id *id) { struct usb_host_interface *desc; @@ -1728,7 +1735,7 @@ descriptor_error: hub-intfdev = intf-dev; hub-hdev = hdev; INIT_DELAYED_WORK(hub-leds, led_work); - INIT_DELAYED_WORK(hub-init_work, NULL); + INIT_DELAYED_WORK(hub-init_work, hub_init_workfn); usb_get_intf(intf); usb_set_intfdata (intf, hub); diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h index df629a3..ef81463 100644 --- a/drivers/usb/core/hub.h +++ b/drivers/usb/core/hub.h @@ -72,6 +72,7 @@ struct usb_hub { unsignedhas_indicators:1; u8 indicator[USB_MAXCHILDREN]; struct delayed_work leds; + work_func_t init_workfn; struct delayed_work init_work; struct usb_port **ports; }; -- 1.8.5.3 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH driver-core-linus] kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag
kernfs_deactivate() forgot to check whether KERNFS_LOCKDEP is set before performing lockdep annotations and ends up feeding uninitialized lockdep_map to lockdep triggering warning like the following on USB stick hotunplug. usb 1-2: USB disconnect, device number 2 INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 1 PID: 62 Comm: khubd Not tainted 3.13.0-work+ #82 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 880065ca7f60 88013a4ffa08 81cfb6bd 0002 88013a4ffac8 810f8530 88013a4fc710 0002 8801 82a3db50 0001 88013a4fc710 Call Trace: [81cfb6bd] dump_stack+0x4e/0x7a [810f8530] __lock_acquire+0x1910/0x1e70 [810f931a] lock_acquire+0x9a/0x1d0 [8127c75e] kernfs_deactivate+0xee/0x130 [8127d4c8] kernfs_addrm_finish+0x38/0x60 [8127d701] kernfs_remove_by_name_ns+0x51/0xa0 [8127b4f1] remove_files.isra.1+0x41/0x80 [8127b7e7] sysfs_remove_group+0x47/0xa0 [8127b873] sysfs_remove_groups+0x33/0x50 [8177d66d] device_remove_attrs+0x4d/0x80 [8177e25e] device_del+0x12e/0x1d0 [819722c2] usb_disconnect+0x122/0x1a0 [819749b5] hub_thread+0x3c5/0x1290 [810c6a6d] kthread+0xed/0x110 [81d0a56c] ret_from_fork+0x7c/0xb0 Fix it by making kernfs_deactivate() perform lockdep annotations only if KERNFS_LOCKDEP is set. Signed-off-by: Tejun Heo t...@kernel.org Reported-by: Fabio Estevam feste...@gmail.com Reported-by: Alan Stern st...@rowland.harvard.edu --- fs/kernfs/dir.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 5104cf5..bd6e18b 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -187,19 +187,23 @@ static void kernfs_deactivate(struct kernfs_node *kn) kn-u.completion = (void *)wait; - rwsem_acquire(kn-dep_map, 0, 0, _RET_IP_); + if (kn-flags KERNFS_LOCKDEP) + rwsem_acquire(kn-dep_map, 0, 0, _RET_IP_); /* atomic_add_return() is a mb(), put_active() will always see * the updated kn-u.completion. */ v = atomic_add_return(KN_DEACTIVATED_BIAS, kn-active); if (v != KN_DEACTIVATED_BIAS) { - lock_contended(kn-dep_map, _RET_IP_); + if (kn-flags KERNFS_LOCKDEP) + lock_contended(kn-dep_map, _RET_IP_); wait_for_completion(wait); } - lock_acquired(kn-dep_map, _RET_IP_); - rwsem_release(kn-dep_map, 1, _RET_IP_); + if (kn-flags KERNFS_LOCKDEP) { + lock_acquired(kn-dep_map, _RET_IP_); + rwsem_release(kn-dep_map, 1, _RET_IP_); + } } /** -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [usb-storage] UAS hangs khubd on USB disconnect
Hello, guys. (cc'ing Greg) On Fri, Dec 13, 2013 at 01:19:36PM -0500, Alan Stern wrote: On Fri, 13 Dec 2013, Sarah Sharp wrote: Given the way things work now, I suspect these warnings are truly harmless. We could simply get rid of the WARN in sysfs_remove_group. The alternative is to call device_del for SCSI targets earlier on, such as when their hosts are unregistered. I don't know how James would feel about this approach. It would be difficult because targets use their own reference counts instead of relying on the usual device refcounting mechanism. Thanks for looking into this. I think just getting rid of the WARN would be sufficient. Can you make a patch for that? Easily. The downside is that there would no longer be any warning when someone tries to remove a wrong subdirectory by mistake. The patch still won't help with the UAS issues with scsi_init_shared_tag_map though. I wasn't clear on the reason for that problem. Does it also arise from late device_del for scsi_target? I could try to change the way that works, if anybody (Hans?) would like to test it. While the recent sysfs changes made this issue more visible, Greg wants to make sure that devices are removed from leaf up in all cases and keep the warning to ensure that. Would there be a way fix SCSI removal ordering? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
Hello, On Tue, Dec 03, 2013 at 08:28:43AM -0600, Josh Hunt wrote: You're right. Thanks for pointing this out. I did not realize there was a bug in the init script. The version of initramfs-tools I was using had the following bug: https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1215911 Updating to 0.99ubuntu13.4 of initramfs-tools resolved my boot hangs. I did try using the workaround as suggested by Linus. In my setup the dm_init() code was hit, however it still appeared to be too late at times. I also tried moving the call to async_synchronize_full() above the for loop and it still had the same issue (patch attached.) Out of around 10 reboot tests it failed to find root 1 or 2 times. The ubuntu scripts don't ever actually call do_mount() if it can't find the device. It seems to rely on some udev functionality to tell it when the device is present, and if that fails it just bails out. This change has introduced a regression. However, I only noticed it b/c my init script had a bug which caused it not to wait around for the device to appear. Hmmm so, read the bug report, digged and asked around a bit. Here's the root problem - ubuntu's initramfs uses a tool to wait for the root device which uses libudev to listen for the device event; unfortunately, its rx buffer is not set large enough and the receiver isn't fast enough, which means that netlink broadcast messages from the kernel can overrun the buffer. When that happens, it sets an error on the socket, so the next recv fails with -ENOBUFS. If that happens, the wait for root aborts immediately and initramfs proceeds to mount non-existent root device. The only thing which changes by these patches is the timing of events. The problem likely wasn't as exposed before because things were slow enough so that either the messages could be consumed fast enough or there's enough delay between libata module load and the root device wait hiding the bug in the wait logic. So, yeah, it's a full blown timing bug. I'm not sure what we can do to work around from kernel side except for randomly slowing things down or forcefully enlarging rx buffer size. There really is no interlocking to take advantage of. :( Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 1/2] lib/scatterlist: export sg_miter_skip()
On Tue, Nov 26, 2013 at 12:43:37PM +0800, Ming Lei wrote: sg_copy_buffer() can't meet demand for some drrivers(such usb mass storage), so we have to use the sg_miter_* APIs to access sg buffer, then need export sg_miter_skip() for these drivers. The API is needed for converting to sg_miter_* APIs in USB storage driver for accessing sg buffer. Acked-by: Andrew Morton a...@linux-foundation.org Cc: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp Cc: Tejun Heo t...@kernel.org Cc: Jens Axboe ax...@kernel.dk Signed-off-by: Ming Lei ming@canonical.com Reviewed-by: Tejun Heo t...@kernel.org I suppose this should go through -mm? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
Hello, On Tue, Nov 26, 2013 at 04:12:41PM -0600, Josh Hunt wrote: I should have clarified that I'm not using dm/md in my setup. I know the modules are getting loaded in the log I attached, but root is not a md/dm device. Can you please still try it? The init script is broken and we're now just trying to restore just enough of the old behavior so that the issue is not exposed. The boot script in use seems to load md/dm modules after storage drivers and use their termination as the signal for storage ready, so it could be a good enough bandaid even if you're not using dm/md. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 39/51] DMA-API: others: use dma_set_coherent_mask()
On Fri, Sep 20, 2013 at 12:11:38AM +0100, Russell King wrote: The correct way for a driver to specify the coherent DMA mask is not to directly access the field in the struct device, but to use dma_set_coherent_mask(). Only arch and bus code should access this member directly. Convert all direct write accesses to using the correct API. Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk Acked-by: Tejun Heo t...@kernel.org The patch is pretty widely spread. I don't mind how it gets routed but what's the plan? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 39/51] DMA-API: others: use dma_set_coherent_mask()
On Fri, Sep 20, 2013 at 07:16:52AM -0500, Tejun Heo wrote: On Fri, Sep 20, 2013 at 12:11:38AM +0100, Russell King wrote: The correct way for a driver to specify the coherent DMA mask is not to directly access the field in the struct device, but to use dma_set_coherent_mask(). Only arch and bus code should access this member directly. Convert all direct write accesses to using the correct API. Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk Acked-by: Tejun Heo t...@kernel.org The patch is pretty widely spread. I don't mind how it gets routed but what's the plan? Hmm... maybe hte pata_ixp4xx_cf.c part should be moved to the one which updates pata_octeon_cf.c? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 39/51] DMA-API: others: use dma_set_coherent_mask()
Hey, On Fri, Sep 20, 2013 at 03:00:18PM +0100, Russell King - ARM Linux wrote: Another would be if subsystem maintainers are happy that I carry them, I can add the acks, and then later on towards the end of the cycle, provide a branch subsystem maintainers could pull. Or... if you can think of something easier... I'm happy with the latter method and it's likely that you'll end up carrying at least some of the patches through your tree anyway. Please feel free to add my acks to all libata related patches and carry them through your tree. Thanks and have fun routing. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
Hello, Jonathan. On Mon, Aug 12, 2013 at 12:04:11AM -0700, Jonathan Nieder wrote: My laptop fails to boot[1] with the message 'Volume group data not found'. Bisects to v3.8-rc4~17 (the above commit). Reverting that commit on top of current master (d92581fcad18, 2013-08-10) produces a working kernel. dmesg output from that working kernel attached. More details, including .config, at [2]. Any ideas for tracking this down? Which initrd / boot script are you using? It looks like lvm assemble scripts are running before sdX are detected leading to volume assembly failure. Before the patch, any module loading would end up synchronizing async probes but after the patch modprobe invocations which don't schedule them won't be. Does your boot script happen to run multiple modprobes in parallel and proceed to configure lvm without waiting for modprobes of libata drivers to finish? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] chipidea: Use devm_request_irq()
Hello, On Wed, Jul 31, 2013 at 11:44:34AM +0200, Uwe Kleine-König wrote: OK, so the possible problem is that remove is called while the irq is still active. That means you have to assert that all resources the irq If your driver destruction path is running while your irq handler is still running, it's a crappy / broken driver. You need a deactivation step whether you're using devm or not. IRQs can be shared and the device should be in a quiesced state before the driver detaches itself. Note that you can queue deactivation routine using devm. For an example, please take a look at drivers/ata/libata-core.c::ata_host_start(). handler is using (e.g. ioremap, clk_prepare_enable) are only freed *after* the irq is done. For ioremap that means it must be done using devm_ioremap_resource. For a clock it's not that easy because the irq handler has to assert that a used clk is kept prepared which can only be done using clk_prepare which in turn is not allowed in an irq handler. Hmm. So the only possible fixes are - devm* can be told to also care about clk_disable_unprepare - after disabling irqs in the remove callback wait for all active irqs to be done. (i.e. call synchronize_irq(irq)) - don't use devm_request_irq Again, the right thing to do is having a proper deactivation step. This is nothing devm can do automatically. There's no way for it to find out that the device is actually quiesced. Let's say it waits for the current instance of irq handler to finish. How would it know that it won't start again between the flushing of the current instance and irq deregistration. Add an explicit deactivation step using devres_alloc(). I'm not sure that devm_ guarantees any ordering in the cleanups it does so I'd not like to rely on the first option either, if there were some guarantee of that it'd help. The nice thing about explicitly freeing the IRQ is that you can tell that all this stuff is safe by inspection. devm_* at least uses list_for_each_entry_reverse (drivers/base/devres.c:release_nodes()). Without this guarantee devm_ would not make much sense IMHO. devm guarantees that the destruction callbacks are called in the reverse order of registration. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] chipidea: Use devm_request_irq()
On Wed, Jul 31, 2013 at 12:28:53PM +0200, Uwe Kleine-König wrote: Well, you cannot avoid assuming that the irq is still active when your driver's remove callback is called. But I agree about crappyness at the end of the destruction path. The problem is that crap is as easy as: probe(..) { clk = devm_get_clk(...); clk_prepare_enable(clk); writel(1, base + IRQENABLE); devm_request_irq(...); } remove(..) { writel(0, base + IRQENABLE); clk_disable_unprepare(clk); } and I think there are more and more drivers doing that. Oh, so, the problem is that the driver is mixing devm and non-devm resource management and ending up messing the order of shutdown? Well, the obvious solution is using devm for clk too. devm does provide constructs to build custom destruction sequence so that such calls can be mixed but it's a bit of hassle and mostly meant for driver midlayers (libata, firewire, usb and so on) so that they can provide nicely wrapped init/exit helpers for low level drivers. In general, partial devm conversion can easily lead to subtle shutdown order problems and it's best to go all the way. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] chipidea: Use devm_request_irq()
Hello, On Wed, Jul 31, 2013 at 12:18:53PM +0100, Mark Brown wrote: I'm not sure I understand how this relates the problem. The main issue here is that for the shared IRQ case quiescing the device doesn't make any difference since one of the other users of the interrupt could cause the interrupt handler to be called regardless of what the hardware is doing. This means that we need to guarantee that anything the interrupt handler relies on has not been deallocated before the interrupt handler is unregistered. Yeah, if all resources are allocated using devm - note that you can hook in non-devm resources using devres_alloc() - all resources which would be necessary for the interrupt handler would have been allocated before the irq was allocated, right? And thus they'll of course released after the IRQ is freed. The problem arises when devm and non-devm releases are mixed as non-devm ones would happen before all devm ones messing up the release sequencing. OK, that's helpful. It'd be good to document this if it's something the API is intending to guarantee, though - devres.txt doesn't mention Oh, it's definitely guaranteed. Nothing would work otherwise. this and it's not something I'd intuitively expect to be the case. Oops... I guess I forgot to mention that. Care to submit a patch? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] chipidea: Use devm_request_irq()
On Wed, Jul 31, 2013 at 12:50:27PM +0100, Mark Brown wrote: Most things would work just fine - most of the uses of devm_ are just resource allocations that can safely be freed in essentially any order. It doesn't really matter if you free the driver's private structure before you free the clock that's pointing to it or whatever since neither has any real connection to the other. If you have DMA / IRQ / command engine deactivations in devm path which often is the case with full conversions, freeing any resources including DMA areas and host private data in the wrong order is a horrible idea. It's worse as it won't really be noticeable in most cases. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] chipidea: Use devm_request_irq()
Hello, On Wed, Jul 31, 2013 at 02:27:08PM +0100, Mark Brown wrote: It's really only interrupts that affect most devices - if there's DMA or anything going on after the remove() then as you said earlier the driver is probably doing something wrong. Hmmm... it depends on the specific driver is converted but if the deactivation sequence - shutting down of command engine - is also handled by devm as in libata and if you have non-devres resource free in the exit path, you have the same problem. Again, in general, tearing things down in the order which isn't the reverse of allocation is a bad idea. Adhering to the order isn't that hard and will result in far less craziness in the long term. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] chipidea: Use devm_request_irq()
On Wed, Jul 31, 2013 at 02:57:51PM +0100, Mark Brown wrote: That's the only API I've ever heard of doing that. Everything else is just using it to do deallocation. I'm not sure why or what you're trying to argue here but take a look at devm_pwm_release() for example. It calls back into low level driver free routine. Are you arguing that it'd be a good idea to release pci regions before this is complete? It's just stupid to do any differently. There's nothing to argue about. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] chipidea: Use devm_request_irq()
hello, On Wed, Jul 31, 2013 at 09:55:26PM +0800, Peter Chen wrote: I think the main point is we should allocate managed resource which is used at interrupt handler before devm_request_irq, and all resources used at interrupt handler should be managed. If we use non-managed resource at interrupt handler, but using managed interrupt handler, things still will go wrong if there is an odd (unexpected) interrupt after we finish deactivation at removal. In general, applying devm partially isn't a good idea. It's very easy to get into trouble thanks to release order dependency which isn't immediately noticeable and there have been actual bugs caused by that. The strategies which seem to work are either * Convert everything over to devm by wrapping deactivation in a devres callback too. As long as your init sequence is sane (ie. irq shouldn't be request before things used by irq are ready). * Allocate all resources using devres but shut down the execution engine in the remove_one(). Again, as all releases are controlled by devres, you won't have to worry about messing up the release sequencing. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] chipidea: Use devm_request_irq()
On Wed, Jul 31, 2013 at 04:25:23PM +0100, Mark Brown wrote: What I'm saying is that in essentially all the users I've seen devm is only being used for things like kfree() or clk_put() which aren't really connected in any way and can happen in any order. This (coupled with the lack of documentation that this is supported) is why people are nervous about anything that relies on ordering with this stuff - aside from ATA everything is just using this for straight frees. Yeah, sure, thank you very much for your input. It is of course strictly ordered and the documentation needs to be updated. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] async: fix __lowest_in_progress()
083b804c4d3e1e3d0eace56bdbc0f674946d2847 (async: use workqueue for worker pool) made it possible that async jobs are moved from pending to running out-of-order. While pending async jobs will be queued and dispatched for execution in the same order, nothing guarantees they'll enter 1) move self to the running queue of async_run_entry_fn() in the same order. Before the conversion, async implemented its own worker pool. An async worker, upon being woken up, fetches the first item from the pending list, which kept the executing lists sorted. The conversion to workqueue was done by adding work_struct to each async_entry and async just schedules the work item. The queueing and dispatching of such work items are still in order but now each worker thread is associated with a specific async_entry and moves that specific async_entry to the executing list. So, depending on which worker reaches that point earlier, which is non-deterministic, we may end up moving an async_entry with larger cookie before one with smaller one. This broke __lowest_in_progress(). running-domain may not be properly sorted and is not guaranteed to contain lower cookies than pending list when not empty. Fix it by ensuring sort-inserting to the running list and always looking at both pending and running when trying to determine the lowest cookie. Over time, the async synchronization implementation became quite messy. We better restructure it such that each async_entry is linked to two lists - one global and one per domain - and not move it when execution starts. There's no reason to distinguish pending and running. They behave the same for synchronization purposes. v2: Description updated to better explain why it's broken. Signed-off-by: Tejun Heo t...@kernel.org Cc: Arjan van de Ven ar...@linux.intel.com Cc: sta...@vger.kernel.org --- Linus, I've updated the description to better explain why it's broken. The code is ugly but cleanup patches are already ready, so it will be cleaned up during 3.9-rc1. How should this be routed? Thanks. kernel/async.c | 27 --- 1 file changed, 20 insertions(+), 7 deletions(-) --- a/kernel/async.c +++ b/kernel/async.c @@ -86,18 +86,27 @@ static atomic_t entry_count; */ static async_cookie_t __lowest_in_progress(struct async_domain *running) { + async_cookie_t first_running = next_cookie; /* infinity value */ + async_cookie_t first_pending = next_cookie; /* ditto */ struct async_entry *entry; + /* +* Both running and pending lists are sorted but not disjoint. +* Take the first cookies from both and return the min. +*/ if (!list_empty(running-domain)) { entry = list_first_entry(running-domain, typeof(*entry), list); - return entry-cookie; + first_running = entry-cookie; } - list_for_each_entry(entry, async_pending, list) - if (entry-running == running) - return entry-cookie; + list_for_each_entry(entry, async_pending, list) { + if (entry-running == running) { + first_pending = entry-cookie; + break; + } + } - return next_cookie; /* infinity value */ + return min(first_running, first_pending); } static async_cookie_t lowest_in_progress(struct async_domain *running) @@ -118,13 +127,17 @@ static void async_run_entry_fn(struct wo { struct async_entry *entry = container_of(work, struct async_entry, work); + struct async_entry *pos; unsigned long flags; ktime_t uninitialized_var(calltime), delta, rettime; struct async_domain *running = entry-running; - /* 1) move self to the running queue */ + /* 1) move self to the running queue, make sure it stays sorted */ spin_lock_irqsave(async_lock, flags); - list_move_tail(entry-list, running-domain); + list_for_each_entry_reverse(pos, running-domain, list) + if (entry-cookie pos-cookie) + break; + list_move_tail(entry-list, pos-list); spin_unlock_irqrestore(async_lock, flags); /* 2) run (and print duration) */ -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 2/2] block: don't request module during elevator init
From 21c3c5d2800733b7a276725b8e1ae49a694adc1a Mon Sep 17 00:00:00 2001 From: Tejun Heo t...@kernel.org Date: Tue, 22 Jan 2013 16:48:03 -0800 Block layer allows selecting an elevator which is built as a module to be selected as system default via kernel param elevator=. This is achieved by automatically invoking request_module() whenever a new block device is initialized and the elevator is not available. This led to an interesting deadlock problem involving async and module init. Block device probing running off an async job invokes request_module(). While the module is being loaded, it performs async_synchronize_full() which ends up waiting for the async job which is already waiting for request_module() to finish, leading to deadlock. Invoking request_module() from deep in block device init path is already nasty in itself. It seems best to avoid these situations from the beginning by moving on-demand module loading out of block init path. The previous patch made sure that the default elevator module is loaded early during boot if available. This patch removes on-demand loading of the default elevator from elevator init path. As the module would have been loaded during boot, userland-visible behavior difference should be minimal. For more details, please refer to the following thread. http://thread.gmane.org/gmane.linux.kernel/1420814 v2: The bool parameter was named @request_module which conflicted with request_module(). This built okay w/ CONFIG_MODULES because request_module() was defined as a macro. W/o CONFIG_MODULES, it causes build breakage. Rename the parameter to @try_loading. Reported by Fengguang. Signed-off-by: Tejun Heo t...@kernel.org Cc: Jens Axboe ax...@kernel.dk Cc: Arjan van de Ven ar...@linux.intel.com Cc: Linus Torvalds torva...@linux-foundation.org Cc: Alex Riesen raa.l...@gmail.com Cc: Fengguang Wu fengguang...@intel.com --- Minor revision to fix build breakage on !CONFIG_MODULES reported by Fengguang. The patch has been queued to wq/for-3.9-async-deadlock-fixes. Thanks. block/elevator.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/block/elevator.c b/block/elevator.c index c2d61d5..603b2c1 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -100,14 +100,14 @@ static void elevator_put(struct elevator_type *e) module_put(e-elevator_owner); } -static struct elevator_type *elevator_get(const char *name) +static struct elevator_type *elevator_get(const char *name, bool try_loading) { struct elevator_type *e; spin_lock(elv_list_lock); e = elevator_find(name); - if (!e) { + if (!e try_loading) { spin_unlock(elv_list_lock); request_module(%s-iosched, name); spin_lock(elv_list_lock); @@ -207,25 +207,30 @@ int elevator_init(struct request_queue *q, char *name) q-boundary_rq = NULL; if (name) { - e = elevator_get(name); + e = elevator_get(name, true); if (!e) return -EINVAL; } + /* +* Use the default elevator specified by config boot param or +* config option. Don't try to load modules as we could be running +* off async and request_module() isn't allowed from async. +*/ if (!e *chosen_elevator) { - e = elevator_get(chosen_elevator); + e = elevator_get(chosen_elevator, false); if (!e) printk(KERN_ERR I/O scheduler %s not found\n, chosen_elevator); } if (!e) { - e = elevator_get(CONFIG_DEFAULT_IOSCHED); + e = elevator_get(CONFIG_DEFAULT_IOSCHED, false); if (!e) { printk(KERN_ERR Default I/O scheduler not found. \ Using noop.\n); - e = elevator_get(noop); + e = elevator_get(noop, false); } } @@ -967,7 +972,7 @@ int elevator_change(struct request_queue *q, const char *name) return -ENXIO; strlcpy(elevator_name, name, sizeof(elevator_name)); - e = elevator_get(strstrip(elevator_name)); + e = elevator_get(strstrip(elevator_name), true); if (!e) { printk(KERN_ERR elevator: type %s not found\n, elevator_name); return -EINVAL; -- 1.8.1 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 1/2] init, block: try to load default elevator module early during boot
From bb813f4c933ae9f887a014483690d5f8b8ec05e1 Mon Sep 17 00:00:00 2001 From: Tejun Heo t...@kernel.org Date: Fri, 18 Jan 2013 14:05:56 -0800 This patch adds default module loading and uses it to load the default block elevator. During boot, it's called right after initramfs or initrd is made available and right before control is passed to userland. This ensures that as long as the modules are available in the usual places in initramfs, initrd or the root filesystem, the default modules are loaded as soon as possible. This will replace the on-demand elevator module loading from elevator init path. v2: Fixed build breakage when !CONFIG_BLOCK. Reported by kbuild test robot. Signed-off-by: Tejun Heo t...@kernel.org Cc: Jens Axboe ax...@kernel.dk Cc: Arjan van de Ven ar...@linux.intel.com Cc: Linus Torvalds torva...@linux-foundation.org Cc: Alex Riesen raa.l...@gmail.com Cc: Fengguang We fengguang...@intel.com --- Minor revision to fix build breakage on !CONFIG_BLOCK reported by Fengguang. The patch is queued to wq/for-3.9-async-deadlock-fixes. Thanks. block/elevator.c | 16 include/linux/elevator.h | 5 + include/linux/init.h | 1 + init/do_mounts_initrd.c | 3 +++ init/initramfs.c | 8 +++- init/main.c | 16 6 files changed, 48 insertions(+), 1 deletion(-) diff --git a/block/elevator.c b/block/elevator.c index 9edba1b..c2d61d5 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -136,6 +136,22 @@ static int __init elevator_setup(char *str) __setup(elevator=, elevator_setup); +/* called during boot to load the elevator chosen by the elevator param */ +void __init load_default_elevator_module(void) +{ + struct elevator_type *e; + + if (!chosen_elevator[0]) + return; + + spin_lock(elv_list_lock); + e = elevator_find(chosen_elevator); + spin_unlock(elv_list_lock); + + if (!e) + request_module(%s-iosched, chosen_elevator); +} + static struct kobj_type elv_ktype; static struct elevator_queue *elevator_alloc(struct request_queue *q, diff --git a/include/linux/elevator.h b/include/linux/elevator.h index c03af76..1866206 100644 --- a/include/linux/elevator.h +++ b/include/linux/elevator.h @@ -138,6 +138,7 @@ extern void elv_drain_elevator(struct request_queue *); /* * io scheduler registration */ +extern void __init load_default_elevator_module(void); extern int elv_register(struct elevator_type *); extern void elv_unregister(struct elevator_type *); @@ -206,5 +207,9 @@ enum { INIT_LIST_HEAD((rq)-csd.list);\ } while (0) +#else /* CONFIG_BLOCK */ + +static inline void load_default_elevator_module(void) { } + #endif /* CONFIG_BLOCK */ #endif diff --git a/include/linux/init.h b/include/linux/init.h index a799273..9230c94 100644 --- a/include/linux/init.h +++ b/include/linux/init.h @@ -161,6 +161,7 @@ extern unsigned int reset_devices; /* used by init/main.c */ void setup_arch(char **); void prepare_namespace(void); +void __init load_default_modules(void); extern void (*late_time_init)(void); diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c index 5e4ded5..dfe606a 100644 --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -57,6 +57,9 @@ static void __init handle_initrd(void) sys_mkdir(/old, 0700); sys_chdir(/old); + /* try loading default modules from initrd */ + load_default_modules(); + /* * In case that a resume from disk is carried out by linuxrc or one of * its children, we need to tell the freezer not to wait for us. diff --git a/init/initramfs.c b/init/initramfs.c index 84c6bf1..a67ef9d 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -592,7 +592,7 @@ static int __init populate_rootfs(void) initrd_end - initrd_start); if (!err) { free_initrd(); - return 0; + goto done; } else { clean_rootfs(); unpack_to_rootfs(__initramfs_start, __initramfs_size); @@ -607,6 +607,7 @@ static int __init populate_rootfs(void) sys_close(fd); free_initrd(); } + done: #else printk(KERN_INFO Unpacking initramfs...\n); err = unpack_to_rootfs((char *)initrd_start, @@ -615,6 +616,11 @@ static int __init populate_rootfs(void) printk(KERN_EMERG Initramfs unpacking failed: %s\n, err); free_initrd(); #endif + /* +* Try loading default modules from initramfs. This gives +* us a chance to load before device_initcalls. +*/ + load_default_modules(); } return 0; } diff --git a/init/main.c b/init/main.c index baf1f0f..18efadb 100644 --- a/init
[PATCH 1/5] workqueue: set PF_WQ_WORKER on rescuers
From 111c225a5f8d872bc9327ada18d13b75edaa34be Mon Sep 17 00:00:00 2001 From: Tejun Heo t...@kernel.org Date: Thu, 17 Jan 2013 17:16:24 -0800 PF_WQ_WORKER is used to tell scheduler that the task is a workqueue worker and needs wq_worker_sleeping/waking_up() invoked on it for concurrency management. As rescuers never participate in concurrency management, PF_WQ_WORKER wasn't set on them. There's a need for an interface which can query whether %current is executing a work item and if so which. Such interface requires a way to identify all tasks which may execute work items and PF_WQ_WORKER will be used for that. As all normal workers always have PF_WQ_WORKER set, we only need to add it to rescuers. As rescuers start with WORKER_PREP but never clear it, it's always NOT_RUNNING and there's no need to worry about it interfering with concurrency management even if PF_WQ_WORKER is set; however, unlike normal workers, rescuers currently don't have its worker struct as kthread_data(). It uses the associated workqueue_struct instead. This is problematic as wq_worker_sleeping/waking_up() expect struct worker at kthread_data(). This patch adds worker-rescue_wq and start rescuer kthreads with worker struct as kthread_data and sets PF_WQ_WORKER on rescuers. Signed-off-by: Tejun Heo t...@kernel.org Cc: Linus Torvalds torva...@linux-foundation.org --- This one is the same as last time. kernel/workqueue.c | 35 --- 1 file changed, 28 insertions(+), 7 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 7967f34..6b99ac7 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -149,6 +149,9 @@ struct worker { /* for rebinding worker to CPU */ struct work_struct rebind_work;/* L: for busy worker */ + + /* used only by rescuers to point to the target workqueue */ + struct workqueue_struct *rescue_wq; /* I: the workqueue to rescue */ }; struct worker_pool { @@ -763,12 +766,20 @@ struct task_struct *wq_worker_sleeping(struct task_struct *task, unsigned int cpu) { struct worker *worker = kthread_data(task), *to_wakeup = NULL; - struct worker_pool *pool = worker-pool; - atomic_t *nr_running = get_pool_nr_running(pool); + struct worker_pool *pool; + atomic_t *nr_running; + /* +* Rescuers, which may not have all the fields set up like normal +* workers, also reach here, let's not access anything before +* checking NOT_RUNNING. +*/ if (worker-flags WORKER_NOT_RUNNING) return NULL; + pool = worker-pool; + nr_running = get_pool_nr_running(pool); + /* this can only happen on the local cpu */ BUG_ON(cpu != raw_smp_processor_id()); @@ -2357,7 +2368,7 @@ sleep: /** * rescuer_thread - the rescuer thread function - * @__wq: the associated workqueue + * @__rescuer: self * * Workqueue rescuer thread function. There's one rescuer for each * workqueue which has WQ_RESCUER set. @@ -2374,20 +2385,27 @@ sleep: * * This should happen rarely. */ -static int rescuer_thread(void *__wq) +static int rescuer_thread(void *__rescuer) { - struct workqueue_struct *wq = __wq; - struct worker *rescuer = wq-rescuer; + struct worker *rescuer = __rescuer; + struct workqueue_struct *wq = rescuer-rescue_wq; struct list_head *scheduled = rescuer-scheduled; bool is_unbound = wq-flags WQ_UNBOUND; unsigned int cpu; set_user_nice(current, RESCUER_NICE_LEVEL); + + /* +* Mark rescuer as worker too. As WORKER_PREP is never cleared, it +* doesn't participate in concurrency management. +*/ + rescuer-task-flags |= PF_WQ_WORKER; repeat: set_current_state(TASK_INTERRUPTIBLE); if (kthread_should_stop()) { __set_current_state(TASK_RUNNING); + rescuer-task-flags = ~PF_WQ_WORKER; return 0; } @@ -2431,6 +2449,8 @@ repeat: spin_unlock_irq(gcwq-lock); } + /* rescuers should never participate in concurrency management */ + WARN_ON_ONCE(!(rescuer-flags WORKER_NOT_RUNNING)); schedule(); goto repeat; } @@ -3266,7 +3286,8 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt, if (!rescuer) goto err; - rescuer-task = kthread_create(rescuer_thread, wq, %s, + rescuer-rescue_wq = wq; + rescuer-task = kthread_create(rescuer_thread, rescuer, %s, wq-name); if (IS_ERR(rescuer-task)) goto err; -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/5] workqueue: rename kernel/workqueue_sched.h to kernel/workqueue_internal.h
From ea138446e51f7bfe55cdeffa3f1dd9cafc786bd8 Mon Sep 17 00:00:00 2001 From: Tejun Heo t...@kernel.org Date: Fri, 18 Jan 2013 14:05:55 -0800 Workqueue wants to expose more interface internal to kernel/. Instead of adding a new header file, repurpose kernel/workqueue_sched.h. Rename it to workqueue_internal.h and add include protector. This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo t...@kernel.org Cc: Linus Torvalds torva...@linux-foundation.org Cc: Ingo Molnar mi...@redhat.com Cc: Peter Zijlstra pet...@infradead.org --- Ingo, Peter, changes to scheduler is trivial. If there's no objection, I'd like to route this with the rest of workqueue and async changes. Thanks. kernel/sched/core.c | 2 +- kernel/workqueue.c | 2 +- kernel/workqueue_internal.h | 18 ++ kernel/workqueue_sched.h| 9 - 4 files changed, 20 insertions(+), 11 deletions(-) create mode 100644 kernel/workqueue_internal.h delete mode 100644 kernel/workqueue_sched.h diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 257002c..c6737f4f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -83,7 +83,7 @@ #endif #include sched.h -#include ../workqueue_sched.h +#include ../workqueue_internal.h #include ../smpboot.h #define CREATE_TRACE_POINTS diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 6b99ac7..b4e9206 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -43,7 +43,7 @@ #include linux/idr.h #include linux/hashtable.h -#include workqueue_sched.h +#include workqueue_internal.h enum { /* diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h new file mode 100644 index 000..b3ea6ad --- /dev/null +++ b/kernel/workqueue_internal.h @@ -0,0 +1,18 @@ +/* + * kernel/workqueue_internal.h + * + * Workqueue internal header file. Only to be included by workqueue and + * core kernel subsystems. + */ +#ifndef _KERNEL_WORKQUEUE_INTERNAL_H +#define _KERNEL_WORKQUEUE_INTERNAL_H + +/* + * Scheduler hooks for concurrency managed workqueue. Only to be used from + * sched.c and workqueue.c. + */ +void wq_worker_waking_up(struct task_struct *task, unsigned int cpu); +struct task_struct *wq_worker_sleeping(struct task_struct *task, + unsigned int cpu); + +#endif /* _KERNEL_WORKQUEUE_INTERNAL_H */ diff --git a/kernel/workqueue_sched.h b/kernel/workqueue_sched.h deleted file mode 100644 index 2d10fc9..000 --- a/kernel/workqueue_sched.h +++ /dev/null @@ -1,9 +0,0 @@ -/* - * kernel/workqueue_sched.h - * - * Scheduler hooks for concurrency managed workqueue. Only to be - * included from sched.c and workqueue.c. - */ -void wq_worker_waking_up(struct task_struct *task, unsigned int cpu); -struct task_struct *wq_worker_sleeping(struct task_struct *task, - unsigned int cpu); -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/5] workqueue: move struct worker definition to workqueue_internal.h
From 2eaebdb33e1911c0cf3d44fd3596c42c6f502fab Mon Sep 17 00:00:00 2001 From: Tejun Heo t...@kernel.org Date: Fri, 18 Jan 2013 14:05:55 -0800 This will be used to implement an inline function to query whether %current is a workqueue worker and, if so, allow determining which work item it's executing. Signed-off-by: Tejun Heo t...@kernel.org Cc: Linus Torvalds torva...@linux-foundation.org --- kernel/workqueue.c | 32 +--- kernel/workqueue_internal.h | 37 + 2 files changed, 38 insertions(+), 31 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index b4e9206..2ffa240 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -122,37 +122,7 @@ enum { * W: workqueue_lock protected. */ -struct global_cwq; -struct worker_pool; - -/* - * The poor guys doing the actual heavy lifting. All on-duty workers - * are either serving the manager role, on idle list or on busy hash. - */ -struct worker { - /* on idle list while idle, on busy hash table while busy */ - union { - struct list_headentry; /* L: while idle */ - struct hlist_node hentry; /* L: while busy */ - }; - - struct work_struct *current_work; /* L: work being processed */ - work_func_t current_func; /* L: current_work's fn */ - struct cpu_workqueue_struct *current_cwq; /* L: current_work's cwq */ - struct list_headscheduled; /* L: scheduled works */ - struct task_struct *task; /* I: worker task */ - struct worker_pool *pool; /* I: the associated pool */ - /* 64 bytes boundary on 64bit, 32 on 32bit */ - unsigned long last_active;/* L: last active timestamp */ - unsigned intflags; /* X: flags */ - int id; /* I: worker id */ - - /* for rebinding worker to CPU */ - struct work_struct rebind_work;/* L: for busy worker */ - - /* used only by rescuers to point to the target workqueue */ - struct workqueue_struct *rescue_wq; /* I: the workqueue to rescue */ -}; +/* struct worker is defined in workqueue_internal.h */ struct worker_pool { struct global_cwq *gcwq; /* I: the owning gcwq */ diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h index b3ea6ad..02549fa 100644 --- a/kernel/workqueue_internal.h +++ b/kernel/workqueue_internal.h @@ -7,6 +7,43 @@ #ifndef _KERNEL_WORKQUEUE_INTERNAL_H #define _KERNEL_WORKQUEUE_INTERNAL_H +#include linux/workqueue.h + +struct global_cwq; +struct worker_pool; + +/* + * The poor guys doing the actual heavy lifting. All on-duty workers are + * either serving the manager role, on idle list or on busy hash. For + * details on the locking annotation (L, I, X...), refer to workqueue.c. + * + * Only to be used in workqueue and async. + */ +struct worker { + /* on idle list while idle, on busy hash table while busy */ + union { + struct list_headentry; /* L: while idle */ + struct hlist_node hentry; /* L: while busy */ + }; + + struct work_struct *current_work; /* L: work being processed */ + work_func_t current_func; /* L: current_work's fn */ + struct cpu_workqueue_struct *current_cwq; /* L: current_work's cwq */ + struct list_headscheduled; /* L: scheduled works */ + struct task_struct *task; /* I: worker task */ + struct worker_pool *pool; /* I: the associated pool */ + /* 64 bytes boundary on 64bit, 32 on 32bit */ + unsigned long last_active;/* L: last active timestamp */ + unsigned intflags; /* X: flags */ + int id; /* I: worker id */ + + /* for rebinding worker to CPU */ + struct work_struct rebind_work;/* L: for busy worker */ + + /* used only by rescuers to point to the target workqueue */ + struct workqueue_struct *rescue_wq; /* I: the workqueue to rescue */ +}; + /* * Scheduler hooks for concurrency managed workqueue. Only to be used from * sched.c and workqueue.c. -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/5] workqueue: implement current_is_async()
From 84b233adcca3cacd5cfa8013a5feda7a3db4a9af Mon Sep 17 00:00:00 2001 From: Tejun Heo t...@kernel.org Date: Fri, 18 Jan 2013 14:05:56 -0800 This function queries whether %current is an async worker executing an async item. This will be used to implement warning on synchronous request_module() from async workers. Signed-off-by: Tejun Heo t...@kernel.org --- include/linux/async.h | 1 + kernel/async.c | 14 ++ kernel/workqueue_internal.h | 11 +++ 3 files changed, 26 insertions(+) diff --git a/include/linux/async.h b/include/linux/async.h index 7a24fe9..345169c 100644 --- a/include/linux/async.h +++ b/include/linux/async.h @@ -52,4 +52,5 @@ extern void async_synchronize_full_domain(struct async_domain *domain); extern void async_synchronize_cookie(async_cookie_t cookie); extern void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain *domain); +extern bool current_is_async(void); #endif diff --git a/kernel/async.c b/kernel/async.c index 9d31183..d9bf2a9 100644 --- a/kernel/async.c +++ b/kernel/async.c @@ -57,6 +57,8 @@ asynchronous and synchronous parts of the kernel. #include linux/slab.h #include linux/workqueue.h +#include workqueue_internal.h + static async_cookie_t next_cookie = 1; #define MAX_WORK 32768 @@ -337,3 +339,15 @@ void async_synchronize_cookie(async_cookie_t cookie) async_synchronize_cookie_domain(cookie, async_running); } EXPORT_SYMBOL_GPL(async_synchronize_cookie); + +/** + * current_is_async - is %current an async worker task? + * + * Returns %true if %current is an async worker task. + */ +bool current_is_async(void) +{ + struct worker *worker = current_wq_worker(); + + return worker worker-current_func == async_run_entry_fn; +} diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h index 02549fa..cc35e7e 100644 --- a/kernel/workqueue_internal.h +++ b/kernel/workqueue_internal.h @@ -8,6 +8,7 @@ #define _KERNEL_WORKQUEUE_INTERNAL_H #include linux/workqueue.h +#include linux/kthread.h struct global_cwq; struct worker_pool; @@ -44,6 +45,16 @@ struct worker { struct workqueue_struct *rescue_wq; /* I: the workqueue to rescue */ }; +/** + * current_wq_worker - return struct worker if %current is a workqueue worker + */ +static inline struct worker *current_wq_worker(void) +{ + if (current-flags PF_WQ_WORKER) + return kthread_data(current); + return NULL; +} + /* * Scheduler hooks for concurrency managed workqueue. Only to be used from * sched.c and workqueue.c. -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/5] async, kmod: warn on synchronous request_module() from async workers
From 4983f3b51e18d008956dd113e0ea2f252774cefc Mon Sep 17 00:00:00 2001 From: Tejun Heo t...@kernel.org Date: Fri, 18 Jan 2013 14:05:57 -0800 Synchronous requet_module() from an async worker can lead to deadlock because module init path may invoke async_synchronize_full(). The async worker waits for request_module() to complete and the module loading waits for the async task to finish. This bug happened in the block layer because of default elevator auto-loading. Block layer has been updated not to do default elevator auto-loading and it has been decided to disallow synchronous request_module() from async workers. Trigger WARN_ON_ONCE() on synchronous request_module() from async workers. For more details, please refer to the following thread. http://thread.gmane.org/gmane.linux.kernel/1420814 Signed-off-by: Tejun Heo t...@kernel.org Reported-by: Alex Riesen raa.l...@gmail.com Cc: Linus Torvalds torva...@linux-foundation.org Cc: Arjan van de Ven ar...@linux.intel.com --- kernel/kmod.c | 9 + 1 file changed, 9 insertions(+) diff --git a/kernel/kmod.c b/kernel/kmod.c index 1c317e3..ecd42b4 100644 --- a/kernel/kmod.c +++ b/kernel/kmod.c @@ -38,6 +38,7 @@ #include linux/suspend.h #include linux/rwsem.h #include linux/ptrace.h +#include linux/async.h #include asm/uaccess.h #include trace/events/module.h @@ -130,6 +131,14 @@ int __request_module(bool wait, const char *fmt, ...) #define MAX_KMOD_CONCURRENT 50 /* Completely arbitrary value - KAO */ static int kmod_loop_msg; + /* +* We don't allow synchronous module loading from async. Module +* init may invoke async_synchronize_full() which will end up +* waiting for this task which already is waiting for the module +* loading to complete, leading to a deadlock. +*/ + WARN_ON_ONCE(wait current_is_async()); + va_start(args, fmt); ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args); va_end(args); -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] init, block: try to load default elevator module early during boot
On Thu, Jan 17, 2013 at 10:46:47AM -0800, Linus Torvalds wrote: On Thu, Jan 17, 2013 at 10:38 AM, Tejun Heo t...@kernel.org wrote: Oh yeah, it's coming. I just wanted to finish something else first and, as turning on PF_WQ_WORKER on a rescuer thread has some chance of developing into an obscure difficult-to-trigger and diagnose problem, don't want to hurry it too much. Ok. I think I'll delay these things for 3.9 anyway, since the actual _problem_ people are seeing should be fixed with your other patch. So I guess it's not really all that critical any more. If you're okay with it, I'll route these two and the patches to add warning through a wq branch. There's already a wq/for-3.9 patch which am_i_async() can make use of, so it's gonna be easier this way. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] init, block: try to load default elevator module early during boot
Hello, Linus. On Thu, Jan 17, 2013 at 10:05:53AM -0800, Linus Torvalds wrote: And here I was really hoping that there was a third patch in the series that added the warning... We don't currently have a am I an async worker helper function for the warning to use, which is something very much up your alley. Oh yeah, it's coming. I just wanted to finish something else first and, as turning on PF_WQ_WORKER on a rescuer thread has some chance of developing into an obscure difficult-to-trigger and diagnose problem, don't want to hurry it too much. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] async: fix __lowest_in_progress()
Hello, On Thu, Jan 17, 2013 at 10:16:50AM -0800, Linus Torvalds wrote: Tejun, mind explaining this one a bit more to me? If ordering matters, and we have a running queue and a pending queue, how could the pending queue *ever* be lower than the running one? So, before being converted to workqueue, async spooled up its own workers and each worker would lock and move the first pending item to the executing list and everything was in order. The conversion to workqueue was done by adding work_struct to each async_entry and async just schedules the work item. The queueing and dispatching of such work items are still in order but now each worker thread is associated with a specific async_entry and move that specific async_entry to the executing list. So, depending on which worker reaches that point earlier, which is completely non-deterministic, we may end up moving an async_entry with larger cookie before one with smaller one. That implies that something was taken off the pending queue and put on the running queue out of order, right? And that in turn implies that there isn't much of a lowest ordering at all, so how could anybody even care about what lowest is? It seems to be a meaningless measure. The execution is still lowest first as workqueue would dispatch workers in queued order. It just is that they can reach the synchronization point at their own differing paces. So with that in mind, I don't see what semantics the first part of the patch fixes. Can you explain more? The problem with the code is that it's keeping a global pending list and domain-specific running lists. I don't know why it developed like this but even before workqueue conversion the code was weird. * It has sorted per-domain running list, so looking at the first item is easy. * It has sorted global pennding list, and looking for first item in a domain involves scanning it. Global syncing ends up scanning all per-domain running lists and domain syncing ends up scanning global pending list, when all we need is each async item to be queued on two lists - global and per-domain in-flight lists - and stay there until done. The posted patch is minimal fix while keeping the basic operation the same so that it doesn't disturb -stable too much. I'll prep a patch to redo synchronization for 3.9. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] workqueue: set PF_WQ_WORKER on rescuers
PF_WQ_WORKER is used to tell scheduler that the task is a workqueue worker and needs wq_worker_sleeping/waking_up() invoked on it for concurrency management. As rescuers never participate in concurrency management, PF_WQ_WORKER wasn't set on them. There's a need for an interface which can query whether %current is executing a work item and if so which. Such interface requires a way to identify all tasks which may execute work items and PF_WQ_WORKER will be used for that. As all normal workers always have PF_WQ_WORKER set, we only need to add it to rescuers. As rescuers start with WORKER_PREP but never clear it, it's always NOT_RUNNING and there's no need to worry about it interfering with concurrency management even if PF_WQ_WORKER is set; however, unlike normal workers, rescuers currently don't have its worker struct as kthread_data(). It uses the associated workqueue_struct instead. This is problematic as wq_worker_sleeping/waking_up() expect struct worker at kthread_data(). This patch adds worker-rescue_wq and start rescuer kthreads with worker struct as kthread_data and sets PF_WQ_WORKER on rescuers. Signed-off-by: Tejun Heo t...@kernel.org Cc: Linus Torvalds torva...@linux-foundation.org --- These three patches implement the warning on synchronous request_module() from async. The first two will go through wq/for-3.9 and the last one through wq/for-3.9-async-deadlock-fixes together with the block layer updates. Thanks. kernel/workqueue.c | 35 --- 1 file changed, 28 insertions(+), 7 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 7967f34..6b99ac7 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -149,6 +149,9 @@ struct worker { /* for rebinding worker to CPU */ struct work_struct rebind_work;/* L: for busy worker */ + + /* used only by rescuers to point to the target workqueue */ + struct workqueue_struct *rescue_wq; /* I: the workqueue to rescue */ }; struct worker_pool { @@ -763,12 +766,20 @@ struct task_struct *wq_worker_sleeping(struct task_struct *task, unsigned int cpu) { struct worker *worker = kthread_data(task), *to_wakeup = NULL; - struct worker_pool *pool = worker-pool; - atomic_t *nr_running = get_pool_nr_running(pool); + struct worker_pool *pool; + atomic_t *nr_running; + /* +* Rescuers, which may not have all the fields set up like normal +* workers, also reach here, let's not access anything before +* checking NOT_RUNNING. +*/ if (worker-flags WORKER_NOT_RUNNING) return NULL; + pool = worker-pool; + nr_running = get_pool_nr_running(pool); + /* this can only happen on the local cpu */ BUG_ON(cpu != raw_smp_processor_id()); @@ -2357,7 +2368,7 @@ sleep: /** * rescuer_thread - the rescuer thread function - * @__wq: the associated workqueue + * @__rescuer: self * * Workqueue rescuer thread function. There's one rescuer for each * workqueue which has WQ_RESCUER set. @@ -2374,20 +2385,27 @@ sleep: * * This should happen rarely. */ -static int rescuer_thread(void *__wq) +static int rescuer_thread(void *__rescuer) { - struct workqueue_struct *wq = __wq; - struct worker *rescuer = wq-rescuer; + struct worker *rescuer = __rescuer; + struct workqueue_struct *wq = rescuer-rescue_wq; struct list_head *scheduled = rescuer-scheduled; bool is_unbound = wq-flags WQ_UNBOUND; unsigned int cpu; set_user_nice(current, RESCUER_NICE_LEVEL); + + /* +* Mark rescuer as worker too. As WORKER_PREP is never cleared, it +* doesn't participate in concurrency management. +*/ + rescuer-task-flags |= PF_WQ_WORKER; repeat: set_current_state(TASK_INTERRUPTIBLE); if (kthread_should_stop()) { __set_current_state(TASK_RUNNING); + rescuer-task-flags = ~PF_WQ_WORKER; return 0; } @@ -2431,6 +2449,8 @@ repeat: spin_unlock_irq(gcwq-lock); } + /* rescuers should never participate in concurrency management */ + WARN_ON_ONCE(!(rescuer-flags WORKER_NOT_RUNNING)); schedule(); goto repeat; } @@ -3266,7 +3286,8 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt, if (!rescuer) goto err; - rescuer-task = kthread_create(rescuer_thread, wq, %s, + rescuer-rescue_wq = wq; + rescuer-task = kthread_create(rescuer_thread, rescuer, %s, wq-name); if (IS_ERR(rescuer-task)) goto err; -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo
[PATCH 2/3] workqueue, async: implement work/async_current_func()
Implement work/async_current_func() which query whether the current task is a workqueue or async worker respectively and, if so, return the current function being executed along with work / async item related information. This will be used to implement warning on synchronous request_module() from async workers. Signed-off-by: Tejun Heo t...@kernel.org Cc: Linus Torvalds torva...@linux-foundation.org Cc: Arjan van de Ven ar...@linux.intel.com --- include/linux/async.h | 2 ++ include/linux/workqueue.h | 1 + kernel/async.c| 25 + kernel/workqueue.c| 22 ++ 4 files changed, 50 insertions(+) diff --git a/include/linux/async.h b/include/linux/async.h index 7a24fe9..6c49157 100644 --- a/include/linux/async.h +++ b/include/linux/async.h @@ -52,4 +52,6 @@ extern void async_synchronize_full_domain(struct async_domain *domain); extern void async_synchronize_cookie(async_cookie_t cookie); extern void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain *domain); +extern async_func_ptr *async_current_func(void **datap, + async_cookie_t *cookiep); #endif diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index 2b58905..984fbef 100644 --- a/include/linux/workqueue.h +++ b/include/linux/workqueue.h @@ -428,6 +428,7 @@ extern void workqueue_set_max_active(struct workqueue_struct *wq, extern bool workqueue_congested(unsigned int cpu, struct workqueue_struct *wq); extern unsigned int work_cpu(struct work_struct *work); extern unsigned int work_busy(struct work_struct *work); +extern work_func_t work_current_func(struct work_struct **workp); /* * Like above, but uses del_timer() instead of del_timer_sync(). This means, diff --git a/kernel/async.c b/kernel/async.c index 9d31183..ed1eda0 100644 --- a/kernel/async.c +++ b/kernel/async.c @@ -337,3 +337,28 @@ void async_synchronize_cookie(async_cookie_t cookie) async_synchronize_cookie_domain(cookie, async_running); } EXPORT_SYMBOL_GPL(async_synchronize_cookie); + +/** + * async_current_func - determine the async entry %current is executing + * @datap: optional out param for the data + * @cookiep: optional out param for the cookie + * + * Determine whether %current is an async worker executing an async_entry + * and if so return the async function and, if @cookiep is not %NULL, the + * cookie. If %current isn't executing an async_entry, %NULL is returned. + */ +async_func_ptr *async_current_func(void **datap, async_cookie_t *cookiep) +{ + struct work_struct *work; + struct async_entry *entry; + + if (work_current_func(work) != async_run_entry_fn) + return NULL; + + entry = container_of(work, struct async_entry, work); + if (datap) + *datap = entry-data; + if (cookiep) + *cookiep = entry-cookie; + return entry-func; +} diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 6b99ac7..9fc1549 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -3492,6 +3492,28 @@ unsigned int work_busy(struct work_struct *work) } EXPORT_SYMBOL_GPL(work_busy); +/** + * work_current_func - determine the work fn and item %current is executing + * @workp: optional out param for the current work item + * + * Determine whether %current is a kworker executing a work item and if so + * return the work function and, if @workp is not %NULL, the work item. If + * %current isn't executing a work item, %NULL is returned. + */ +work_func_t work_current_func(struct work_struct **workp) +{ + struct worker *worker; + + /* am I a kworker? */ + if (!(current-flags PF_WQ_WORKER)) + return NULL; + + worker = kthread_data(current); + if (workp) + *workp = worker-current_work; + return worker-current_func; +} + /* * CPU hotplug. * -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] async, kmod: warn on synchronous request_module() from async workers
Synchronous requet_module() from an async worker can lead to deadlock because module init path may invoke async_synchronize_full(). The async worker waits for request_module() to complete and the module loading waits for the async task to finish. This bug happened in the block layer because of default elevator auto-loading. Block layer has been updated not to do default elevator auto-loading and it has been decided to disallow synchronous request_module() from async workers. Trigger WARN_ON_ONCE() on synchronous request_module() from async workers. For more details, please refer to the following thread. http://thread.gmane.org/gmane.linux.kernel/1420814 Signed-off-by: Tejun Heo t...@kernel.org Reported-by: Alex Riesen raa.l...@gmail.com Cc: Linus Torvalds torva...@linux-foundation.org Cc: Arjan van de Ven ar...@linux.intel.com --- Linus, please note that I dropped system_state == SYSTEM_RUNNING condition from WARN_ON_ONCE() as the deadlock can happen during system init too - e.g. libata probing block device using async making block layer try to load default elevator from initramfs. Thanks. kernel/kmod.c | 9 + 1 file changed, 9 insertions(+) diff --git a/kernel/kmod.c b/kernel/kmod.c index 1c317e3..028287e 100644 --- a/kernel/kmod.c +++ b/kernel/kmod.c @@ -38,6 +38,7 @@ #include linux/suspend.h #include linux/rwsem.h #include linux/ptrace.h +#include linux/async.h #include asm/uaccess.h #include trace/events/module.h @@ -130,6 +131,14 @@ int __request_module(bool wait, const char *fmt, ...) #define MAX_KMOD_CONCURRENT 50 /* Completely arbitrary value - KAO */ static int kmod_loop_msg; + /* +* We don't allow synchronous module loading from async. Module +* init may invoke async_synchronize_full() which will end up +* waiting for this task which already is waiting for the module +* loading to complete, leading to a deadlock. +*/ + WARN_ON_ONCE(wait async_current_func(NULL, NULL)); + va_start(args, fmt); ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args); va_end(args); -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] workqueue, async: implement work/async_current_func()
Hello, Linus. On Thu, Jan 17, 2013 at 06:47:48PM -0800, Linus Torvalds wrote: On Thu, Jan 17, 2013 at 5:25 PM, Tejun Heo t...@kernel.org wrote: Implement work/async_current_func() which query whether the current task is a workqueue or async worker respectively and, if so, return the current function being executed along with work / async item related information. So why the odd interface? The only user of it calls it with a Yeah, I was doing something else in async and arguing between that and current_is_async() and ended up keeping it as it was consistent with the workqueue counterpart. NULL/NULL pair of arguments, and in general it's just way too complex to be an exported function at all. I *suspect* you chose that complex interface because you feel you may have some use for it inside of the async code itself, but why isn't that then not totally private to there? IOW, why isn't the interface just static struct worker *current_worker(void) { if (current-flags PF_WQ_WORKER) return kthread_data(current); return NULL; } I'd prefer to keep struct worker inside workqueue.c, so how about keeping the workqueue part and make async part current_is_async()? Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] workqueue, async: implement work/async_current_func()
On Thu, Jan 17, 2013 at 07:18:26PM -0800, Linus Torvalds wrote: I'm not sure I understand what you mean? Do you mean trying to limit work_current_func() to only be accessible to the async code? You'd have to make some kind of private header file under kernel/ for that, but I guess that would work fine. We already do something similar inside filesystems etc, where they have their own local headers. Yeap, and I'm unsure whether it's worth introducing a new internal header file. Yes, yes, some globally optimizing compiler could sort it all out, but I'd personally be inclined to just move all the structure definitions into kernel/worker.h, and make the code be inline functions. The only actual current *user* would also be in the kernel/ subdirectory, and we don't know if we'd ever want to really expand it past there. Hmm? If we're gonna make it kernel/ internal thing with internal header, we definitely can go all the way. It's a bit meh because the code path involved is very cold. Hmm... I'll make it that way. I like how it keeps the thing apparently internal. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
Hello, Alan. On Tue, Jan 15, 2013 at 11:01:15PM -0500, Alan Stern wrote: The current domain implementation is somewhere inbetween. It's not completely simplistic system and at the same time not developed enough to do properly stacked flushing. I like your idea of chronological synchronization: Insist that anybody who wants to flush async jobs must get a cookie, and then only allow them to wait for async jobs started after the cookie was issued. I don't know if this is possible with the current implementation. It would require changing every call to async_synchronize_*(), and in a nontrivial way. But it might provide a proper solution to all these problems. The problem here is that flush everything which comes before me is used to order async jobs. e.g. after async jobs probe the hardware they order themselves by flushing before registering them, so unless we build accurate flushing dependencies, those dependencies will reach beyond the time window we're interested in and bring in deadlocks. And, as Linus pointed it out, tracking dependency through request_module() is tricky no matter what we do. I think it can be done by matching the ones calling request_module() and the ones actually loading modules but it's gonna be nasty. There aren't too many which use async anyway so changing stuff shouldn't be too difficult but I think the simpicity or dumbness is one of major attractions of async, so it'd be nice to keep things that way and the PF_USED_ASYNC hack seems to be able to hold things together for now. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] async: fix __lowest_in_progress()
083b804c4d3e1e3d0eace56bdbc0f674946d2847 (async: use workqueue for worker pool) made it possible that async jobs are moved from pending to running out-of-order. While pending async jobs will be queued and dispatched for execution in the same order, nothing guarantees they'll enter 1) move self to the running queue of async_run_entry_fn() in the same order. This broke __lowest_in_progress(). running-domain may not be properly sorted and is not guaranteed to contain lower cookies than pending list when not empty. Fix it by ensuring sort-inserting to the running list and always looking at both pending and running when trying to determine the lowest cookie. Over time, the async synchronization implementation became quite messy. We better restructure it such that each async_entry is linked to two lists - one global and one per domain - and not move it when execution starts. There's no reason to distinguish pending and running. They behave the same for synchronization purposes. Signed-off-by: Tejun Heo t...@kernel.org Cc: Arjan van de Ven ar...@linux.intel.com Cc: sta...@vger.kernel.org --- And here's the fix for the breakage I mentioned earlier. It wouldn't happen often in the wild and the effect of it happening wouldn't be critical for modern distros but it's still kinda surprising nobody noticed this. We definitely need to rewrite async synchronization. It was already messy and this makes it worse and there's no reason to be messy here. Thanks. kernel/async.c | 27 --- 1 file changed, 20 insertions(+), 7 deletions(-) --- a/kernel/async.c +++ b/kernel/async.c @@ -86,18 +86,27 @@ static atomic_t entry_count; */ static async_cookie_t __lowest_in_progress(struct async_domain *running) { + async_cookie_t first_running = next_cookie; /* infinity value */ + async_cookie_t first_pending = next_cookie; /* ditto */ struct async_entry *entry; + /* +* Both running and pending lists are sorted but not disjoint. +* Take the first cookies from both and return the min. +*/ if (!list_empty(running-domain)) { entry = list_first_entry(running-domain, typeof(*entry), list); - return entry-cookie; + first_running = entry-cookie; } - list_for_each_entry(entry, async_pending, list) - if (entry-running == running) - return entry-cookie; + list_for_each_entry(entry, async_pending, list) { + if (entry-running == running) { + first_pending = entry-cookie; + break; + } + } - return next_cookie; /* infinity value */ + return min(first_running, first_pending); } static async_cookie_t lowest_in_progress(struct async_domain *running) @@ -118,13 +127,17 @@ static void async_run_entry_fn(struct wo { struct async_entry *entry = container_of(work, struct async_entry, work); + struct async_entry *pos; unsigned long flags; ktime_t uninitialized_var(calltime), delta, rettime; struct async_domain *running = entry-running; - /* 1) move self to the running queue */ + /* 1) move self to the running queue, make sure it stays sorted */ spin_lock_irqsave(async_lock, flags); - list_move_tail(entry-list, running-domain); + list_for_each_entry_reverse(pos, running-domain, list) + if (entry-cookie pos-cookie) + break; + list_move_tail(entry-list, pos-list); spin_unlock_irqrestore(async_lock, flags); /* 2) run (and print duration) */ -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
Hello, Alan. On Wed, Jan 16, 2013 at 12:01:53PM -0500, Alan Stern wrote: The problem here is that flush everything which comes before me is used to order async jobs. e.g. after async jobs probe the hardware they order themselves by flushing before registering them, so unless I don't fully understand this example. What is the point -- to make sure that asynchronously probed devices are registered in the order of their discovery? People still want devices to be numbered to their physical ports and so on, so we keep the registeration order the same as natural (whatever that means) hardware order. If so, here's how to do it safely: Start up the async jobs in reverse order of discovery. Have each job acquire a cookie when it starts. Then each job needs to wait only for tasks that started after its cookie was issued. It's a bit clumsy but yeah I guess it could work. There aren't too many which use async anyway so changing stuff shouldn't be too difficult but I think the simpicity or dumbness is one of major attractions of async, so it'd be nice to keep things that way and the PF_USED_ASYNC hack seems to be able to hold things together for now. Nesting won't matter for the chronological approach. I really think you should consider it more fully. It's not a hack, and it doesn't need to be complicated. There is benefit to the current dumb implementation in that drivers can use it without thinking too much, but yeah it could be that the flushing range limit isn't too much of restriction on top. I don't know. At this point, I'd prefer to remove request_module() from elevator init path for the problem at hand. If we need something more involved, changing cookie usage rules definitely seems like an option. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] init, block: try to load default elevator module early during boot
This patch adds default module loading and uses it to load the default block elevator. During boot, it's called right after initramfs or initrd is made available and right before control is passed to userland. This ensures that as long as the modules are available in the usual places in initramfs, initrd or the root filesystem, the default modules are loaded as soon as possible. This will replace the on-demand elevator module loading from elevator init path. Signed-off-by: Tejun Heo t...@kernel.org Cc: Jens Axboe ax...@kernel.dk Cc: Arjan van de Ven ar...@linux.intel.com Cc: Linus Torvalds torva...@linux-foundation.org Cc: Alex Riesen raa.l...@gmail.com --- block/elevator.c | 16 include/linux/elevator.h |1 + include/linux/init.h |1 + init/do_mounts_initrd.c |3 +++ init/initramfs.c |8 +++- init/main.c | 16 6 files changed, 44 insertions(+), 1 deletion(-) --- a/block/elevator.c +++ b/block/elevator.c @@ -136,6 +136,22 @@ static int __init elevator_setup(char *s __setup(elevator=, elevator_setup); +/* called during boot to load the elevator chosen by the elevator param */ +void __init load_default_elevator_module(void) +{ + struct elevator_type *e; + + if (!chosen_elevator[0]) + return; + + spin_lock(elv_list_lock); + e = elevator_find(chosen_elevator); + spin_unlock(elv_list_lock); + + if (!e) + request_module(%s-iosched, chosen_elevator); +} + static struct kobj_type elv_ktype; static struct elevator_queue *elevator_alloc(struct request_queue *q, --- a/include/linux/elevator.h +++ b/include/linux/elevator.h @@ -138,6 +138,7 @@ extern void elv_drain_elevator(struct re /* * io scheduler registration */ +extern void __init load_default_elevator_module(void); extern int elv_register(struct elevator_type *); extern void elv_unregister(struct elevator_type *); --- a/include/linux/init.h +++ b/include/linux/init.h @@ -161,6 +161,7 @@ extern unsigned int reset_devices; /* used by init/main.c */ void setup_arch(char **); void prepare_namespace(void); +void __init load_default_modules(void); extern void (*late_time_init)(void); --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -57,6 +57,9 @@ static void __init handle_initrd(void) sys_mkdir(/old, 0700); sys_chdir(/old); + /* try loading default modules from initrd */ + load_default_modules(); + /* * In case that a resume from disk is carried out by linuxrc or one of * its children, we need to tell the freezer not to wait for us. --- a/init/initramfs.c +++ b/init/initramfs.c @@ -592,7 +592,7 @@ static int __init populate_rootfs(void) initrd_end - initrd_start); if (!err) { free_initrd(); - return 0; + goto done; } else { clean_rootfs(); unpack_to_rootfs(__initramfs_start, __initramfs_size); @@ -607,6 +607,7 @@ static int __init populate_rootfs(void) sys_close(fd); free_initrd(); } + done: #else printk(KERN_INFO Unpacking initramfs...\n); err = unpack_to_rootfs((char *)initrd_start, @@ -615,6 +616,11 @@ static int __init populate_rootfs(void) printk(KERN_EMERG Initramfs unpacking failed: %s\n, err); free_initrd(); #endif + /* +* Try loading default modules from initramfs. This gives +* us a chance to load before device_initcalls. +*/ + load_default_modules(); } return 0; } --- a/init/main.c +++ b/init/main.c @@ -70,6 +70,8 @@ #include linux/perf_event.h #include linux/file.h #include linux/ptrace.h +#include linux/blkdev.h +#include linux/elevator.h #include asm/io.h #include asm/bugs.h @@ -794,6 +796,17 @@ static void __init do_pre_smp_initcalls( do_one_initcall(*fn); } +/* + * This function requests modules which should be loaded by default and is + * called twice right after initrd is mounted and right before init is + * exec'd. If such modules are on either initrd or rootfs, they will be + * loaded before control is passed to userland. + */ +void __init load_default_modules(void) +{ + load_default_elevator_module(); +} + static int run_init_process(const char *init_filename) { argv_init[0] = init_filename; @@ -900,4 +913,7 @@ static void __init kernel_init_freeable( * we're essentially up and running. Get rid of the * initmem segments and start the user-mode stuff.. */ + + /* rootfs is available now, try loading default modules */ + load_default_modules(); } -- To unsubscribe from this list: send the line unsubscribe linux
[PATCH 2/2] block: don't request module during elevator init
Block layer allows selecting an elevator which is built as a module to be selected as system default via kernel param elevator=. This is achieved by automatically invoking request_module() whenever a new block device is initialized and the elevator is not available. This led to an interesting deadlock problem involving async and module init. Block device probing running off an async job invokes request_module(). While the module is being loaded, it performs async_synchronize_full() which ends up waiting for the async job which is already waiting for request_module() to finish, leading to deadlock. Invoking request_module() from deep in block device init path is already nasty in itself. It seems best to avoid these situations from the beginning by moving on-demand module loading out of block init path. The previous patch made sure that the default elevator module is loaded early during boot if available. This patch removes on-demand loading of the default elevator from elevator init path. As the module would have been loaded during boot, userland-visible behavior difference should be minimal. For more details, please refer to the following thread. http://thread.gmane.org/gmane.linux.kernel/1420814 Signed-off-by: Tejun Heo t...@kernel.org Cc: Jens Axboe ax...@kernel.dk Cc: Arjan van de Ven ar...@linux.intel.com Cc: Linus Torvalds torva...@linux-foundation.org Cc: Alex Riesen raa.l...@gmail.com --- block/elevator.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) --- a/block/elevator.c +++ b/block/elevator.c @@ -100,14 +100,14 @@ static void elevator_put(struct elevator module_put(e-elevator_owner); } -static struct elevator_type *elevator_get(const char *name) +static struct elevator_type *elevator_get(const char *name, bool request_module) { struct elevator_type *e; spin_lock(elv_list_lock); e = elevator_find(name); - if (!e) { + if (!e request_module) { spin_unlock(elv_list_lock); request_module(%s-iosched, name); spin_lock(elv_list_lock); @@ -207,25 +207,30 @@ int elevator_init(struct request_queue * q-boundary_rq = NULL; if (name) { - e = elevator_get(name); + e = elevator_get(name, true); if (!e) return -EINVAL; } + /* +* Use the default elevator specified by config boot param or +* config option. Don't try to load modules as we could be running +* off async and request_module() isn't allowed from async. +*/ if (!e *chosen_elevator) { - e = elevator_get(chosen_elevator); + e = elevator_get(chosen_elevator, false); if (!e) printk(KERN_ERR I/O scheduler %s not found\n, chosen_elevator); } if (!e) { - e = elevator_get(CONFIG_DEFAULT_IOSCHED); + e = elevator_get(CONFIG_DEFAULT_IOSCHED, false); if (!e) { printk(KERN_ERR Default I/O scheduler not found. \ Using noop.\n); - e = elevator_get(noop); + e = elevator_get(noop, false); } } @@ -967,7 +972,7 @@ int elevator_change(struct request_queue return -ENXIO; strlcpy(elevator_name, name, sizeof(elevator_name)); - e = elevator_get(strstrip(elevator_name)); + e = elevator_get(strstrip(elevator_name), true); if (!e) { printk(KERN_ERR elevator: type %s not found\n, elevator_name); return -EINVAL; -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
Hello, Linus. On Tue, Jan 15, 2013 at 09:36:57AM -0800, Linus Torvalds wrote: Tejun, comments? You can see the whole thread on lkml, but the basic problem is that the module loading doing the unconditional async_synchronize_full() has caused problems, because we have - load module A - module A does per-controller async discovery of its devices (eg scsi or ata probing) - in the async thread, it initializes somethign that needs another module B (in this case the default IO scheduler module) - modprobe for B loads the IO scheduler module successfully at the end of the module load, it does async_synchronize_full() to make sure load_module won't return before the module is ready *DEADLOCK*, because the async_synchronize_full() thing actually waits for not the module B async code (it didn't have any), but for the module *A* async code, which is waiting for module B to finish. I think the root problem here, apart from request_module() from block - which is a bit nasty but making that part completely async would too be quite nasty albeit in a different way - is that async_synchronize_full() is way too indescriminate. It's something only suitable for things like the end of system init. I'm wondering whether what we need is a rudimentray nesting like the following. finished_loading() { blah blah; cookie = async_current_cookie(); do init calls; async_synchronize_upto(cookie); blah blah; } The nesting here would be an approximation as the dependency recorded here is chronological. I *suspect* this should be safe unless the module is doing something weird. Need to think more about it. One way or the other, I think what we need is some form of scoping for flushing async ops. BTW, the current synchronization is broken - cookie isn't transferred to running-domain in queueing order but __lowest_in_progress() assumes that. I think I broke that while converting it to workqueue. Anyways, working on it. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
Hello, Alan. On Tue, Jan 15, 2013 at 01:20:58PM -0500, Alan Stern wrote: It may not be so easy. When the SCSI async thread probes the new disk, it has to do I/O. So it needs to use a scheduler. But maybe it could use a built-in trivial scheduler until the proper one is loaded. Then the loading could be asynchronous. It can be done. Noop is always built-in and block IO can do IOs with noop. The problem here is that request_module() is done synchronously during evelator_init(). We can punt that to a work item so that the elevator is switched on load completion. There are some nastiness involved tho - if module probing returns before elevator switch happens, the userland can observe elevator being switched after some indetermined short period of time, which can, for example, break scripts adjusting elevator knobs and etc... I *think* it'll be best to allow scoped synchronization of async ops. Looking into it. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
Hello, Linus Will continue on another reply but this one is relevant so... On Tue, Jan 15, 2013 at 10:18:45AM -0800, Linus Torvalds wrote: Tejun, is there a good way for code to see I'm running in async context? Then we could do something like Almost. With a bit of modification we can ask whether current is a kworker, reach struct worker_struct via kthread_data() if so and then test worker-current_func against the async workfn. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
cc'ing Arjan. Arjan, the original thread can be read from http://thread.gmane.org/gmane.linux.kernel/1420814 Hello, again. On Tue, Jan 15, 2013 at 12:18:01PM -0800, Linus Torvalds wrote: I think that is a good solution if it works, but look out: we need to synchronize across *all* domains, not just the default one. The sd.c code, for example, uses its own scsi_sd_probe_domain for example, and we *do* want to synchronize with it. Can you do that with your suggested interface (ie it would have to be a *global* sequence number). So, I've been thinking about it for a while now and it looks like async is cutting too many corners to implement any sane stackable flushing scheme on top. There simply isn't much information to determine who should wait for what. I've thought of two workarounds. Both suck. A. Try to detect deadlock conditions from synchronize(). If deadlock condition involving other async jobs are detected, whine about it and then skip. Ignore deadlock condition on self (should solve this particular case). Detecting deadlock condition isn't difficult if there are only global synchronizations; unfortunately, fragmented dependencies via domain-local synchronization makes this non-trivial. We can still do ignore-self thing mostly trivially tho. This will at least work around the problem at hand. B. The ranged synchronization I first suggested. The problem with this is that it's a common practice for a given async job to try to flush anything which comes before it. This can introduce spurious synchronization dependencies which can then lead to deadlocks. These conditions can be detected and ignored, at least only considering global synchronizations. The problem here is that those deadlock conditions will occur under normal usage and thus should be ignored silently, which basically makes synchronization silently ignore and finish successfully even if there are legitimate deadlocks which should be investigated. For now, I'm gonna implement simple I'm not gonna wait for myself self-deadlock avoidance. If this needs any more sophistication, I think we better reimplement it so that we can explicitly match up and track who's gonna wait for what instead of throwing everything into a single cookie space and then try to work back from there. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
Hello, Arjan. On Tue, Jan 15, 2013 at 04:25:54PM -0800, Arjan van de Ven wrote: async fundamentally had the concept of a monotonic increasing number, and that you could always wait for everyone before me. then people (like me) wanted exceptions to what everyone means ;-( I'm ok with going back to a single space and simplify the world. If we want (or need) finer grained operation, we'll probably have to head the other direction, so that we can definitively tell that an async operation belongs to domains system, module load A and B, so that each waiter knows what to wait for. The current domain implementation is somewhere inbetween. It's not completely simplistic system and at the same time not developed enough to do properly stacked flushing. the module wait case is tricky, and I wonder if there's deadlocks lurking even without async. I don't think so. It's really an async job waiting for itself. Working around just this case is mostly trivial (working on patches now) but it really is putting kludges on top of shaky foundation. Maybe this is the extent of complexity that we need to go given the rather limited use cases of async. Let's hope so. I think we'll have to reimplement synchronization scheme if we have to go further. at some point in the past we had the concept of request a module but don't wait for it, and I wonder if that is what should have been used here. We actually want to wait for it as it creates a userland visible behavior difference otherwise. It's just that async's way of waiting is too ham-fisted to be used properly in more complex scenarios. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
On Tue, Jan 15, 2013 at 04:36:34PM -0800, Linus Torvalds wrote: The thing is, the module loading in particular is not necessarily happening in the same context as what *started* the module loading. A module loader will request the module from user space, and then later user space - through possibly a totally unrelated process - will finish it. So there is no myself. There's not even necessarily any relationship that the kernel even knows about, because the module loading request can have gone from usermode_helper over something like dbus to systemd. See? Right. Gees, there's even no way to link them. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
Hello, Linus. On Tue, Jan 15, 2013 at 07:00:31PM -0800, Linus Torvalds wrote: That said, maybe we could just make the rule be that you can't pick a default IO scheduler that is modular. This is definitely much more preferable but it would affect use case where everything is built modular and the elevator is selected via kernel param. This is way outside the usual usage and we can warn about the new behavior but it still is an observable behavior change. Do you think this would be okay? And I *would* like to see the warning we discussed. Maybe there are other situations that can trigger this? Because something like that WARN_ON_ONCE(wait i_am_async() system_state == SYSTEM_RUNNING); in kernel/kmod.c (__request_module()) still sounds like a good idea to verify that this is the only thing that triggers it (of course, we'd need to somehow avoid the warning for the known case with the known workaround). And then this warning can be added without introducing request_module_but_dont_warn_about_being_called_from_async(). Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 22/25] usb/at91_udc: don't use [delayed_]work_pending()
There's no need to test whether a (delayed) work item in pending before queueing, flushing or cancelling it. Most uses are unnecessary and quite a few of them are buggy. Remove unnecessary pending tests from at91_udc. Only compile tested. Signed-off-by: Tejun Heo t...@kernel.org Cc: Andrew Victor li...@maxim.org.za Cc: Nicolas Ferre nicolas.fe...@atmel.com Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com Cc: Felipe Balbi ba...@ti.com Cc: linux-usb@vger.kernel.org --- Please let me know how this patch should be routed. I can take it through the workqueue tree if necessary. Thanks. drivers/usb/gadget/at91_udc.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/usb/gadget/at91_udc.c b/drivers/usb/gadget/at91_udc.c index f4a21f6..e81d8a2 100644 --- a/drivers/usb/gadget/at91_udc.c +++ b/drivers/usb/gadget/at91_udc.c @@ -1621,8 +1621,7 @@ static void at91_vbus_timer(unsigned long data) * bus such as i2c or spi which may sleep, so schedule some work * to read the vbus gpio */ - if (!work_pending(udc-vbus_timer_work)) - schedule_work(udc-vbus_timer_work); + schedule_work(udc-vbus_timer_work); } static int at91_start(struct usb_gadget *gadget, -- 1.8.0.2 -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html