Re: Clang and X86-EFlags (was Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning)
On Tue, Apr 24, 2018 at 6:28 AM, Sedat Dilekwrote: > > $ objdump -S clang-eflag.o > > Does this now look good? Looks fine to me. The instruction choice is still pretty odd: 1a: f6 c3 fftest $0xff,%bl 1d: 74 02 je 21 that "test $0xff,%bl" is a rather odd way of testing the byte for zero, since 1a: 84 dbtest %bl,%bl 1c: 74 02 je 20 would have been a byte shorter and is the canonical way on x86 to test a register against zero. But that's just a "looks odd to somebody who is used to x86 asm", not a bug or anything really noticeable. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Re: dvb usb issues since kernel 4.9
On Tue, Jan 9, 2018 at 9:57 AM, Eric Dumazetwrote: > > Your patch considers TASKLET_SOFTIRQ being a candidate for 'immediate > handling', but TCP Small queues heavily use TASKLET, > so as far as I am concerned a revert would have the same effect. Does it actually? TCP ends up dropping packets outside of the window etc, so flooding a machine with TCP packets and causing some further processing up the stack sounds very different from the basic packet flooding thing that happens with NET_RX_SOFTIRQ. Also, honestly, the kinds of people who really worry about flooding tend to have packet filtering in the receive path etc. So I really think "you can use up 90% of CPU time with a UDP packet flood from the same network" is very very very different - and honestly not at all as important - as "you want to be able to use a USB DVB receiver and watch/record TV". Because that whole "UDP packet flood from the same network" really is something you _fundamentally_ have other mitigations for. I bet that whole commit was introduced because of a benchmark test, rather than real life. No? In contrast, now people are complaining about real loads not working. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dvb usb issues since kernel 4.9
On Tue, Jan 9, 2018 at 9:42 AM, Mauro Carvalho Chehabwrote: > > On my preliminar tests, writing to a file on an ext4 partition at a > USB stick loses data up to the point to make it useless (1/4 of the data > is lost!). However, writing to a class 10 microSD card is doable. Note that most USB sticks are horrible crap. They can have write latencies counted in _seconds_. You can cause VM issues and various other non-hardware stalls with them, simply because something gets stuck waiting for a page writeout that should take a few ms on any reasonable hardware, but ends up talking half a second or more. For example, even really well-written software that tries to do things like threaded write-behind to smooth out the IO will be _totally_ screwed by the USB stick behavior (where you might write a few MB at high speeds, and then the next write - however small - takes a second because the stupid USB stick does a synchronous garbage collection. Suddenly all that clever software that tried to keep things moving along smoothly without any hiccups, and tried hard to make the USB bus have a nice constant loadm can't do anything at all about the crap hardware. So when testing writes to USB sticks, I'm not convinced you're actually testing any USB bus limitations or even really any other hardware limitations than the USB stick itself. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Re: dvb usb issues since kernel 4.9
On Tue, Jan 9, 2018 at 9:27 AM, Eric Dumazetwrote: > > So yes, commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job") has > shown up multiple times in various 'regressions' > simply because it could surface the problem more often. > But even if you revert it, you can still make the faulty > driver/subsystem misbehave by adding more stress to the cpu handling > the IRQ. ..but that's always true. People sometimes live on the edge - often by design (ie hardware has been designed/selected to be the crappiest possible that still work). That doesn't change anything. A patch that takes "bad things can happen" to "bad things DO happen" is a bad patch. > Maybe the answer is to tune the kernel for small latencies at the > price of small throughput (situation before the patch) Generally we always want to tune for latency. Throughput is "easy", but almost never interesting. Sure, people do batch jobs. And yes, people often _benchmark_ throughput, because it's easy to benchmark. It's much harder to benchmark latency, even when it's often much more important. A prime example is the SSD benchmarks in the last few years - they improved _dramatically_ when people noticed that the real problem was latency, not the idiotic maximum big-block bandwidth numbers that have almost zero impact on most people. Put another way: we already have a very strong implicit bias towards bandwidth just because it's easier to see and measure. That means that we generally should strive to have a explicit bias towards optimizing for latency when that choice comes up. Just to balance things out (and just to not take the easy way out: bandwidth can often be improved by adding more layers of buffering and bigger buffers, and that often ends up really hurting latency). > 1) Revert the patch Well, we can revert it only partially - limiting it to just networking for example. Just saying "act the way you used to for tasklets" already seems to have fixed the issue in DVB. > 2) get rid of ksoftirqd since it adds unexpected latencies. We can't get rid of it entirely, since the synchronous softirq code can cause problems too. It's why we have that "maximum of ten synchronous events" in __do_softirq(). And we don't *want* to get rid of it. We've _always_ had that small-scale "at some point we can't do it synchronously any more". That is a small-scale "don't have horrible latency for _other_ things" protection. So it's about latency too, it's just about protecting latency of the rest of the system. The problem with commit 4cd13c21b207 is that it turns the small-scale latency issues in softirq handling (they get larger latencies for lots of hardware interrupts or even from non-preemptible kernel code) into the _huge_ scale latency of scheduling, and does so in a racy way too. > 3) Let applications that expect to have high throughput make sure to > pin their threads on cpus that are not processing IRQ. > (And make sure to not use irqbalance, and setup IRQ cpu affinities) The only people that really deal in "thoughput only" tend to be the HPC people, and they already do things like this. (The other end of the spectrum is the realtime people that have extreme latency requirements, who do things like that for the reverse reason: keeping one or more CPU's reserved for the particular low-latency realtime job). Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dvb usb issues since kernel 4.9
On Mon, Jan 8, 2018 at 11:15 AM, Alan Sternwrote: > > Both dwc2_hsotg and ehci-hcd use the tasklets embedded in the > giveback_urb_bh member of struct usb_hcd. See usb_hcd_giveback_urb() > in drivers/usb/core/hcd.c; the calls are > > else if (high_prio_bh) > tasklet_hi_schedule(>bh); > else > tasklet_schedule(>bh); > > As it turns out, high_prio_bh gets set for interrupt and isochronous > URBs but not for bulk and control URBs. The DVB driver in question > uses bulk transfers. Ok, so we could try out something like the appended? NOTE! I have not tested this at all. It LooksObvious(tm), but... Linus kernel/softirq.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/kernel/softirq.c b/kernel/softirq.c index 2f5e87f1bae2..97b080956fea 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -79,12 +79,16 @@ static void wakeup_softirqd(void) /* * If ksoftirqd is scheduled, we do not want to process pending softirqs - * right now. Let ksoftirqd handle this at its own rate, to get fairness. + * right now. Let ksoftirqd handle this at its own rate, to get fairness, + * unless we're doing some of the synchronous softirqs. */ -static bool ksoftirqd_running(void) +#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ)) +static bool ksoftirqd_running(unsigned long pending) { struct task_struct *tsk = __this_cpu_read(ksoftirqd); + if (pending & SOFTIRQ_NOW_MASK) + return false; return tsk && (tsk->state == TASK_RUNNING); } @@ -325,7 +329,7 @@ asmlinkage __visible void do_softirq(void) pending = local_softirq_pending(); - if (pending && !ksoftirqd_running()) + if (pending && !ksoftirqd_running(pending)) do_softirq_own_stack(); local_irq_restore(flags); @@ -352,7 +356,7 @@ void irq_enter(void) static inline void invoke_softirq(void) { - if (ksoftirqd_running()) + if (ksoftirqd_running(local_softirq_pending())) return; if (!force_irqthreads) {
Re: dvb usb issues since kernel 4.9
On Mon, Jan 8, 2018 at 9:55 AM, Ingo Molnarwrote: > > as I doubt we have enough time to root-case this properly. Well, it's not like this is a new issue, and we don't have to get it fixed for 4.15. It's been around since 4.9, it's not a "have to suddenly fix it this week" issue. I just think that people should plan on having to maybe revert it and mark the revert for stable. But if the USB or DVB layers can instead just make the packet queue a bit deeper and not react so badly to the latency of a single softirq, that would obviously be a good thing in general, and maybe fix this issue. So I'm not saying that the revert is inevitable either. But I have to say that that commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job") was a pretty damn big hammer, and entirely ignored the "softirqs can have latency concerns" issue. So I do feel like the UDP packet storm thing might want a somewhat more directed fix than that huge hammer of trying to move softirqs aggressively into the softirq thread. This is not that different from threaded irqs. And while you can set the "thread every irq" flag, that would be largely insane to do by default and in general. So instead, people do it either for specific irqs (ie "request_threaded_irq()") or they have a way to opt out of it (IRQF_NO_THREAD). I _suspect_ that the softirq thing really just wants the same thing. Have the networking case maybe set the "prefer threaded" flag just for networking, if it's less latency-sensitive for softirq handling than In fact, even for networking, there are separate TX/RX softirqs, maybe networking would only set it for the RX case? Or maybe even trigger it only for cases where things queue up and it goes into a "polling mode" (like NAPI already does). Of course, I don't even know _which_ softirq it is that the DVB case has issues with. Maybe it's the same NET_RX case? But looking at that offending commit, I do note (for example), that we literally have things like tasklet[_hi]_schedule() that might have been explicitly expected to just run the tasklet at a fairly low latency (maybe instead of a workqueue exactly because it doesn't need to sleep and wants lower latency). So saying "just because softirqd is possibly already woken up, let's delay all those tasklets etc" does really seem very wrong to me. Can somebody tell which softirq it is that dvb/usb cares about? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dvb usb issues since kernel 4.9
On Sat, Jan 6, 2018 at 11:54 AM, Mauro Carvalho Chehabwrote: > > Em Sat, 6 Jan 2018 16:04:16 +0100 > "Josef Griebichler" escreveu: >> >> the causing commit has been identified. >> After reverting commit >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4cd13c21b207e80ddb1144c576500098f2d5f882 >> its working again. > > Just replying to me won't magically fix this. The ones that were involved on > this patch should also be c/c, plus USB people. Just added them. Actually, you seem to have added an odd subset of the people involved. For example, Ingo - who actually committed that patch - wasn't on the cc. I do think we need to simply revert that patch. It's very simple: it has been reported to lead to actual problems for people, and we don't fix one problem and then say "well, it fixed something else" when something breaks. When something breaks, we either unbreak it, or we revert the change that caused the breakage. It's really that simple. That's what "no regressions" means. We don't accept changes that cause regressions. This one did. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB/PHY driver changes for 4.15-rc1
On Mon, Nov 13, 2017 at 8:19 AM, Greg KHwrote: > > Other major thing is the typec code that moved out of staging and into > the "real" part of the drivers/usb/ tree, which was nice to see happen. Hmm. So now it asks me about Type-C Port Controller Manager. Fair enough. I say "N", because I have none. But then it still asks me about that TI TPS6598x driver... So I do see the _technical_ logic in there - the "TYPEC" config option is a hidden internal option, and it's selected by the things that need it. But from a user perspective, this configuration model is really strange. Why is TYPEC_TCPM something you ask the user, but not "do you want Type-C support"? And if you single out the PCM side to ask about, why don't you single out the power delivery side? Wouldn't it make more sense to at least ask whether I want Type-C power delivery chips before it then starts asking about individual PD drivers, the same way you asked about the port controller before you started asking ab out individual port controller drivers? Or is it just me who finds this a bit odd? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 4879b7ae05 ("Merge tag 'dmaengine-4.12-rc1' of .."): WARNING: kernel stack regs at bd92bc2e in 01-cpu-hotplug:3811 has bad 'bp' value 000001be
On Mon, Oct 2, 2017 at 2:26 PM, Josh Poimboeufwrote: > > The bisect is pointing to a commit which is almost 5 months old, so this > is pre-ORC. Kallsyms *is* enabled, but the unwinder dump isn't smart > enough to realize it's dumping misaligned stack addresses: Ahh, I didn't pick up on that "esp isn't aligned" part. That said, if %esp gets unaligned at some point, it's not clear exactly when we should align it. An unaligned stack pointer will continue to _work_ just potentially perform fairly badly. But more likely, we picked the wrong frame value to begin with. For example, maybe that decode_frame_pointer() logic really should check not that the low bit in bp is set, but instead check that it's a valid "unsigned long *" that has the low bit set. IOW, the difference would be that instead of checking if (!(regs & 0x1)) return NULL; if would check if ((regs & (sizeof(unsigned long)-1)) != 1) return NULL; but also maybe add logic to simply not trust a next frame pointer that isn't appropriately aligned. So I think adding PTR_ALIGN() there in the unwind dumper might be a bit late. By that time it has already accepted what looks like a garbage frame. No? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 4879b7ae05 ("Merge tag 'dmaengine-4.12-rc1' of .."): WARNING: kernel stack regs at bd92bc2e in 01-cpu-hotplug:3811 has bad 'bp' value 000001be
Bringing in Josh on this, because the merge commit gets fingered because it seems to be an interaction between the new code from the merge and the ORC unwinder changes. It's probably some almost trivial code difference that just causes some code generation to change. And because Josh wasn't implicated in the merge, he didn't get cc'd by the kernel test robot. Of course, the stack trace itself seems to be pretty useless, since that randconfig doesn't have KALLSYMS enabled, so it's just random hex numbers. Josh, original email on lkml and a few other mailing lists, see for example: https://www.spinics.net/lists/devicetree/msg196692.html Any ideas? Linus On Sun, Oct 1, 2017 at 4:17 PM, kernel test robotwrote: > Greetings, > > 0day kernel testing robot got the below dmesg and the first bad commit is > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master > > commit 4879b7ae05431ebcd228a4ff25a81120b3d85891 > Merge tag 'dmaengine-4.12-rc1' of > git://git.infradead.org/users/vkoul/slave-dma > > Pull dmaengine updates from Vinod Koul: > "This time again a smaller update consisting of: > >- support for TI DA8xx dma controller and updates to the cppi driver > >- updates on bunch of drivers like xilinx, pl08x, stm32-dma, mv_xor, > ioat, dmatest" > > * tag 'dmaengine-4.12-rc1' of > git://git.infradead.org/users/vkoul/slave-dma: (35 commits) > dmaengine: pl08x: remove lock documentation > dmaengine: pl08x: fix pl08x_dma_chan_state documentation > dmaengine: pl08x: Use the BIT() macro consistently > dmaengine: pl080: Fix some missing kerneldoc > dmaengine: pl080: Cut some unused defines > dmaengine: dmatest: Add check for supported buffer count (sg_buffers) > dmaengine: dmatest: Select DMA_ENGINE_RAID as its needed for the > slave_sg test > dmaengine: virt-dma: Convert to use list_for_each_entry_safe() > dma-debug: use offset_in_page() macro > dmaengine: mv_xor: use offset_in_page() macro > dmaengine: dmatest: use offset_in_page() macro > dmaengine: sun4i: fix invalid argument > dmaengine: ioat: use setup_timer > dmaengine: cppi41: Fix an Oops happening in cppi41_dma_probe() > dmaengine: pl330: remove pdata based initialization > dmaengine: cppi: fix build error due to bad variable > dmaengine: imx-sdma: add 1ms delay to ensure SDMA channel is stopped > dmaengine: cppi41: use managed functions devm_*() > dmaengine: cppi41: fix cppi41_dma_tx_status() logic > dmaengine: qcom_hidma: pause the channel on shutdown > ... > > ecc721a72c Merge tag 'pwm/for-4.12-rc1' of > git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm > be13ec668d Merge branch 'topic/pl330' into for-linus > 4879b7ae05 Merge tag 'dmaengine-4.12-rc1' of > git://git.infradead.org/users/vkoul/slave-dma > 9e66317d3c Linux 4.14-rc3 > 1418b85217 Add linux-next specific files for 20170929 > +---++++---+---+ > | | ecc721a72c | > be13ec668d | 4879b7ae05 | v4.14-rc3 | next-20170929 | > +---++++---+---+ > | boot_successes| 1009 | 1009 > | 909| 5 | 510 | > | boot_failures | 0 | 0 > | 1 | 4 | 153 | > | WARNING:kernel_stack | 0 | 0 > | 1 | 3 | 111 | > | BUG:unable_to_handle_kernel | 0 | 0 > | 0 | 3 | 48| > | Oops:#[##]| 0 | 0 > | 0 | 3 | 48| > | EIP:update_stack_state| 0 | 0 > | 0 | 3 | 48| > | Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 0 | 0 > | 0 | 3 | 48| > | invoked_oom-killer:gfp_mask=0x| 0 | 0 > | 0 | 1 | 16| > | Mem-Info | 0 | 0 > | 0 | 1 | 16| > | EIP:clear_user| 0 | 0 > | 0 | 0 | 2 | > | EIP:copy_page_to_iter | 0 | 0 > | 0 | 0 | 1 | > +---++++---+---+ > > [
Re: Two regressions on BYT/T ASUS T100TA 4.12-rc: #2 Keyboard no longer works
On Thu, Jun 8, 2017 at 12:49 PM, Alan Sternwrote: > > So maybe CONFIG_HID_ASUS should default to Y? I'll leave it up to Hans > to straighten this out. No, I think the main problem is that the hid_have_special_driver[] array change is garbage. It adds the ASUS ID, regardless of whether the special driver exists or not. We *really* shouldn't do that. We had a very similar regression wrt the "improved" synaptics RMI handling in commit 279967a65b32. If you didn't have HID_RMI enabled at all, that commit made the regular HID driver not take care of the device, and without an RMI driver there was nothing else that took care of it either. So when you add things like "don't react to this device", you damn well should make sure that something else *does* react to the device instead. So those ASUSTEK entries that get disabled in hid-core.c should be protected by something like #if IS_ENABLED(CONFIG_HID_ASUS) ... #endif exactly so that the core HID subsystem doesn't ignore them when nothing else picks up the slack. I note that *some* cases do do that. Too few, apparently. PLEASE STOP THIS MINDLESS "DISABLE GENERIC DEVICE DISCOVERY" CRAP! It's only ok to disable a generic device if you actually have a real working fallback that is actually enabled! So that commit 76dd1fbebbae ("HID: asus: Add support for T100 keyboard") is actively buggy, because it incorrectly assumes a particular config. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
On Fri, May 19, 2017 at 5:46 AM, Christoph Hellwigwrote: > > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 33c2b0b77429..5a7fd3b6a7b9 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -1342,7 +1342,7 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, > unsigned int min_vecs, >unsigned int max_vecs, unsigned int flags, >const struct irq_affinity *aff_desc) > { > - if (min_vecs > 1) > + if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY)) > return -EINVAL; > return 1; > } Side note: why is it doing that " > 1" check, when any value _other_ than 1 is wrong? Also, to match the non-MSI implementation, wouldn't it be nicer to just write it that same way (and also verify "dev->irq"): if (flags & PCI_IRQ_LEGACY) { if (min_vecs == 1 && dev->irq) return 1; } return -ENOSPC; (the exact error value probably doesn't matter in practice, but the CONFIG_MSI case returns ENOSPC by default and that's what Documentation/PCI/MSI-HOWTO.txt says too). Hmm? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot after cold boots, boots after reboot
On Thu, Feb 16, 2017 at 12:06 PM, Pavel Machekwrote: > > Hmm, that would explain problems at boot _and_ problems during > suspend/resume. I've committed the revert, and I'm just assuming that the revert also fixed your suspend/resume issues, but I wanted to just double-check that since it's only implied, no staed explicitly.. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot after cold boots, boots after reboot
On Thu, Feb 16, 2017 at 10:13 AM, Frederic Weisbeckerwrote: > > I haven't followed the discussion but this patch has a known issue which is > fixed > with: > 7bdb59f1ad474bd7161adc8f923cdef10f2638d1 > "tick/nohz: Fix possible missing clock reprog after tick soft restart" > > I hope this fixes your issue. No, Pavel saw the problem with rc8 too, which already has that fix. So I think we'll just need to revert that original patch (and that means that we have to revert the commit you point to as well, since that ->next_tick field was added by the original commit). Pavel, can you verify that rc8 with both 24b91e360ef521a2808771633d76ebc68bd5604b 7bdb59f1ad474bd7161adc8f923cdef10f2638d1 reverted works reliably for you? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot after cold boots, boots after reboot
On Wed, Feb 15, 2017 at 3:20 PM, Pavel Machekwrote: > 4.10-rc4 broken > 4.10-rc3 ok Hmm. If those actually end up being reliable, then there's not a whole lot in between them wrt PCI or USB. What looked like the most likely candidate seems to be xhci-specific, though. But maybe it's something that isn't directly in drivers/{pci,usb}/ and just interacts badly. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: net2280 Driver Bug
On Sun, Dec 25, 2016 at 8:09 AM, Raz Manorwrote: > > I had a problem with the net2280 driver- reading from an endpoint resulted > with a wrong read size in some cases. > > I found the problem and fixed it, and I wanted to publish the fix. However, > I have no push access, and I couldn't find who is the maintainer of the > file. > > Attached is the patch to fix the problem, hopefully you could forward it, or > connect me with the maintainer. That patch is really hard to read due to the whitespace changes and all the debugging code. The only real change seems to be that you changed the "tmp" use in reading the scan_dma_completions() to be another variable. That seems to be because the re-use of "tmp" there corrupts the previous value of "tmp" that is then later used for "dma_done()", and you used a different variable for the "readl(>dma->dmacount)". That makes sense. But I note that the exact same thing seems to happen in the other if-statement too. IOW, maybe you meant something like the attached? Does that fix the problem for you too? I don't know the hardware. Maybe overwriting "tmp" was intentional. Regardless, the use of "tmp" as a variable name for something that clearly is *not* temporary is bad. Adding the proper people to the recipients list so that they can hopefully take a more informed look at this. Linus drivers/usb/gadget/udc/net2280.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c index 85504419ab31..3988a48a 100644 --- a/drivers/usb/gadget/udc/net2280.c +++ b/drivers/usb/gadget/udc/net2280.c @@ -1163,8 +1163,8 @@ static int scan_dma_completions(struct net2280_ep *ep) */ if (unlikely(req->td->dmadesc == 0)) { /* paranoia */ - tmp = readl(>dma->dmacount); - if (tmp & DMA_BYTE_COUNT_MASK) + u32 ep_dmacount = readl(>dma->dmacount); + if (ep_dmacount & DMA_BYTE_COUNT_MASK) break; /* single transfer mode */ dma_done(ep, req, tmp, 0); @@ -1173,24 +1173,24 @@ static int scan_dma_completions(struct net2280_ep *ep) } else if (!ep->is_in && (req->req.length % ep->ep.maxpacket) && !(ep->dev->quirks & PLX_PCIE)) { + u32 ep_stat = readl(>regs->ep_stat); - tmp = readl(>regs->ep_stat); /* AVOID TROUBLE HERE by not issuing short reads from * your gadget driver. That helps avoids errata 0121, * 0122, and 0124; not all cases trigger the warning. */ - if ((tmp & BIT(NAK_OUT_PACKETS)) == 0) { + if ((ep_stat & BIT(NAK_OUT_PACKETS)) == 0) { ep_warn(ep->dev, "%s lost packet sync!\n", ep->ep.name); req->req.status = -EOVERFLOW; } else { - tmp = readl(>regs->ep_avail); - if (tmp) { + u32 ep_avail = readl(>regs->ep_avail); + if (ep_avail) { /* fifo gets flushed later */ ep->out_overflow = 1; ep_dbg(ep->dev, "%s dma, discard %d len %d\n", - ep->ep.name, tmp, + ep->ep.name, ep_avail, req->req.length); req->req.status = -EOVERFLOW; }
Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning
On Mon, Jun 27, 2016 at 1:27 PM, Sedat Dilekwrote: > > I just grepped for some "buzzwords" people gave me in this > email-thread and I was looking at (llvm.git HEAD - upcoming v3.9 > release) and found these comments in [1] > > > [1] > https://github.com/llvm-mirror/llvm/blob/master/lib/Target/X86/X86InstrInfo.cpp#L4516 Christ. That's still pretty bad. Using LAHF/SAHF is just wrong. But at least it's not semantically buggy any more, it's just stupid and slow. Apparently the problem is that LLVM doesn't actually track flags as different conditions, but as a single register, and doesn't know which bits of it matter. I guess the SETO + LAHF/SAHF is the best llvm can do then. But it doesn't speak well of the code generation quality. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning
On Mon, Jun 27, 2016 at 12:50 PM, Sedat Dilekwrote: > > $ objdump -S clang-eflag.o > > clang-eflag.o: file format elf64-x86-64 > > > Disassembly of section .text: > > : >0: 55 push %rbp >1: 48 89 e5mov%rsp,%rbp >4: 53 push %rbx >5: 50 push %rax >6: e8 00 00 00 00 callq b >b: ff 0d 00 00 00 00 decl 0x0(%rip)# 11 > 11: 9c pushfq > 12: 5b pop%rbx > 13: e8 00 00 00 00 callq 18 > 18: b8 01 00 00 00 mov$0x1,%eax > 1d: 53 push %rbx > 1e: 9d popfq > 1f: 75 07 jne28 Yeah, the above is pure garbage. > So, the issue is still alive. > > What do you mean by "for the kernel we at a minimum need a way to > disable that code generation"? > Can this be fixed in the Linux-kernel? No. This will never be fixed in the kernel. It's a compiler bug. The compiler generates shit code. It's absolutely atrociously bad even if you ignore any kernel issues, because that kind of code just performs badly (the compiler should have used "setcc" or something similar to just set the comparison value, not save and restore eflags. And quite frankly, any compiler writer that thinks it is good code is not somebody I want touching a compiler that the kernel depends on anyway. But it is not just bad code for the kernel, it's actively buggy code, since it corrupts the IF. Until this gets fixed in LLVM, there's no way in hell that we will ever have a kernel compiled with that piece of shit. Really. If the LLVM developers cannot fix their crap code generation, it's not worth touching that shit with a ten-foot pole. I'd love to be able to compile the kernel with LLVM, but the fact that the broken eflags code apparently _still_ hasn't been fixed makes me just go "not worth it". And if the LLVM developers don't see this as an obvious bug, it's even less worth it - because that shows not just that the compiler is broken, but that the developers involved with it are broken too. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver patches for 4.6-rc1
On Fri, Mar 18, 2016 at 3:51 PM, Linus Torvalds <torva...@linux-foundation.org> wrote: > > The commit that ends up being marked bad is odd, but there it is: > 69bec7259853 "USB: core: let USB device know device node". Confirmed. Not only did it bisect to that, reverting it on top of the current kernel fixes my machine. So that commit is somehow buggy. I don't see what it does that would break even with OF disabled, but something does. I'll just revert it. The way it is done seems bogus anyway. It looks at of_node when OF is disabled, but generally that isn't even initialized as far as I can tell, and we have things like dev_of_node() helpers to make sure you don't do that. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver patches for 4.6-rc1
On Wed, Mar 16, 2016 at 5:09 PM, Greg KHwrote: > > USB patches for 4.6-rc1 > > Here is the big USB patchset for 4.6-rc1. Something in this - or possibly the tty pull, but that doesn't sound very likely - has killed my USB keyboard on my desktop. I'm bisecting right now. Expect a likely revert. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver patches for 4.6-rc1
On Fri, Mar 18, 2016 at 2:43 PM, Linus Torvalds <torva...@linux-foundation.org> wrote: > > Something in this - or possibly the tty pull, but that doesn't sound > very likely - has killed my USB keyboard on my desktop. Yeah, the bisect is now solidly in the usb part. The machine has 00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31) 03:00.0 USB controller: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller and the keyboard and mouse are on /: Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M which seems to be that ASMedia 3.1 controller. In case that gives anybody a clue while I continue bisecting. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver patches for 4.6-rc1
On Fri, Mar 18, 2016 at 3:58 PM, Greg KHwrote: > > Yes, people did report issues with that yesterday, and I queued up a > patch for it, it's attached below, but I didn't think it would cause any > issues with non-OF systems either. I wanted to give it a few days > testing in linux-next before sending it to you, but can do so now if you > want. Ok, that fixes it for me. Yeah, I'll need this in order to continue merging, I don't merge more stuff on top of known-bad kernels. I can apply the patch directly (I did it on that machine for testing), or take a pull request. Just let me know. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver patches for 4.6-rc1
On Fri, Mar 18, 2016 at 2:58 PM, Linus Torvalds <torva...@linux-foundation.org> wrote: > > Yeah, the bisect is now solidly in the usb part. The commit that ends up being marked bad is odd, but there it is: 69bec7259853 "USB: core: let USB device know device node". Very odd, but I tested multiple times: I'm typing this on d883f52e1f6d, and everything is fine. On 69bec7259853, I can't type. Note that when I say "I can't type", it's the early boot disk encryption password. And maybe there's an oops there, but I can't see it, and I can't log it. I don't even have CONFIG_OF enabled, so I don't see what difference that patch would make. But it makes some difference. I'll double-check by reverting it on top of current git (it does seem to at least revert cleanly), I haven't done that yet. But adding the people involved in that commit to the discussion to see if anybody sees anything. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind
On Mon, Mar 7, 2016 at 12:15 PM, Bjørn Morkwrote: > usbnet_link_change will call schedule_work and should be > avoided if bind is failing. Otherwise we will end up with > scheduled work referring to a netdev which has gone away. > > Instead of making the call conditional, we can just defer > it to usbnet_probe, using the driver_info flag made for > this purpose. So looking at this, I wonder... Why is that FLAG_LINK_INTR thing not just always used? The _only_ thing that FLAG_LINK_INTR does is to cause usbnet_link_change(dev, 0, 0); to be called after network device attach. That doesn't seem to be controversial. Looking at some examples, we have ax88179_178a.c that doesn't set the flag, but instead does that usbnet_link_change() call at the end of ax88179_bind(). There are a few drivers that seem to never call that usbnet_link_change() at all, and don't have that FLAG_LINK_INTR flag. Would they break? Stupid grep: git grep -lw FLAG_ETHER | xargs grep -L FLAG_LINK_INTR | xargs grep -L usbnet_link_change | sed 's:drivers/net/usb/::' gives cdc_eem.c ch9200.c cx82310_eth.c int51x1.c rndis_host.c so maybe that FLAG_LINK_INTR si required. Why is it called "FLAG_LINK_INTR" anyway? There doesn't seem to be anything "INTR" about it. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning
On Mon, Mar 7, 2016 at 10:07 AM, Alan Sternwrote: > > Of course, there are other ways to save a single flag value (such as > setz). It's up to the compiler developers to decide what they think is > best. Using 'setcc' to save eflags somewhere is definitely the right thing to do. Using pushf/popf in generated code is completely insane (unless done very localized in a controlled area). It is, in fact, insane and wrong even in user space, since eflags does contain bits that user space itself might be modifying. In fact, even IF may be modified with iopl 3 (thing old X server setups), but ignoring that flag entirely, you have AC that acts in very similar ways (system-wide alignment control) that user space might be using to make sure it doesn't have unaligned accesses. It's rare, yes. But still - this isn't really limited to just the kernel. But perhaps more importantly, I suspect using pushf/popf isn't just semantically the wrong thing to do, it's just plain stupid. It's likely slower than the obvious 'setcc' model. Agner Fog's table shows it "popf" as being 25-30 uops on several microarchitectures. Looks like it's often microcode. Now, pushf/popf may well be fairly cheap on *some* uarchitectures, but it really sounds like a bad idea to use it when not absolutely required. And that is completely independent of the fact that is screws up the IF bit. But yeah, for the kernel we at a minimum need a way to disable that code generation, even if the clang guys might have some insane reason to keep it for other cases. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Possible double-free in the usbnet driver
On Sat, Mar 5, 2016 at 11:53 AM, Bjørn Morkwrote: > > > Definitely. The patch is so obviously correct that we can only wonder how it > was possible to miss it it the first place :) > > Will take a look to see if we could do a better job cleaning up in other > places. What should I do for 4.5? Will there be a pull request for this, or should I just commit my cdc_ncm_bind() patch directly? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Possible double-free in the usbnet driver
On Fri, Mar 4, 2016 at 2:26 PM, Andrey Konovalovwrote: > > and when I run the vm and connect the device I get: > > [ 23.672662] cdc_ncm 1-1:1.6: bind() failure > [ 23.673447] usbnet_probe(): freeing netdev: 88006ab48000 > [ 23.675822] usbnet_probe(): freeing netdev: 88006ab48000 > > So this seems to be a double-free (or at least a double free_netdev() > call), but the object gets freed twice from usbnet_probe() and not > from usbnet_disconnect(), so you're right that the latter doesn't get > called. I'm not sure how usbnet_probe() ends up being called twice. I still don't think it's a double free. I think the probe thing is called twice, and that's why to different allocations get free'd twice (and the reason it's the same pointer is that the same allocation got reused) But I don't know that driver, really. >> Which particular usbnet bind routine is this? There are multiple >> sub-drivers for usbnet that all do different things. > > cdc_ncm_bind() Ok, so that matches my theory. Does this attached stupid patch make the warning go away? Because from what I can tell, usbnet_link_change() really will start using that "dev" that simply isn't valid - and will be free'd - if the bind fails. So you have usbnet_defer_kevent() getting triggered, which in turn ends up using "usbnet->kevent" But somebody like Oliver is really the right person to check this. For example, it's entirely possible that we should just instead do cancel_work_sync(>kevent); before the "free_netdev(net)" in the "out1" label. And there might be other things that bind() can have touched than the kevent workqueue. Linus drivers/net/usb/cdc_ncm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index dc0212c3cc28..5878b54cc563 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -995,6 +995,8 @@ static int cdc_ncm_bind(struct usbnet *dev, struct usb_interface *intf) * placed NDP. */ ret = cdc_ncm_bind_common(dev, intf, CDC_NCM_DATA_ALTSETTING_NCM, 0); + if (ret < 0) + return ret; /* * We should get an event when network connection is "connected" or @@ -1003,7 +1005,7 @@ static int cdc_ncm_bind(struct usbnet *dev, struct usb_interface *intf) * start IPv6 negotiation and more. */ usbnet_link_change(dev, 0, 0); - return ret; + return 0; } static void cdc_ncm_align_tail(struct sk_buff *skb, size_t modulus, size_t remainder, size_t max)
Re: Possible double-free in the usbnet driver
[ Moving this to proper lists ] On Thu, Mar 3, 2016 at 4:19 PM, Andrey Konovalovwrote: > > I found another double-free, this time in the usbnet driver. Hmm. It doesn't look like a double free to me, at least from the logs you attached. > Whenever the `bind()` function fails (drivers/net/usb/usbnet.c:1676) when > called from `usbnet_probe()` (and it can fail due to a invalid usb > descriptor), > `free_netdev()` is called for the `net` device > (drivers/net/usb/usbnet.c:1772). > Then, `free_netdev(net)` is called again in `usbnet_disconnect()` > (drivers/net/usb/usbnet.c:1570) causing a double-free. The KASAN report says that it's a use-after-free in the kworker thread: the net device got free'd at the end of usbnet_probe(), but some work-struct was apparently active at the time. There might be a double free later that isn't in your report, though. Do you have the data for that? But I didn't think we even called the disconnect() function if the "bind()" failed, so I don't think that one should free it. Greg? So it *sounds* to me like the usbnet "bind()" routine ended up returning an error, but doing so *after* it had already activated the structure somehow. Which particular usbnet bind routine is this? There are multiple sub-drivers for usbnet that all do different things. For example, it *looks* like the cdc_ncm_bind() will have done a usbnet_link_change() even if the bind fails. So now we've done a usbnet_defer_kevent() even though we're failing, and then that sets the ball rolling to later touch the netdev that we're freeing due to the failure. But I may be *entirely* misreading this thing. Anyway, I'm cc'ing the usbnet people who actually know the code (and netdev). The proper fix may be to just cancel any work that might have been set up before freeing. Or maybe that netdev *does* get free'd later some other way properly. Let's see what the experts on the usbnet driver say. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver fixes for 4.5-rc4
On Sun, Feb 14, 2016 at 11:01 AM, Greg KHwrote: > > USB and PHY fixes for 4.5-rc4 > > Here are a number of USB and PHY driver fixes for 4.5-rc4. Actually, only PHY fixes. The USB fixes you already sent me for rc3. See merge commit 46df55ceeaf3. Apparently you haven't updated your upstream tree.. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver fixes for 4.5-rc4
On Sun, Feb 14, 2016 at 12:39 PM, Greg KHwrote: > > Ugh, you are right, sorry about that, do you need an updated pull > request, or are you ok with this one? I took it and updates the merge log appropriately. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: storage: scsiglue: increase transfer size limit
On Thu, Oct 29, 2015 at 4:44 PM, Linus Torvalds <torva...@linux-foundation.org> wrote: > > That said, I'm pretty sure it's just that there were (and probably > still are) tons of bad USB sticks with cheap controllers with 8-bit > sector counts, and you're better off picking something that causes > fairly even sizes than picking 255. Side note: it's not necessarily just the cheap usb memory sticks. I would not be surprised if USB->IDE bridge chipsets etc have a 255-sector limit. The old IDE interfaces contained just a 8-bit sector count, and while 256 sectors was supposed to work (a sector count of zero was supposed to mean 256 sectors), there were at least some disks that got confused and didn't work unless you limited it to 255 sectors rather than 256. So I would really suggest strongly that to blindly increase the max sector field would be rather dangerous. Even if it almost certainly works fine on any *good* USB storage thing. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: storage: scsiglue: increase transfer size limit
On Thu, Oct 29, 2015 at 4:25 PM, Felipe Balbiwrote: > > Fair enough. Linus, do you have any recollection of which device you > found to be quirky ? Your original commit limitting to 240 sectors > doesn't make that clear at all ;-) > > [1] http://marc.info/?l=git-commits-head=106945975115131=2 Yeah, no, my memory doesn't go that far back, and there's a reason we've made good commit messages more of a priority than it used to be.. That said, I'm pretty sure it's just that there were (and probably still are) tons of bad USB sticks with cheap controllers with 8-bit sector counts, and you're better off picking something that causes fairly even sizes than picking 255. If you really want to increase it and have a good reason to, I suspect it should be limited to known-good (or known-modern) devices some way. Or perhaps make it based on size - if you have a 64GB device the controller is likely higher quality than if you have some el-cheapo 256MB old stick, and you probably have a bigger reason to use large transfer sizes anyway. Or perhaps even just on transfer speed... USB sticks are still sold mainly on price, and while there certainly are other USB storage models than "cheap USB stick", that is still the bulk of it by far. And I'm not at all sure quality has improved all that much at the low end. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: First kernel patch (optimization)
On Tue, Sep 15, 2015 at 12:53 PM, Eric Curtinwrote: > My first kernel patch, hope I did everything correctly! Instead of calling > strlen on every iteration of the for loop, just call it once instead and > store in a variable. Heh. Ok, that resulted in a rather long email thread. Anyway, I'd actually prefer to merge this patch, for two reasons: - the "termination calculation is expensive" problem is a real problem, and while in this case the compiler may be able to notice that the "strlen()" is constant over the loop and can be hoisted up, that is not at all necessarily the case most of the time. So I actually think patches like this are good things. Not because this particular code site necessarily matters, but because people who write code with things like "strlen()" in the terminating condition need to learn that it's *wrong*. - I'd much rather see this kind of trivial patch than the usual trivial patch that is clearly just "let's run some script on the kernel and fix up warnings it generates mindlessly". In contrast to that, *this* trivial patch was about somebody who thought about code generation and efficiency. And *that* is the kind of trivial patches we want to encourage, not necessariyl because this particular case was so important, but because that's the kind of people and thinking we want to encourage. So quite frankly, I'd just take this directly, but I'd like more of real changelog. But I wonder if Eric is even reading the emails any more ;) Linus -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Fwd: [PATCH] net:usb:cdc_ncm: fix that tag Huawei devices as wwan
Hmm. Oliver is marked as the maintainer of the USB CDC code, but others have touched it more recently. So I'm just wildly adding people to the cc to comment on this patch and maybe apply it. Oliver/David/Ben/Bjørn? Linus -- Forwarded message -- From: xiaomao xiaomao0...@hotmail.com Date: Sun, Jun 14, 2015 at 1:18 PM Subject: [PATCH] net:usb:cdc_ncm: fix that tag Huawei devices as wwan Form: Qianni mamaqia...@huawei.com Huawei devices is the gereric CDC_NCM devices, but not is WWAN devices. In the patch, we deleted the code about tagging Huawei devices as WWAN. --- Signed-off-by:Qianni mamaqia...@huawei.com --- linux-3.19/drivers/net/usb/cdc_ncm.c.orig 2015-06-15 01:29:52.354238079 +0800 +++ linux-3.19/drivers/net/usb/cdc_ncm.c2015-06-15 01:31:04.074236246 +0800 @@ -1573,12 +1573,12 @@ static const struct usb_device_id cdc_de }, /* tag Huawei devices as wwan */ - { USB_VENDOR_AND_INTERFACE_INFO(0x12d1, + /*{ USB_VENDOR_AND_INTERFACE_INFO(0x12d1, USB_CLASS_COMM, USB_CDC_SUBCLASS_NCM, USB_CDC_PROTO_NONE), .driver_info = (unsigned long)wwan_info, - }, + },*/ /* Infineon(now Intel) HSPA Modem platform */ { USB_DEVICE_AND_INTERFACE_INFO(0x1519, 0x0443,
Re: [GIT PULL] USB driver patches for 3.19-rc1
On Mon, Dec 15, 2014 at 3:32 AM, Hans de Goede hdego...@redhat.com wrote: Code wise this looks good and I've just given: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master a spin with an uas disk enclosure, and verified that tcq is being used, and everything works fine. Thanks for verifying, Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver patches for 3.19-rc1
On Sun, Dec 14, 2014 at 2:35 PM, Greg KH gre...@linuxfoundation.org wrote: Hans de Goede (1): uas: Make uas work with blk-mq So I got some fairly trivial conflicts on this one (conflicting with the scsi cleanups mainly by Christoph Hellwig. I resolved the conflict easily, and it all *looks* fine, but quite frankly, I'd be a lot happier about it if somebody who has the hardware were to actually test the end result. The whole interaction with .use_blk_tags etc should be verified by somebody who knows the code. Hans? Christoph? Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver patches for 3.19-rc1
On Sun, Dec 14, 2014 at 3:17 PM, Stephen Rothwell s...@canb.auug.org.au wrote: Attached is the message I sent about this on Nov 24 to which I received no response and so assumed it was ok ... Looks like the same resolution I got. I'd still prefer to get an actual positive yup, tested it about it. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB driver fixes for 3.18-rc7
Hmm, Greg. I seem to get this problem possibly more commonly at boot these days: usb 1-6: new full-speed USB device number 2 using xhci_hcd usb 1-6: device descriptor read/64, error -71 xhci_hcd :00:14.0: Setup ERROR: setup context command for slot 1. usb 1-6: hub failed to enable device, error -22 usb 1-6: new full-speed USB device number 3 using xhci_hcd usb 1-6: device descriptor read/64, error -71 usb 1-6: hub failed to enable device, error -22 usb 1-6: new full-speed USB device number 4 using xhci_hcd usb 1-6: Device not responding to setup address. usb 1-6: Device not responding to setup address. usb 1-6: device not accepting address 4, error -71 usb 1-6: new full-speed USB device number 5 using xhci_hcd usb 1-6: Device not responding to setup address. usb 1-6: Device not responding to setup address. usb 1-6: device not accepting address 5, error -71 usb usb1-port6: unable to enumerate USB device and my keyboard doesn't work. I then unplug and re-plug it, and get usb 1-6: new full-speed USB device number 9 using xhci_hcd usb 1-6: New USB device found, idVendor=2516, idProduct=0020 usb 1-6: New USB device strings: Mfr=1, Product=2, SerialNumber=0 usb 1-6: Product: Quickfire Rapid i usb 1-6: Manufacturer: CM Storm Any ideas? Some setup delay that isn't long enough at boot time for a slightly finicky device? It has happened before, but now I've gotten it twice within a couple of days: Nov 02 12:18:56 i7.lan kernel: usb 1-6: device descriptor read/64, error -71 Nov 28 16:54:06 i7.lan kernel: usb 1-6: device descriptor read/64, error -71 Nov 30 11:26:35 i7.lan kernel: usb 1-6: device descriptor read/64, error -71 (I've had this keyboard since mid-September, and looking at the logs this machine has been rebooted 41 times since, with those three failures..) Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Linux 3.16-rc6
On Thu, Jul 24, 2014 at 5:58 AM, Peter Zijlstra pet...@infradead.org wrote: So going by the nifty picture rostedt made: [ 61.454336]CPU0CPU1 [ 61.454336] [ 61.454336] lock((p-alloc_lock)-rlock); [ 61.454336]local_irq_disable(); [ 61.454336]lock(tasklist_lock); [ 61.454336]lock((p-alloc_lock)-rlock); [ 61.454336] Interrupt [ 61.454336] lock(tasklist_lock); So this *should* be fine. It always has been in the past, and it was certainly the *intention* that it should continue to work with qrwlock, even in the presense of pending writers on other cpu's. The qrwlock rules are that a read-lock in an interrupt is still going to be unfair and succeed if there are other readers. the fact that CPU1 holds tasklist_lock for reading, does not automagically allow CPU0 to acquire tasklist_lock for reading too, for example if CPU2 (not in the picture) is waiting to acquire tasklist_lock for writing, CPU0's read acquire is made to wait. No. That is true for qrwlock in general. But *not* in interrupt context. In interrupt context, it's unfair. At least that was the _intent_ of the code, maybe that got screwed up some way. The only kind of recursion that's safe is same CPU interrupt. Any read-lock from an irq should still be unfair, no same CPU rules. See queue_read_lock_slowpath(), where it will just wait for any actual write lockers (not *waiting* writers) to go away. So by definition, of somebody else (not just the current CPU) holds the lock for reading, taking it for reading is safe on all cpu's in irq context, because we obviously won't be waiting for any actual write lock holders. So it sounds to me like the new lockdep rules in tip/master are too strict and are throwing a false positive. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Linux 3.16-rc6
On Wed, Jul 23, 2014 at 2:53 AM, Borislav Petkov b...@alien8.de wrote: Well, it looks like we f*cked up something after -rc5 since I'm starting to see lockdep splats all over the place which I didn't see before. I'm running rc6 + tip/master. There was one in r8169 yesterday: https://lkml.kernel.org/r/20140722081840.ga6...@pd.tnic and now I'm seeing the following in a kvm guest. I'm adding some more lists to CC which look like might be related, judging from the stack traces. Hmm. I'm not seeing the reason for this. [ 31.704282] [ INFO: possible irq lock inversion dependency detected ] [ 31.704282] 3.16.0-rc6+ #1 Not tainted [ 31.704282] - [ 31.704282] Xorg/3484 just changed the state of lock: [ 31.704282] (tasklist_lock){.?.+..}, at: [81184b19] send_sigio+0x59/0x1b0 [ 31.704282] but this lock took another, HARDIRQ-unsafe lock in the past: [ 31.704282] ((p-alloc_lock)-rlock){+.+...} Ok, so the claim is that there's a 'p-alloc_lock' (ie task_lock()) that is inside the tasklist_lock, which would indeed be wrong. But I'm not seeing it. The shortest dependencies thing seems to imply __set_task_comm(), but that only takes task_lock. Unless there is something in tip/master. Can you check that this is actually in plain -rc6? Or maybe I'm just blind. Those lockdep splats are easy to get wrong. Adding PeterZ and Ingo to the list just because they are my lockdep go-to people. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] USB patches for 3.15-rc1
On Tue, Apr 1, 2014 at 11:49 AM, Greg KH gre...@linuxfoundation.org wrote: USB patches for 3.15-rc1 Here's the big USB pull request for 3.15-rc1. Hmm. I'm getting this when testing: warning: (AHCI_XGENE) selects PHY_XGENE which has unmet direct dependencies (HAS_IOMEM OF (ARM64 || COMPILE_TEST)) which looks like AHCI_XGENE doesn't have the proper dependency on OF (or alternatively PHY_XGENE has an incorrect dependemcy on OF). According to google it looks like Fengguang reported this on kbuild-all, but nowhere else. The actual build then succeeds. But the Kconfig warning is real and implies that something is seriously wrong wrt the dependencies for this thing. It looks like the select PHY_XGENE came in through the libata update, but this USB update actually brought in the config PHY_XGENE and thus this error. Which makes me wonder how this all worked. Why does that select PHY_XGENE exist when apparently it's not needed? Regardless, there's something broken somewhere. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
On Tue, Nov 26, 2013 at 1:29 PM, Josh Hunt joshhun...@gmail.com wrote: Both ahci and sata_svw call ata_host_activate(), which call ata_host_register() and async_schedule(async_port_probe, ap). Well, with the modern logic (only wait for async probing if the module itself did async probing) the ahci and svw modules didn't really change any behavior. But other modules did. I wonder, for example, if people insmod the dm module, and expect all devices to exist afterwards. Which the old logic of we always wait for all async code regardless of whether we started it ourselves would do, but the new logic does not. Something similar might hit the (non-modular) md auto-detect ioctl. So maybe we should just special-case those two issues, and say let's just wait for async requests here Something like the appended (whitespace-damaged) diff. Does that make a difference to you guys? And if it does, can you check *which* of the two async_synchronize_full() calls it is that matters for your cases? Linus --- duh, apply by hand -- diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 0704c523a76b..7e7a2f743b11 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -351,6 +351,7 @@ static int __init dm_init(void) goto bad; } + async_synchronize_full(); return 0; bad: diff --git a/drivers/md/md.c b/drivers/md/md.c index b6b7a2866c9e..1d173dc662fc 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -8602,6 +8602,7 @@ static void autostart_arrays(int part) i_scanned = 0; i_passed = 0; + async_synchronize_full(); printk(KERN_INFO md: Autodetecting RAID arrays.\n); while (!list_empty(all_detected_devices) i_scanned INT_MAX) { -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
On Tue, Nov 26, 2013 at 2:12 PM, Josh Hunt joshhun...@gmail.com wrote: I should have clarified that I'm not using dm/md in my setup. I know the modules are getting loaded in the log I attached, but root is not a md/dm device. Hmm. The initcall debugging doesn't actually show any of the wait for async events, because those debug messages come from do_one_initcall(), and the waiting happens later. Plus your messages don't actually show where you are trying to - and failing - to mount the root filesystem. Without that kind of information, it's kind of hard to guess. Maybe you could add a few printk's to your kernel? Add one to do_init_module() *after* the if (current-flags PF_USED_ASYNC) async_synchronize_full(); thing, and another to fs/namespace.c: do_mount() (just put something like printk(do_mount: %s at %s\n, dev_name, dir_name); or whatever, so that we can see when that happens.. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] usb: gadget LLVMLinux: Removing the use of VLAIS from the gadget driver
On Mon, Sep 23, 2013 at 12:30 PM, Felipe Balbi ba...@ti.com wrote: I remember there was a discussion of not dropping variable length array support, wasn't there ? We should definitely drop it. The feature is an abomination. I thought gcc only allowed them at the end of structs, in the middle of a struct it's just f*cking insane beyond belief. That said, for *this* particular case, that USB widget driver already does a ton of small kmalloc's and then remembers the addresses independently. People may not care about performance, but it *might* be a good idea to just do one kmalloc()/kfree(), and then still have those pointer variables, but just be offsets within that one allocation. That's what gcc has to basically do for that thing internally *anyway*, just hidden behind a horrible construct that should never have existed. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] USB: remove remaining instances of USB_SUSPEND
On Wed, May 1, 2013 at 9:13 AM, Alan Stern st...@rowland.harvard.edu wrote: This patch (as1677) removes the remaining instances of that symbol. Btw, what are these as ID's, and what does the noise buy us? Doing a git log | egrep -w 'as[0-9]{3,}' shows that this has been going on forever, but it still doesn't make any *sense*. It adds nothing worthwhile. In fact, doing git log | grep 'This patch (' shows that you're pretty the only one doing it, but there's *one* other one jidong xiao: jx001. I really don't see the advantage of having this kind of random noise in the kernel logs. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PATCH] USB patches for 3.10-rc1
On Mon, Apr 29, 2013 at 9:23 AM, Greg KH gre...@linuxfoundation.org wrote: git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ tags/usb-3.10-rc1 This has some odd things in it, but I made the mistake of pushing out my merge before I noticed, so it's out there now regardless. See here: commit 84ebc10294a3 (USB: remove CONFIG_USB_SUSPEND option) claims to remove the USB_SUSPEND config option, but doesn't actually do so. And no, it's not my merge that is in error, although there may be merge errors in addition to the oddities I've found. Doing a git grep -w USB_SUSPEND 84ebc10294a3 shows that there remains things that have depends on .. USB_SUSPEND even in the very commit that claims to remove the option. Later commits then move them around (eg the whole drivers/usb/otg - drivers/usb/phy movement), but the USB_SUSPEND stays around. Hmm? I'm assuming you applied a patch without noticing that in the meantime other users of USB_SUSPEND had grow up? Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PATCH] USB patches for 3.10-rc1
On Mon, Apr 29, 2013 at 2:14 PM, Alan Stern st...@rowland.harvard.edu wrote: What other things seemed odd about Greg's pull request? The only other thing I noticed was the new CONFIG_USB_PHY quesiton, which is not something that I think is sensible to ask from a user, and the help text doesn't really help anything either. I think the question may make sense, but the wording does not. *EVERYBODY* wants a USB PHY. You can't have USB without a physical layer unless it's a purely virtual device. There's one in a EHCI controller too. It's like a network chip - without a PHY there's no point. Why ask about whether you want to support a phy or not? The question makes no sense. So I don't think the question should be do you want a USB PHY. The question should be Do you want a driver for some of the specialized external USB controllers or something like that. Because as it is now, anybody who actually reads the question is likely to answer y, I think, even if he just wants one of the *normal* USB chips that don't split out the PHY. Hmm? Or does PHY have some magic other meaning in USB circles? In which case the wording should still be totally redone to not be so confusing and misleading. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] USB PM and PM QoS fixes (Re: gpf in pm_qos_remote_wakeup_show)
On Sun, Mar 31, 2013 at 3:56 PM, Rafael J. Wysocki r...@sisk.pl wrote: So, I have two patches (on top of the Linus' tree) that will follow shortly: Should I take these directly as patches, or expect them to show up in a pull soon (ie do you have or expect to have other things pending)? Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PATCH] USB patches for 3.9-rc1
On Fri, Feb 22, 2013 at 4:19 PM, Rafael J. Wysocki r...@sisk.pl wrote: It won't revert, there's more stuff on top of it. And it is a fix, so reverting it is not really a good idea anyway. Rafael, please don't *ever* write that crap again. We revert stuff whether it fixed something else or not. The rule is NO REGRESSIONS. It doesn't matter one whit if something fixes something else or not - if it breaks an old case, it gets reverted. Seriously. Why do I even have to mention this? Why do I have to explain this to somebody pretty much *every* f*cking merge window? This is not a new rule. And btw, the *reason* for that rule becoming such a hard rule was pretty much exactly suspend/resume and ACPI. Exactly because we used to have those infinite let's fix one thing and break another dances. So you should be well acquainted with the rule, and I'm surprised to hear that kind of utter garbage from you in particular. There is no excuse for regressions, and it is a fix is actually the _least_ valid of all reasons. A commit that causes a regression is - by definition - not a fix. So please don't *ever* say something that stupid again. Things that used to work are simply a million times more important than things that historically didn't work. So this had better get fixed asap, and I need to feel like people are working on it. Otherwise we start reverting. And no amount but it's a fix matters one whit. In fact, it just makes me feel like I need to start reverting early, because the maintainer doesn't seem to understand how serious a regression is. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PATCH] USB patches for 3.9-rc1
On Fri, Feb 22, 2013 at 4:48 PM, Rafael J. Wysocki r...@sisk.pl wrote: The problem is, though, that even if bisection turns up something, it doesn't automatically mean that this particular commit is the one that caused the problem to happen in the first place. No, I agree. I just react *very* strongly when somebody starts arguing against reverting. The fact that the commit that was bisected fixes another bug in a previous commit does in fact just imply that that *previous* commit should perhaps be reverted. Of course, I see that that is the first in the whole series, so I understand the pain, but even so.. And looking at that commit that makes power resources be treated specially, that looks completely bogus. It's not what the old code did either, and it seems to be a total hack for the breakage that just happens to fix that one machine for purely random reasons, as far as I can tell. The ACPI_BUS_TYPE_POWER case always had match=1 before, no? The thing that looks to have been dropped somewhere is that .acpi_op_start = !!context, which first became device-add_type = context ? ACPI_BUS_ADD_START : ACPI_BUS_ADD_MATCH; and then somehow that wasn't needed any more, so it became just an unconditional device-add_type = ACPI_BUS_ADD_START; for unexplained reasons (the commit message says that the argument context wasn't used, and renames it not_used to reinforce the point, but doesn't explain why it can remove the use that was there). That unconditional add_type = ACPI_BUS_ADD_START then later becomes flags.match_driver = true, which is where the whole match_driver thing comes in. Anyway, I don't see how magically it became about ACPI_BUS_TYPE_POWER. The code is not exactly obvious, though, so whatever. But that context is suddenly not used commit (e3863094c6f9: ACPI: Drop the second argument of acpi_bus_scan()) looks odd. Maybe context was effectively always non-NULL, but it *looks* like that series of commits is intended to have no semantic changes, and that's the one that looked very dubious to me. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PATCH] USB patches for 3.9-rc1
On Thu, Feb 21, 2013 at 10:40 AM, Greg KH gre...@linuxfoundation.org wrote: USB patches for 3.9-rc1 Here's the big USB merge for 3.9-rc1 Nothing major, lots of gadget fixes, and of course, xhci stuff. Ok, so there were a couple of conflicts with Thierry Reding's series to convert devm_request_and_ioremap() users into devm_ioremap_resource(), where some of the old users had been converted to use other helper functions (eg omap_get_control_dev()). I left the omap_get_control_dev() users alone, but I do want to note that omap_control_usb_probe() itself now uses that devm_request_and_ioremap() function. And I did *not* extend the merge to do that kind of conversion in the helper function, so I'm assuming Thierry might want to extend his work. Assuming people care enough.. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] init, block: try to load default elevator module early during boot
And here I was really hoping that there was a third patch in the series that added the warning... We don't currently have a am I an async worker helper function for the warning to use, which is something very much up your alley. Linus On Wed, Jan 16, 2013 at 1:30 PM, Tejun Heo t...@kernel.org wrote: This patch adds default module loading and uses it to load the default block elevator. During boot, it's called right after initramfs or initrd is made available and right before control is passed to userland. This ensures that as long as the modules are available in the usual places in initramfs, initrd or the root filesystem, the default modules are loaded as soon as possible. This will replace the on-demand elevator module loading from elevator init path. Signed-off-by: Tejun Heo t...@kernel.org Cc: Jens Axboe ax...@kernel.dk Cc: Arjan van de Ven ar...@linux.intel.com Cc: Linus Torvalds torva...@linux-foundation.org Cc: Alex Riesen raa.l...@gmail.com --- block/elevator.c | 16 include/linux/elevator.h |1 + include/linux/init.h |1 + init/do_mounts_initrd.c |3 +++ init/initramfs.c |8 +++- init/main.c | 16 6 files changed, 44 insertions(+), 1 deletion(-) --- a/block/elevator.c +++ b/block/elevator.c @@ -136,6 +136,22 @@ static int __init elevator_setup(char *s __setup(elevator=, elevator_setup); +/* called during boot to load the elevator chosen by the elevator param */ +void __init load_default_elevator_module(void) +{ + struct elevator_type *e; + + if (!chosen_elevator[0]) + return; + + spin_lock(elv_list_lock); + e = elevator_find(chosen_elevator); + spin_unlock(elv_list_lock); + + if (!e) + request_module(%s-iosched, chosen_elevator); +} + static struct kobj_type elv_ktype; static struct elevator_queue *elevator_alloc(struct request_queue *q, --- a/include/linux/elevator.h +++ b/include/linux/elevator.h @@ -138,6 +138,7 @@ extern void elv_drain_elevator(struct re /* * io scheduler registration */ +extern void __init load_default_elevator_module(void); extern int elv_register(struct elevator_type *); extern void elv_unregister(struct elevator_type *); --- a/include/linux/init.h +++ b/include/linux/init.h @@ -161,6 +161,7 @@ extern unsigned int reset_devices; /* used by init/main.c */ void setup_arch(char **); void prepare_namespace(void); +void __init load_default_modules(void); extern void (*late_time_init)(void); --- a/init/do_mounts_initrd.c +++ b/init/do_mounts_initrd.c @@ -57,6 +57,9 @@ static void __init handle_initrd(void) sys_mkdir(/old, 0700); sys_chdir(/old); + /* try loading default modules from initrd */ + load_default_modules(); + /* * In case that a resume from disk is carried out by linuxrc or one of * its children, we need to tell the freezer not to wait for us. --- a/init/initramfs.c +++ b/init/initramfs.c @@ -592,7 +592,7 @@ static int __init populate_rootfs(void) initrd_end - initrd_start); if (!err) { free_initrd(); - return 0; + goto done; } else { clean_rootfs(); unpack_to_rootfs(__initramfs_start, __initramfs_size); @@ -607,6 +607,7 @@ static int __init populate_rootfs(void) sys_close(fd); free_initrd(); } + done: #else printk(KERN_INFO Unpacking initramfs...\n); err = unpack_to_rootfs((char *)initrd_start, @@ -615,6 +616,11 @@ static int __init populate_rootfs(void) printk(KERN_EMERG Initramfs unpacking failed: %s\n, err); free_initrd(); #endif + /* +* Try loading default modules from initramfs. This gives +* us a chance to load before device_initcalls. +*/ + load_default_modules(); } return 0; } --- a/init/main.c +++ b/init/main.c @@ -70,6 +70,8 @@ #include linux/perf_event.h #include linux/file.h #include linux/ptrace.h +#include linux/blkdev.h +#include linux/elevator.h #include asm/io.h #include asm/bugs.h @@ -794,6 +796,17 @@ static void __init do_pre_smp_initcalls( do_one_initcall(*fn); } +/* + * This function requests modules which should be loaded by default and is + * called twice right after initrd is mounted and right before init is + * exec'd. If such modules are on either initrd or rootfs, they will be + * loaded before control is passed to userland. + */ +void __init load_default_modules(void) +{ + load_default_elevator_module(); +} + static int
Re: [PATCH] async: fix __lowest_in_progress()
Tejun, mind explaining this one a bit more to me? If ordering matters, and we have a running queue and a pending queue, how could the pending queue *ever* be lower than the running one? That implies that something was taken off the pending queue and put on the running queue out of order, right? And that in turn implies that there isn't much of a lowest ordering at all, so how could anybody even care about what lowest is? It seems to be a meaningless measure. So with that in mind, I don't see what semantics the first part of the patch fixes. Can you explain more? Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] init, block: try to load default elevator module early during boot
On Thu, Jan 17, 2013 at 10:59 AM, Tejun Heo t...@kernel.org wrote: If you're okay with it, I'll route these two and the patches to add warning through a wq branch. There's already a wq/for-3.9 patch which am_i_async() can make use of, so it's gonna be easier this way. Sounds good to me. Thanks, Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] workqueue, async: implement work/async_current_func()
On Thu, Jan 17, 2013 at 5:25 PM, Tejun Heo t...@kernel.org wrote: Implement work/async_current_func() which query whether the current task is a workqueue or async worker respectively and, if so, return the current function being executed along with work / async item related information. So why the odd interface? The only user of it calls it with a NULL/NULL pair of arguments, and in general it's just way too complex to be an exported function at all. I *suspect* you chose that complex interface because you feel you may have some use for it inside of the async code itself, but why isn't that then not totally private to there? IOW, why isn't the interface just static struct worker *current_worker(void) { if (current-flags PF_WQ_WORKER) return kthread_data(current); return NULL; } int current_is_async(void) { struct worker *worker = current_worker(void); return worker worker-current_func == async_run_entry_fn; } and that current_is_async() is enough for the exported interface. Then, if you actually want to care about the work itself, you can use that same current_worker() helper function, and look at the different worker fields more. But why export that kind of logic? Am I missing something? Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] workqueue, async: implement work/async_current_func()
On Thu, Jan 17, 2013 at 7:04 PM, Tejun Heo t...@kernel.org wrote: Another thing is that it seems like having introspection type interface often lead to abuses - work_pending(), work_busy() both ended up bringing more unnecessary dependencies and subtle bugginess on internal details than actual benefits. Querying %current is much less likely to be harmful in itself but I'm afraid it might encourage its users to develop something crazy on top. It might be a good idea to make it only available to async. I'm not sure I understand what you mean? Do you mean trying to limit work_current_func() to only be accessible to the async code? You'd have to make some kind of private header file under kernel/ for that, but I guess that would work fine. We already do something similar inside filesystems etc, where they have their own local headers. I still don't really see what other data the async code could possibly ever want than the is this an async thread I guess if you want to keep all these things private to their own C files with no leaving of the structure definitions (even within just a private kernel/worker.h file or something) you could make the interface be one like - kernel/workqueue.c: int current_uses_workfn(work_func_t match) { if (current-flags PF_WQ_WORKER) { struct worker *worker = kthread_data(current); return worker match == worker-current_func; } return 0; } - kernel/async.c: int current_is_async(void) { return current_uses_workfn(async_run_entry_fn); } but quite frankly, we've generally tried to avoid those kinds of silly wrappers just because it's very wasteful to do two function calls just to hide some detail like this. The code generation is atrocious, jumping around for no good reason just increases cache pressure etc. Yes, yes, some globally optimizing compiler could sort it all out, but I'd personally be inclined to just move all the structure definitions into kernel/worker.h, and make the code be inline functions. The only actual current *user* would also be in the kernel/ subdirectory, and we don't know if we'd ever want to really expand it past there. Hmm? Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
On Wed, Jan 16, 2013 at 9:03 AM, Arjan van de Ven ar...@linux.intel.com wrote: we can even try twice the first time right after we mount the initramfs the second time when the initramfs code exits, and before we exec init (the initramfs supposedly mounted the real root fs at this point) Yes. This, together with don't try request_module for the default elevator, and the warn if somebody does request_module from async context would, I think, be the right thing to do. In the meantime, I've applied Tejun's patch. It possibly speeds things up regardless of this particular deadlock thing, and while it's not pretty it certainly isn't horribly nasty or very invasive either, so I don't see any reason to delay it just because there might be a better solution some day. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
[ Added Tejun to the discussion, since he's the async go-to-guy ] On Mon, Jan 14, 2013 at 10:23 PM, Ming Lei ming@canonical.com wrote: But I have another idea to address the problem, and let module code call async_synchronize_full() only if the module requires that explicitly, so how about the below draft patch? No way. This kind of let's just let drivers tell us when they used async helpers is basically *asking* for buggy code. In fact, just to prove how bad it is, YOU SCREWED IT UP YOURSELF. Because it's not just sd.c that uses async_schedule(), and would need the async synchronize. It's floppy.c, it's generic scsi scanning (so scsi tapes etc), and it's libata-core.c. This kind of let's randomly encourage people to write subtly buggy code that has magical timing dependencies, so that the developer won't likely even see it because he has fast disks etc code is totally unacceptable. And this code was *designed* to be that kind of buggy. No, if we set a flag like this, then it needs to be set *automatically*, so that a module cannot screw this up by mistake. It could be as simple as having a per-thread flag that gets set by the __async_schedule() function, and gets cleared by fork. Then the module code could do something like /* before calling the module -init function */ current-used_async = 0; ... if (current-used_async) async_synchronize_full(); or whatever. Tejun, comments? You can see the whole thread on lkml, but the basic problem is that the module loading doing the unconditional async_synchronize_full() has caused problems, because we have - load module A - module A does per-controller async discovery of its devices (eg scsi or ata probing) - in the async thread, it initializes somethign that needs another module B (in this case the default IO scheduler module) - modprobe for B loads the IO scheduler module successfully at the end of the module load, it does async_synchronize_full() to make sure load_module won't return before the module is ready *DEADLOCK*, because the async_synchronize_full() thing actually waits for not the module B async code (it didn't have any), but for the module *A* async code, which is waiting for module B to finish. Now, I'll happily argue that we shouldn't have this kind of load modules from random context behavior in the kernel, and I think the block layer is to blame for doing the IO scheduler load at an insane time. So don't do that then would be the best solution. Sadly, we don't even have a good way to notice that we're doing it, so hacky workaround that at least doesn't require driver authors to care is likely the second-best workaround. But the hacky workaround absolutely needs to be *automatic*. Because the driver writers need to get this subtle untestable thing right is *not* acceptable. That's the patch that Ming Lei did, and I refuse to have that kind of fragile crap in the kernel. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
On Tue, Jan 15, 2013 at 9:36 AM, Linus Torvalds torva...@linux-foundation.org wrote: This kind of let's randomly encourage people to write subtly buggy code that has magical timing dependencies, so that the developer won't likely even see it because he has fast disks etc code is totally unacceptable. And this code was *designed* to be that kind of buggy. Btw, we could *possibly* do this the other way around. Wait for all async work by default, but then have a really hacky way to turn that off for modules that explicitly don't want it, because they know they can be loaded in async context, and they don't do any async work themselves. Then we could make the IO schedulers set that flag (I know I'm loaded from async space, and I know I'm not myself doing any async init) Quite frankly, I'd still much rather prefer the automated approach - or even better, just avoiding the load modules in async context entirely. But at least the I can put a huge comment about why I don't want to be waited on would be much more acceptable than the I need to explicitly tell the world that it needs to wait on me. So Ming Lei's patch was easily subtly buggy by mistake (showing that by the fact that it was indeed buggy), while the opposite model where you have to explicitly ask people not to wait for you could still be very buggy, but at least now it needs to explicitly do extra work in order to be buggy. So if an interface is fragile, it should aim to be fragile in the right way - making the fragility explicit, so that people can grep for it, and people can add comments to the particular code that marks it fragile. The default behavior should be the robust one. And if would be lovely to add a warning to the people loaded a module from async context case, so that we'd *see* this. Tejun, is there a good way for code to see I'm running in async context? Then we could do something like WARN_ON_ONCE(wait system_state == SYSTEM_RUNNING in_async_thread()); in kernel/kmod.c (__request_module()). That should at least warn about this whole issue happening. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
On Tue, Jan 15, 2013 at 10:32 AM, Tejun Heo t...@kernel.org wrote: I think the root problem here, apart from request_module() from block - which is a bit nasty but making that part completely async would too be quite nasty albeit in a different way - is that async_synchronize_full() is way too indescriminate. It's something only suitable for things like the end of system init. I'm wondering whether what we need is a rudimentray nesting like the following. I think that is a good solution if it works, but look out: we need to synchronize across *all* domains, not just the default one. The sd.c code, for example, uses its own scsi_sd_probe_domain for example, and we *do* want to synchronize with it. Can you do that with your suggested interface (ie it would have to be a *global* sequence number). Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
On Tue, Jan 15, 2013 at 3:50 PM, Tejun Heo t...@kernel.org wrote: For now, I'm gonna implement simple I'm not gonna wait for myself self-deadlock avoidance. You can't really do that. Or rather, it won't *help*. The thing is, the module loading in particular is not necessarily happening in the same context as what *started* the module loading. A module loader will request the module from user space, and then later user space - through possibly a totally unrelated process - will finish it. So there is no myself. There's not even necessarily any relationship that the kernel even knows about, because the module loading request can have gone from usermode_helper over something like dbus to systemd. See? There's a reason I asked for a warning for this. Or the let's flag the current thread if it ever started anything asynchronous. Because it's complicated. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
On Tue, Jan 15, 2013 at 4:36 PM, Linus Torvalds torva...@linux-foundation.org wrote: There's a reason I asked for a warning for this. Or the let's flag the current thread if it ever started anything asynchronous. Because it's complicated. Btw, the sequence counter (that is *not* taking anything else into account) is good enough in practice, exactly because the common case for module loading is actually that nothing in the module init sequence is done asynchronously. Yes, device discovery (particularly for block devices) is often asynchronous. But the modules it then asks to load usually wouldn't be. So if we just have the flag did this thread ever even start async work over the module init sequence, we can just avoid the async serialization entirely for that case, and it breaks the deadlock chain nicely in practice. Only of a block device does async work and then wants to load another module that does more async work in its init routine would it then break. But at that point, I'll happily just put my foot down and tell people they are crazy, and Let's not do that kind of crap. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
On Tue, Jan 15, 2013 at 6:52 PM, Tejun Heo t...@kernel.org wrote: It makes me feel dirty but makes the problem go away and I can't think of anything better, so here is the implementation of used async workaround. Ok, people, can we get a tested-by (or Nope, doesn't work) from the people who saw this? That said, maybe we could just make the rule be that you can't pick a default IO scheduler that is modular. And I *would* like to see the warning we discussed. Maybe there are other situations that can trigger this? Because something like that WARN_ON_ONCE(wait i_am_async() system_state == SYSTEM_RUNNING); in kernel/kmod.c (__request_module()) still sounds like a good idea to verify that this is the only thing that triggers it (of course, we'd need to somehow avoid the warning for the known case with the known workaround). Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used
On Tue, Jan 15, 2013 at 7:25 PM, Tejun Heo t...@kernel.org wrote: Hello, Linus. On Tue, Jan 15, 2013 at 07:00:31PM -0800, Linus Torvalds wrote: That said, maybe we could just make the rule be that you can't pick a default IO scheduler that is modular. This is definitely much more preferable but it would affect use case where everything is built modular and the elevator is selected via kernel param. This is way outside the usual usage and we can warn about the new behavior but it still is an observable behavior change. Do you think this would be okay? I do want the same user-visible semantics, so it's not some one-liner. The compiled-in elevator would be easy enough to handle in the Kconfig file (maybe we do already, I didn't even bother to check). The real problem is the chosen_elevator one, which is dynamic with the kernel command line. And we could handle that one by just trying to load the module early (but exactly _when_?) and then instead of looking things up by name, just keep a pointer to the default elevator around. But no, it's not just some trivial one-liner. Especially the question about when to try to load the module that is given on the kernel command line is not trivial. Do we require that the module is in the initrd and loadable basically immediate at boot? Do we try again after switching the root filesystem? Things like that.. And then this warning can be added without introducing request_module_but_dont_warn_about_being_called_from_async(). I do agree that it would be much nicer that way. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
On Sun, Jan 13, 2013 at 11:15 PM, Ming Lei ming@canonical.com wrote: The deadlock problem is caused by calling request_module() inside async function of do_scan_async(), and it was introduced by Linus's below commit: commit d6de2c80e9d758d2e36c21699117db6178c0f517 Author: Linus Torvalds torva...@linux-foundation.org Date: Fri Apr 10 12:17:41 2009 -0700 async: Fix module loading async-work regression IMO, maybe the commit isn't a proper fix, considered the below fact: - it isn't good to allow async function to be marked as __init Immaterial. For modules, __init is a non-issue. For non-modules, the synchronization elsewhere is sufficient. - any user mode shouldn't expect that the device is ready just after completing of 'insmod' Bullshit. That expectation is just a fact. People insmod a device driver, and mount the device immediately in scripts. We do not say user mode shouldn't. Seriously. EVER. User mode *does*, and we deal with it. Learn it now, and stop ever saying that again. This is really starting to annoy me. Kernel developers who say user mode should be fixes to not do that should go somewhere else. The whole and *only* point of a kernel is to hide these kinds of issues from user mode, and make things just work in user mode. User mode should not ever worry about oh, doing X can trigger a module load, so now the device might not be available immediately, so I should delay and re-try until it is. That's just f*cking crazy talk. We have a very simple rule in the kernel: we don't break user space. EVER. Learn that rule. I don't ever want to hear any user mode shouldn't expect again. User mode *does* expect. End of discussion. - from view of driver, introducing async_synchronize_full() after do_one_initcall() inside do_init_module() is like a sync probe for drivers built as module, and cause this kind of deadlock easily. So could we revert the commit and fix the previous problems just case by case? or other better fix? There's no way in hell we take a fix things one by one approach. It's not going to work. And your suggestion seems to not do async discovery of devices in general, which is a *much* worse fix than anything else. It's just crazy. But there are other approaches we might take. We might move the call to async_synchronize_full(); to other places. For example, maybe we're better off doing it at block/char device open instead? Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds
On Mon, Jan 14, 2013 at 10:04 AM, Alan Stern st...@rowland.harvard.edu wrote: How about skipping that call if the current thread is one of the async helpers? Is it possible to detect when that happens? Or maybe such a check should go inside async_synchronize_full() itself. Do we have some idea of exactly what is waiting for what? Which async context is causing the module load to happen in the first place? I think *that* is what we should avoid - it sounds like the block layer is loading the IO scheduler at the wrong point. I realize that people like (for testing purposes) to change the IO scheduler at random, but if that means that any IO can basically result in a request_module(), then that sounds like a problem. It seems to be elevator_get(), and I presume the chain is something like load block driver async, the block driver does blk_init_allocated_queue, that does request_module() to find the elevator, the request_module() succeeds, but ends up waiting for async work, which is the block driver load, which is waiting for the request_module to finish. And my gut feel is that blk_init_allocated_queue() probably shouldn't do that request_module() at all. We migth want to do it when we *open* the device, but not while loading the module for the device. So my _feeling_ is that this is just a bug in the block layer, and that it shouldn't set up block device drivers for this kind of crazy need to load the elevator module while in the middle of scanning devices. I think *that* is what we should aim to change. Hmm? That said, I think it might indeed be a good idea to make this problem much easier to see, and that detect when it happens would be a good thing (and then we should WARN_ON_ONCE() on people trying to do request_module() calls from async context). Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 00/13] firmware loader: introduce cache/uncache firmware
On Wed, Jul 25, 2012 at 5:35 AM, Ming Lei ming@canonical.com wrote: The below patch should fix the problem above. Actually, I think we could make this even simpler. There's nothing wrong with saying user mode is enabled *just* before we unthaw things, if we also simply guarantee that there is no sleeping lock or similar that we might get caught on (causing a deadlock or other untimely waking) before we've actually thawed everything. So *if* the only problem wrt the USB hub code comes from this area, then I think the solution might be as simple as just moving the usermodehelper_enable() up a few lines, with a comment. Something like the *untested* and whitespace-damaged thing below.. Rafael? Who has one of those isight things and has seen the warning to test? Linus --- diff --git a/kernel/power/process.c b/kernel/power/process.c index 19db29f67558..5bf50e488196 100644 --- a/kernel/power/process.c +++ b/kernel/power/process.c @@ -181,6 +181,12 @@ void thaw_processes(void) pm_freezing = false; pm_nosig_freezing = false; + /* +* User mode helper are available again (or will be, +* modulo scheduling) +*/ + usermodehelper_enable(); + oom_killer_enable(); printk(Restarting tasks ... ); @@ -193,8 +199,6 @@ void thaw_processes(void) } while_each_thread(g, p); read_unlock(tasklist_lock); - usermodehelper_enable(); - schedule(); printk(done.\n); } -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] firmware load: defer request_firmware during early boot and resume
On Sun, Jul 22, 2012 at 5:58 AM, Borislav Petkov b...@alien8.de wrote: Question: is there any other reason [besides maybe embedded people who care about each single Kb of memory on the system] why we don't make this cache/uncache firmware thing *implicit*? That is, load it once at driver open time and keep it in memory during the whole driver's lifetime. And this all taken care of by the driver core, btw. So some firmware is a *lot* more than a few kB. We're talking hundreds of kB, sometimes more. And to make matters worse, we keep it in memory with vmalloc(), which is a limited resource on 32-bit systems. So it can actually be worse than just the memory use itself. Also, as you already mentioned, there's the cache coherency issue. There are real advantages to reloading the firmware occasionally, because it might have changed on disk. So.. I do think we might want to consider it, although I do suspect that we do want to throw it out because of the problems with infinite caches. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] firmware load: defer request_firmware during early boot and resume
On Fri, Jul 20, 2012 at 5:33 AM, Ming Lei tom.leim...@gmail.com wrote: The RFC patch is just for discussing if the idea of deferring request_firmware is doable for addressing the issue of request_firmware in resume path, which is caused by driver unbind/rebind during resume. NAK. It really *has* to be handled some other way. This whole approach seems to actively try to *silently* fix up broken drivers. And it's wrong. There's a reason we warn, and there's a reason we *have* to warn: to let driver writers know that what they are attempting to do MAY NOT WORK. Really. Sure, for a lot of devices it's fine to load the firmware later. But some devices may be part of the resume sequence in very critical ways, and deferring the firmware loading will just mean that the resume will fail. This we *need* the WARN_ON() - so that even in the case where it happens to work, people are very much told that sure, your suspend/resume may have worked, but it was doing fundamentally wrong things that may mean that for other people it *won't* work. For example, maybe it's a USB network dongle, and for *YOU* it is perfectly fine to defer the firmware loading, so that the network comes back up a few seconds after the system is up and running. But in another machine, that exact same USB network dongle may actually be hardwired on the motherboard (it's fairly common to use USB as a system bus in some laptop and embedded devices), and maybe that other machine actually is a thin client that has some tiny rdinit thing, and then everything else is NFS-mounted, and if you resume without networking, the machine is simply *dead*. Ok, so that was a completely made-up example, but we have actually had situations kind of like that, where a device is just not that important for lots of people, but in other situations it's critical for the rest of the suspend/resume to succeed. This is why I'm so vehemently against silently hiding these issues. If you have a driver that has problems, make THAT ONE PARTICULAR driver do the deferral explicitly. Don't make some generic silently defer if there are problems patch. See what I'm saying? You're solving things in exactly the wrong place, and in exactly the wrong way. You're papering things over, and making the generic code silently just make broken cases work. That's really really bad, because it makes it *easier* for driver writers to do the wrong thing without even thinking about it, and without ever seeing the problem. And then when people say suspend/resume doesn't work, the driver author says it works for me and ignores the problem. Because you've systemically made it easy to ignore the problem, and made it easy to do the wrong thing by default. So we should make driver writers do the right thing by default, and if they cannot do the right thing (and the isight camera always comes up, and f*ck it, just fix that driver) then they should do extra work. Seriously. People should load their firmware *before* the suspend/resume cycle. And if that isn't possible, then the system should ABSOLUTELY NOT silently say whatever and defer it until later. We should have that big failure and the big noisy warning, and drivers that need to defer need to do so themselves, so that we never *ever* have that silent automatic defer situation. Linus -- To unsubscribe from this list: send the line unsubscribe linux-usb in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html