Re: Clang and X86-EFlags (was Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning)

2018-04-24 Thread Linus Torvalds
On Tue, Apr 24, 2018 at 6:28 AM, Sedat Dilek  wrote:
>
> $ objdump -S clang-eflag.o
>
> Does this now look good?

Looks fine to me. The instruction choice is still pretty odd:

  1a:   f6 c3 fftest   $0xff,%bl
  1d:   74 02   je 21 

that "test   $0xff,%bl" is a rather odd way of testing the byte for zero, since

  1a:   84 dbtest   %bl,%bl
  1c:   74 02   je 20 

would have been a byte shorter and is the canonical way on x86 to test
a register against zero.

But that's just a "looks odd to somebody who is used to x86 asm", not
a bug or anything really noticeable.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re: dvb usb issues since kernel 4.9

2018-01-09 Thread Linus Torvalds
On Tue, Jan 9, 2018 at 9:57 AM, Eric Dumazet  wrote:
>
> Your patch considers TASKLET_SOFTIRQ being a candidate for 'immediate
> handling', but TCP Small queues heavily use TASKLET,
> so as far as I am concerned a revert would have the same effect.

Does it actually?

TCP ends up dropping packets outside of the window etc, so flooding a
machine with TCP packets and causing some further processing up the
stack sounds very different from the basic packet flooding thing that
happens with NET_RX_SOFTIRQ.

Also, honestly, the kinds of people who really worry about flooding
tend to have packet filtering in the receive path etc.

So I really think "you can use up 90% of CPU time with a UDP packet
flood from the same network" is very very very different - and
honestly not at all as important - as "you want to be able to use a
USB DVB receiver and watch/record TV".

Because that whole "UDP packet flood from the same network" really is
something you _fundamentally_ have other mitigations for.

I bet that whole commit was introduced because of a benchmark test,
rather than real life. No?

In contrast, now people are complaining about real loads not working.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: dvb usb issues since kernel 4.9

2018-01-09 Thread Linus Torvalds
On Tue, Jan 9, 2018 at 9:42 AM, Mauro Carvalho Chehab
 wrote:
>
> On my preliminar tests, writing to a file on an ext4 partition at a
> USB stick loses data up to the point to make it useless (1/4 of the data
> is lost!). However, writing to a class 10 microSD card is doable.

Note that most USB sticks are horrible crap. They can have write
latencies counted in _seconds_.

You can cause VM issues and various other non-hardware stalls with
them, simply because something gets stuck waiting for a page writeout
that should take a few ms on any reasonable hardware, but ends up
talking half a second or more.

For example, even really well-written software that tries to do things
like threaded write-behind to smooth out the IO will be _totally_
screwed by the USB stick behavior (where you might write a few MB at
high speeds, and then the next write - however small - takes a second
because the stupid USB stick does a synchronous garbage collection.
Suddenly all that clever software that tried to keep things moving
along smoothly without any hiccups, and tried hard to make the USB bus
have a nice constant loadm can't do anything at all about the crap
hardware.

So when testing writes to USB sticks, I'm not convinced you're
actually testing any USB bus limitations or even really any other
hardware limitations than the USB stick itself.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re: dvb usb issues since kernel 4.9

2018-01-09 Thread Linus Torvalds
On Tue, Jan 9, 2018 at 9:27 AM, Eric Dumazet  wrote:
>
> So yes, commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job") has
> shown up multiple times in various 'regressions'
> simply because it could surface the problem more often.
> But even if you revert it, you can still make the faulty
> driver/subsystem misbehave by adding more stress to the cpu handling
> the IRQ.

..but that's always true. People sometimes live on the edge - often by
design (ie hardware has been designed/selected to be the crappiest
possible that still work).

That doesn't change anything. A patch that takes "bad things can
happen" to "bad things DO happen" is a bad patch.

> Maybe the answer is to tune the kernel for small latencies at the
> price of small throughput (situation before the patch)

Generally we always want to tune for latency. Throughput is "easy",
but almost never interesting.

Sure, people do batch jobs. And yes, people often _benchmark_
throughput, because it's easy to benchmark. It's much harder to
benchmark latency, even when it's often much more important.

A prime example is the SSD benchmarks in the last few years - they
improved _dramatically_ when people noticed that the real problem was
latency, not the idiotic maximum big-block bandwidth numbers that have
almost zero impact on most people.

Put another way: we already have a very strong implicit bias towards
bandwidth just because it's easier to see and measure.

That means that we generally should strive to have a explicit bias
towards optimizing for latency when that choice comes up.  Just to
balance things out (and just to not take the easy way out: bandwidth
can often be improved by adding more layers of buffering and bigger
buffers, and that often ends up really hurting latency).

> 1) Revert the patch

Well, we can revert it only partially - limiting it to just networking
for example.

Just saying "act the way you used to for tasklets" already seems to
have fixed the issue in DVB.

> 2) get rid of ksoftirqd since it adds unexpected latencies.

We can't get rid of it entirely, since the synchronous softirq code
can cause problems too. It's why we have that "maximum of ten
synchronous events" in __do_softirq().

And we don't *want* to get rid of it.

We've _always_ had that small-scale "at some point we can't do it
synchronously any more".

That is a small-scale "don't have horrible latency for _other_ things"
protection. So it's about latency too, it's just about protecting
latency of the rest of the system.

The problem with commit 4cd13c21b207 is that it turns the small-scale
latency issues in softirq handling (they get larger latencies for lots
of hardware interrupts or even from non-preemptible kernel code) into
the _huge_ scale latency of scheduling, and does so in a racy way too.

> 3) Let applications that expect to have high throughput make sure to
> pin their threads on cpus that are not processing IRQ.
> (And make sure to not use irqbalance, and setup IRQ cpu affinities)

The only people that really deal in "thoughput only" tend to be the
HPC people, and they already do things like this.

(The other end of the spectrum is the realtime people that have
extreme latency requirements, who do things like that for the reverse
reason: keeping one or more CPU's reserved for the particular
low-latency realtime job).

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: dvb usb issues since kernel 4.9

2018-01-08 Thread Linus Torvalds
On Mon, Jan 8, 2018 at 11:15 AM, Alan Stern  wrote:
>
> Both dwc2_hsotg and ehci-hcd use the tasklets embedded in the
> giveback_urb_bh member of struct usb_hcd.  See usb_hcd_giveback_urb()
> in drivers/usb/core/hcd.c; the calls are
>
> else if (high_prio_bh)
> tasklet_hi_schedule(>bh);
> else
> tasklet_schedule(>bh);
>
> As it turns out, high_prio_bh gets set for interrupt and isochronous
> URBs but not for bulk and control URBs.  The DVB driver in question
> uses bulk transfers.

Ok, so we could try out something like the appended?

NOTE! I have not tested this at all. It LooksObvious(tm), but...

Linus
 kernel/softirq.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 2f5e87f1bae2..97b080956fea 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -79,12 +79,16 @@ static void wakeup_softirqd(void)
 
 /*
  * If ksoftirqd is scheduled, we do not want to process pending softirqs
- * right now. Let ksoftirqd handle this at its own rate, to get fairness.
+ * right now. Let ksoftirqd handle this at its own rate, to get fairness,
+ * unless we're doing some of the synchronous softirqs.
  */
-static bool ksoftirqd_running(void)
+#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
+static bool ksoftirqd_running(unsigned long pending)
 {
struct task_struct *tsk = __this_cpu_read(ksoftirqd);
 
+   if (pending & SOFTIRQ_NOW_MASK)
+   return false;
return tsk && (tsk->state == TASK_RUNNING);
 }
 
@@ -325,7 +329,7 @@ asmlinkage __visible void do_softirq(void)
 
pending = local_softirq_pending();
 
-   if (pending && !ksoftirqd_running())
+   if (pending && !ksoftirqd_running(pending))
do_softirq_own_stack();
 
local_irq_restore(flags);
@@ -352,7 +356,7 @@ void irq_enter(void)
 
 static inline void invoke_softirq(void)
 {
-   if (ksoftirqd_running())
+   if (ksoftirqd_running(local_softirq_pending()))
return;
 
if (!force_irqthreads) {


Re: dvb usb issues since kernel 4.9

2018-01-08 Thread Linus Torvalds
On Mon, Jan 8, 2018 at 9:55 AM, Ingo Molnar  wrote:
>
> as I doubt we have enough time to root-case this properly.

Well, it's not like this is a new issue, and we don't have to get it
fixed for 4.15. It's been around since 4.9, it's not a "have to
suddenly fix it this week" issue.

I just think that people should plan on having to maybe revert it and
mark the revert for stable.

But if the USB or DVB layers can instead just make the packet queue a
bit deeper and not react so badly to the latency of a single softirq,
that would obviously be a good thing in general, and maybe fix this
issue. So I'm not saying that the revert is inevitable either.

But I have to say that that commit 4cd13c21b207 ("softirq: Let
ksoftirqd do its job") was a pretty damn big hammer, and entirely
ignored the "softirqs can have latency concerns" issue.

So I do feel like the UDP packet storm thing might want a somewhat
more directed fix than that huge hammer of trying to move softirqs
aggressively into the softirq thread.

This is not that different from threaded irqs. And while you can set
the "thread every irq" flag, that would be largely insane to do by
default and in general. So instead, people do it either for specific
irqs (ie "request_threaded_irq()") or they have a way to opt out of it
(IRQF_NO_THREAD).

I _suspect_ that the softirq thing really just wants the same thing.
Have the networking case maybe set the "prefer threaded" flag just for
networking, if it's less latency-sensitive for softirq handling than

In fact, even for networking, there are separate TX/RX softirqs, maybe
networking would only set it for the RX case? Or maybe even trigger it
only for cases where things queue up and it goes into a "polling mode"
(like NAPI already does).

Of course, I don't even know _which_ softirq it is that the DVB case
has issues with. Maybe it's the same NET_RX case?

But looking at that offending commit, I do note (for example), that we
literally have things like tasklet[_hi]_schedule() that might have
been explicitly expected to just run the tasklet at a fairly low
latency (maybe instead of a workqueue exactly because it doesn't need
to sleep and wants lower latency).

So saying "just because softirqd is possibly already woken up, let's
delay all those tasklets etc" does really seem very wrong to me.

Can somebody tell which softirq it is that dvb/usb cares about?

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: dvb usb issues since kernel 4.9

2018-01-07 Thread Linus Torvalds
On Sat, Jan 6, 2018 at 11:54 AM, Mauro Carvalho Chehab
 wrote:
>
> Em Sat, 6 Jan 2018 16:04:16 +0100
> "Josef Griebichler"  escreveu:
>>
>> the causing commit has been identified.
>> After reverting commit 
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4cd13c21b207e80ddb1144c576500098f2d5f882
>> its working again.
>
> Just replying to me won't magically fix this. The ones that were involved on
> this patch should also be c/c, plus USB people. Just added them.

Actually, you seem to have added an odd subset of the people involved.

For example, Ingo - who actually committed that patch - wasn't on the cc.

I do think we need to simply revert that patch. It's very simple: it
has been reported to lead to actual problems for people, and we don't
fix one problem and then say "well, it fixed something else" when
something breaks.

When something breaks, we either unbreak it, or we revert the change
that caused the breakage.

It's really that simple. That's what "no regressions" means.  We don't
accept changes that cause regressions. This one did.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB/PHY driver changes for 4.15-rc1

2017-11-13 Thread Linus Torvalds
On Mon, Nov 13, 2017 at 8:19 AM, Greg KH  wrote:
>
> Other major thing is the typec code that moved out of staging and into
> the "real" part of the drivers/usb/ tree, which was nice to see happen.

Hmm. So now it asks me about Type-C Port Controller Manager. Fair
enough. I say "N", because I have none. But then it still asks me
about that TI TPS6598x driver...

So I do see the _technical_ logic in there - the "TYPEC" config option
is a hidden internal option, and it's selected by the things that need
it.

But from a user perspective, this configuration model is really strange.

Why is TYPEC_TCPM something you ask the user, but not "do you want
Type-C support"?  And if you single out the PCM side to ask about, why
don't you single out the power delivery side?

Wouldn't it make more sense to at least ask whether I want Type-C
power delivery chips before it then starts asking about individual PD
drivers, the same way you asked about the port controller before you
started asking ab out individual port controller drivers?

Or is it just me who finds this a bit odd?

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 4879b7ae05 ("Merge tag 'dmaengine-4.12-rc1' of .."): WARNING: kernel stack regs at bd92bc2e in 01-cpu-hotplug:3811 has bad 'bp' value 000001be

2017-10-02 Thread Linus Torvalds
On Mon, Oct 2, 2017 at 2:26 PM, Josh Poimboeuf  wrote:
>
> The bisect is pointing to a commit which is almost 5 months old, so this
> is pre-ORC.  Kallsyms *is* enabled, but the unwinder dump isn't smart
> enough to realize it's dumping misaligned stack addresses:

Ahh, I didn't pick up on that "esp isn't aligned" part.

That said, if %esp gets unaligned at some point, it's not clear
exactly when we should align it. An unaligned stack pointer will
continue to _work_ just potentially perform fairly badly.

But more likely, we picked the wrong frame value to begin with.

For example, maybe that decode_frame_pointer() logic really should
check not that the low bit in bp is set, but instead check that it's a
valid "unsigned long *" that has the low bit set.

IOW, the difference would be that instead of checking

if (!(regs & 0x1))
return NULL;

if would check

if ((regs & (sizeof(unsigned long)-1)) != 1)
return NULL;

but also maybe add logic to simply not trust a next frame pointer that
isn't appropriately aligned.

So I think adding PTR_ALIGN() there in the unwind dumper might be a
bit late. By that time it has already accepted what looks like a
garbage frame. No?

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 4879b7ae05 ("Merge tag 'dmaengine-4.12-rc1' of .."): WARNING: kernel stack regs at bd92bc2e in 01-cpu-hotplug:3811 has bad 'bp' value 000001be

2017-10-02 Thread Linus Torvalds
Bringing in Josh on this, because the merge commit gets fingered
because it seems to be an interaction between the new code from the
merge and the ORC unwinder changes. It's probably some almost trivial
code difference that just causes some code generation to change.

And because Josh wasn't implicated in the merge, he didn't get cc'd by
the kernel test robot.

Of course, the stack trace itself seems to be pretty useless, since
that randconfig doesn't have KALLSYMS enabled, so it's just random hex
numbers.

Josh, original email on lkml and a few other mailing lists, see for example:

https://www.spinics.net/lists/devicetree/msg196692.html

Any ideas?

Linus

On Sun, Oct 1, 2017 at 4:17 PM, kernel test robot
 wrote:
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>
> commit 4879b7ae05431ebcd228a4ff25a81120b3d85891
> Merge tag 'dmaengine-4.12-rc1' of 
> git://git.infradead.org/users/vkoul/slave-dma
>
> Pull dmaengine updates from Vinod Koul:
>  "This time again a smaller update consisting of:
>
>- support for TI DA8xx dma controller and updates to the cppi driver
>
>- updates on bunch of drivers like xilinx, pl08x, stm32-dma, mv_xor,
>  ioat, dmatest"
>
> * tag 'dmaengine-4.12-rc1' of 
> git://git.infradead.org/users/vkoul/slave-dma: (35 commits)
>   dmaengine: pl08x: remove lock documentation
>   dmaengine: pl08x: fix pl08x_dma_chan_state documentation
>   dmaengine: pl08x: Use the BIT() macro consistently
>   dmaengine: pl080: Fix some missing kerneldoc
>   dmaengine: pl080: Cut some unused defines
>   dmaengine: dmatest: Add check for supported buffer count (sg_buffers)
>   dmaengine: dmatest: Select DMA_ENGINE_RAID as its needed for the 
> slave_sg test
>   dmaengine: virt-dma: Convert to use list_for_each_entry_safe()
>   dma-debug: use offset_in_page() macro
>   dmaengine: mv_xor: use offset_in_page() macro
>   dmaengine: dmatest: use offset_in_page() macro
>   dmaengine: sun4i: fix invalid argument
>   dmaengine: ioat: use setup_timer
>   dmaengine: cppi41: Fix an Oops happening in cppi41_dma_probe()
>   dmaengine: pl330: remove pdata based initialization
>   dmaengine: cppi: fix build error due to bad variable
>   dmaengine: imx-sdma: add 1ms delay to ensure SDMA channel is stopped
>   dmaengine: cppi41: use managed functions devm_*()
>   dmaengine: cppi41: fix cppi41_dma_tx_status() logic
>   dmaengine: qcom_hidma: pause the channel on shutdown
>   ...
>
> ecc721a72c  Merge tag 'pwm/for-4.12-rc1' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
> be13ec668d  Merge branch 'topic/pl330' into for-linus
> 4879b7ae05  Merge tag 'dmaengine-4.12-rc1' of 
> git://git.infradead.org/users/vkoul/slave-dma
> 9e66317d3c  Linux 4.14-rc3
> 1418b85217  Add linux-next specific files for 20170929
> +---++++---+---+
> |   | ecc721a72c | 
> be13ec668d | 4879b7ae05 | v4.14-rc3 | next-20170929 |
> +---++++---+---+
> | boot_successes| 1009   | 1009   
> | 909| 5 | 510   |
> | boot_failures | 0  | 0  
> | 1  | 4 | 153   |
> | WARNING:kernel_stack  | 0  | 0  
> | 1  | 3 | 111   |
> | BUG:unable_to_handle_kernel   | 0  | 0  
> | 0  | 3 | 48|
> | Oops:#[##]| 0  | 0  
> | 0  | 3 | 48|
> | EIP:update_stack_state| 0  | 0  
> | 0  | 3 | 48|
> | Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 0  | 0  
> | 0  | 3 | 48|
> | invoked_oom-killer:gfp_mask=0x| 0  | 0  
> | 0  | 1 | 16|
> | Mem-Info  | 0  | 0  
> | 0  | 1 | 16|
> | EIP:clear_user| 0  | 0  
> | 0  | 0 | 2 |
> | EIP:copy_page_to_iter | 0  | 0  
> | 0  | 0 | 1 |
> +---++++---+---+
>
> [ 

Re: Two regressions on BYT/T ASUS T100TA 4.12-rc: #2 Keyboard no longer works

2017-06-08 Thread Linus Torvalds
On Thu, Jun 8, 2017 at 12:49 PM, Alan Stern  wrote:
>
> So maybe CONFIG_HID_ASUS should default to Y?  I'll leave it up to Hans
> to straighten this out.

No, I think the main problem is that the hid_have_special_driver[]
array change is garbage.

It adds the ASUS ID, regardless of whether the special driver exists or not.

We *really* shouldn't do that. We had a very similar regression wrt
the "improved" synaptics RMI handling in commit 279967a65b32.

If you didn't have HID_RMI enabled at all, that commit made the
regular HID driver not take care of the device, and without an RMI
driver there was nothing else that took care of it either.

So when you add things like "don't react to this device", you damn
well should make sure that something else *does* react to the device
instead.

So those ASUSTEK entries that get disabled in hid-core.c should be
protected by something like

  #if IS_ENABLED(CONFIG_HID_ASUS)
...
  #endif

exactly so that the core HID subsystem doesn't ignore them when
nothing else picks up the slack.

I note that *some* cases do do that. Too few, apparently.

PLEASE STOP THIS MINDLESS "DISABLE GENERIC DEVICE DISCOVERY" CRAP!

It's only ok to disable a generic device if you actually have a real
working fallback that is actually enabled!

So that commit 76dd1fbebbae ("HID: asus: Add support for T100
keyboard") is actively buggy, because it incorrectly assumes a
particular config.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors

2017-05-20 Thread Linus Torvalds
On Fri, May 19, 2017 at 5:46 AM, Christoph Hellwig  wrote:
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 33c2b0b77429..5a7fd3b6a7b9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1342,7 +1342,7 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, 
> unsigned int min_vecs,
>unsigned int max_vecs, unsigned int flags,
>const struct irq_affinity *aff_desc)
>  {
> -   if (min_vecs > 1)
> +   if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
> return -EINVAL;
> return 1;
>  }

Side note: why is it doing that " > 1" check, when any value _other_
than 1 is wrong?

Also, to match the non-MSI implementation, wouldn't it be nicer to
just write it that same way (and also verify "dev->irq"):

if (flags & PCI_IRQ_LEGACY) {
if (min_vecs == 1 && dev->irq)
return 1;
}
return -ENOSPC;

(the exact error value probably doesn't matter in practice, but the
CONFIG_MSI case returns ENOSPC by default and that's what
Documentation/PCI/MSI-HOWTO.txt says too).

Hmm?

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot after cold boots, boots after reboot

2017-02-16 Thread Linus Torvalds
On Thu, Feb 16, 2017 at 12:06 PM, Pavel Machek  wrote:
>
> Hmm, that would explain problems at boot _and_ problems during
> suspend/resume.

I've committed the revert, and I'm just assuming that the revert also
fixed your suspend/resume issues, but I wanted to just double-check
that since it's only implied, no staed explicitly..

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot after cold boots, boots after reboot

2017-02-16 Thread Linus Torvalds
On Thu, Feb 16, 2017 at 10:13 AM, Frederic Weisbecker
 wrote:
>
> I haven't followed the discussion but this patch has a known issue which is 
> fixed
> with:
> 7bdb59f1ad474bd7161adc8f923cdef10f2638d1
> "tick/nohz: Fix possible missing clock reprog after tick soft restart"
>
> I hope this fixes your issue.

No, Pavel saw the problem with rc8 too, which already has that fix.

So I think we'll just need to revert that original patch (and that
means that we have to revert the commit you point to as well, since
that ->next_tick field was added by the original commit).

Pavel, can you verify that rc8 with both

  24b91e360ef521a2808771633d76ebc68bd5604b
  7bdb59f1ad474bd7161adc8f923cdef10f2638d1

reverted works reliably for you?

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot after cold boots, boots after reboot

2017-02-15 Thread Linus Torvalds
On Wed, Feb 15, 2017 at 3:20 PM, Pavel Machek  wrote:
> 4.10-rc4 broken
> 4.10-rc3 ok

Hmm. If those actually end up being reliable, then there's not a whole
lot in between them wrt PCI or USB.

What looked like the most likely candidate seems to be xhci-specific, though.

But maybe it's something that isn't directly in drivers/{pci,usb}/ and
just interacts badly.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: net2280 Driver Bug

2016-12-25 Thread Linus Torvalds
On Sun, Dec 25, 2016 at 8:09 AM, Raz Manor  wrote:
>
> I had a problem with the net2280 driver- reading from an endpoint resulted
> with a wrong read size in some cases.
>
> I found the problem and fixed it, and I wanted to publish the fix. However,
> I have no push access, and I couldn't find who is the maintainer of the
> file.
>
> Attached is the patch to fix the problem, hopefully you could forward it, or
> connect me with the maintainer.

That patch is really hard to read due to the whitespace changes and
all the debugging code.

The only real change seems to be that you changed the "tmp" use in
reading the scan_dma_completions() to be another variable.

That seems to be because the re-use of "tmp" there corrupts the
previous value of "tmp" that is then later used for "dma_done()", and
you used a different variable for the "readl(>dma->dmacount)".
That makes sense. But I note that the exact same thing seems to happen
in the other if-statement too.

IOW, maybe you meant something like the attached? Does that fix the
problem for you too?

I don't know the hardware. Maybe overwriting "tmp" was intentional.
Regardless, the use of "tmp" as a variable name for something that
clearly is *not* temporary is bad.

Adding the proper people to the recipients list so that they can
hopefully take a more informed look at this.

 Linus
 drivers/usb/gadget/udc/net2280.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c
index 85504419ab31..3988a48a 100644
--- a/drivers/usb/gadget/udc/net2280.c
+++ b/drivers/usb/gadget/udc/net2280.c
@@ -1163,8 +1163,8 @@ static int scan_dma_completions(struct net2280_ep *ep)
 */
if (unlikely(req->td->dmadesc == 0)) {
/* paranoia */
-   tmp = readl(>dma->dmacount);
-   if (tmp & DMA_BYTE_COUNT_MASK)
+   u32 ep_dmacount = readl(>dma->dmacount);
+   if (ep_dmacount & DMA_BYTE_COUNT_MASK)
break;
/* single transfer mode */
dma_done(ep, req, tmp, 0);
@@ -1173,24 +1173,24 @@ static int scan_dma_completions(struct net2280_ep *ep)
} else if (!ep->is_in &&
   (req->req.length % ep->ep.maxpacket) &&
   !(ep->dev->quirks & PLX_PCIE)) {
+   u32 ep_stat = readl(>regs->ep_stat);
 
-   tmp = readl(>regs->ep_stat);
/* AVOID TROUBLE HERE by not issuing short reads from
 * your gadget driver.  That helps avoids errata 0121,
 * 0122, and 0124; not all cases trigger the warning.
 */
-   if ((tmp & BIT(NAK_OUT_PACKETS)) == 0) {
+   if ((ep_stat & BIT(NAK_OUT_PACKETS)) == 0) {
ep_warn(ep->dev, "%s lost packet sync!\n",
ep->ep.name);
req->req.status = -EOVERFLOW;
} else {
-   tmp = readl(>regs->ep_avail);
-   if (tmp) {
+   u32 ep_avail = readl(>regs->ep_avail);
+   if (ep_avail) {
/* fifo gets flushed later */
ep->out_overflow = 1;
ep_dbg(ep->dev,
"%s dma, discard %d len %d\n",
-   ep->ep.name, tmp,
+   ep->ep.name, ep_avail,
req->req.length);
req->req.status = -EOVERFLOW;
}


Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

2016-06-27 Thread Linus Torvalds
On Mon, Jun 27, 2016 at 1:27 PM, Sedat Dilek  wrote:
>
> I just grepped for some "buzzwords" people gave me in this
> email-thread and I was looking at (llvm.git HEAD - upcoming v3.9
> release) and found these comments in [1]
>
>
> [1] 
> https://github.com/llvm-mirror/llvm/blob/master/lib/Target/X86/X86InstrInfo.cpp#L4516

Christ. That's still pretty bad. Using LAHF/SAHF is just wrong.

But at least it's not semantically buggy any more, it's just stupid and slow.

Apparently the problem is that LLVM doesn't actually track flags as
different conditions, but as a single register, and doesn't know which
bits of it matter.

I guess the SETO + LAHF/SAHF is the best llvm can do then. But it
doesn't speak well of the code generation quality.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

2016-06-27 Thread Linus Torvalds
On Mon, Jun 27, 2016 at 12:50 PM, Sedat Dilek  wrote:
>
> $ objdump -S clang-eflag.o
>
> clang-eflag.o: file format elf64-x86-64
>
>
> Disassembly of section .text:
>
>  :
>0:   55  push   %rbp
>1:   48 89 e5mov%rsp,%rbp
>4:   53  push   %rbx
>5:   50  push   %rax
>6:   e8 00 00 00 00  callq  b 
>b:   ff 0d 00 00 00 00   decl   0x0(%rip)# 11 
>   11:   9c  pushfq
>   12:   5b  pop%rbx
>   13:   e8 00 00 00 00  callq  18 
>   18:   b8 01 00 00 00  mov$0x1,%eax
>   1d:   53  push   %rbx
>   1e:   9d  popfq
>   1f:   75 07   jne28 


Yeah, the above is pure garbage.

> So, the issue is still alive.
>
> What do you mean by "for the kernel we at a minimum need a way to
> disable that code generation"?
> Can this be fixed in the Linux-kernel?

No. This will never be fixed in the kernel. It's a compiler bug.

The compiler generates shit code. It's absolutely atrociously bad even
if you ignore any kernel issues, because that kind of code just
performs badly (the compiler should have used "setcc" or something
similar to just set the comparison value, not save and restore eflags.

And quite frankly, any compiler writer that thinks it is good code is
not somebody I want touching a compiler that the kernel depends on
anyway.

But it is not just bad code for the kernel, it's actively buggy code,
since it corrupts the IF.

Until this gets fixed in LLVM, there's no way in hell that we will
ever have a kernel compiled with that piece of shit.

Really. If the LLVM developers cannot fix their crap code generation,
it's not worth touching that shit with a ten-foot pole.

I'd love to be able to compile the kernel with LLVM, but the fact that
the broken eflags code apparently _still_ hasn't been fixed makes me
just go "not worth it".

And if the LLVM developers don't see this as an obvious bug, it's even
less worth it - because that shows not just that the compiler is
broken, but that the developers involved with it are broken too.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver patches for 4.6-rc1

2016-03-19 Thread Linus Torvalds
On Fri, Mar 18, 2016 at 3:51 PM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
>
> The commit that ends up being marked bad is odd, but there it is:
> 69bec7259853 "USB: core: let USB device know device node".

Confirmed. Not only did it bisect to that, reverting it on top of the
current kernel fixes my machine.

So that commit is somehow buggy. I don't see what it does that would
break even with OF disabled, but something does.

I'll just revert it. The way it is done seems bogus anyway.  It looks
at of_node when OF is disabled, but generally that isn't even
initialized as far as I can tell, and we have things like
dev_of_node() helpers to make sure you don't do that.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver patches for 4.6-rc1

2016-03-19 Thread Linus Torvalds
On Wed, Mar 16, 2016 at 5:09 PM, Greg KH  wrote:
>
> USB patches for 4.6-rc1
>
> Here is the big USB patchset for 4.6-rc1.

Something in this - or possibly the tty pull, but that doesn't sound
very likely - has killed my USB keyboard on my desktop.

I'm bisecting right now. Expect a likely revert.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver patches for 4.6-rc1

2016-03-19 Thread Linus Torvalds
On Fri, Mar 18, 2016 at 2:43 PM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
>
> Something in this - or possibly the tty pull, but that doesn't sound
> very likely - has killed my USB keyboard on my desktop.

Yeah, the bisect is now solidly in the usb part.

The machine has

00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI
Controller (rev 31)
03:00.0 USB controller: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller

and the keyboard and mouse are on

/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M

which seems to be that ASMedia 3.1 controller.

In case that gives anybody a clue while I continue bisecting.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver patches for 4.6-rc1

2016-03-18 Thread Linus Torvalds
On Fri, Mar 18, 2016 at 3:58 PM, Greg KH  wrote:
>
> Yes, people did report issues with that yesterday, and I queued up a
> patch for it, it's attached below, but I didn't think it would cause any
> issues with non-OF systems either.  I wanted to give it a few days
> testing in linux-next before sending it to you, but can do so now if you
> want.

Ok, that fixes it for me. Yeah, I'll need this in order to continue
merging, I don't merge more stuff on top of known-bad kernels.

I can apply the patch directly (I did it on that machine for testing),
or take a pull request. Just let me know.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver patches for 4.6-rc1

2016-03-18 Thread Linus Torvalds
On Fri, Mar 18, 2016 at 2:58 PM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
>
> Yeah, the bisect is now solidly in the usb part.

The commit that ends up being marked bad is odd, but there it is:
69bec7259853 "USB: core: let USB device know device node".

Very odd, but I tested multiple times: I'm typing this on
d883f52e1f6d, and everything is fine. On 69bec7259853, I can't type.

Note that when I say "I can't type", it's the early boot disk
encryption password.  And maybe there's an oops there, but I can't see
it, and I can't log it.

I don't even have CONFIG_OF enabled, so I don't see what difference
that patch would make. But it makes some difference.

I'll double-check by reverting it on top of current git (it does seem
to at least revert cleanly), I haven't done that yet. But adding the
people involved in that commit to the discussion to see if anybody
sees anything.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind

2016-03-08 Thread Linus Torvalds
On Mon, Mar 7, 2016 at 12:15 PM, Bjørn Mork  wrote:
> usbnet_link_change will call schedule_work and should be
> avoided if bind is failing. Otherwise we will end up with
> scheduled work referring to a netdev which has gone away.
>
> Instead of making the call conditional, we can just defer
> it to usbnet_probe, using the driver_info flag made for
> this purpose.

So looking at this, I wonder...

Why is that FLAG_LINK_INTR thing not just always used?

The _only_ thing that FLAG_LINK_INTR does is to cause

usbnet_link_change(dev, 0, 0);

to be called after network device attach. That doesn't seem to be controversial.

Looking at some examples, we have ax88179_178a.c that doesn't set the
flag, but instead does that usbnet_link_change() call at the end of
ax88179_bind().

There are a few drivers that seem to never call that
usbnet_link_change() at all, and don't have that FLAG_LINK_INTR flag.
Would they break?

Stupid grep:

git grep -lw FLAG_ETHER |
xargs grep -L FLAG_LINK_INTR |
xargs grep -L usbnet_link_change |
sed 's:drivers/net/usb/::'

gives

cdc_eem.c
ch9200.c
cx82310_eth.c
int51x1.c
rndis_host.c

so maybe that FLAG_LINK_INTR si required.

Why is it called "FLAG_LINK_INTR" anyway? There doesn't seem to be
anything "INTR" about it.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

2016-03-07 Thread Linus Torvalds
On Mon, Mar 7, 2016 at 10:07 AM, Alan Stern  wrote:
>
> Of course, there are other ways to save a single flag value (such as
> setz).  It's up to the compiler developers to decide what they think is
> best.

Using 'setcc' to save eflags somewhere is definitely the right thing to do.

Using pushf/popf in generated code is completely insane (unless done
very localized in a controlled area).

It is, in fact, insane and wrong even in user space, since eflags does
contain bits that user space itself might be modifying.

In fact, even IF may be modified with iopl 3 (thing old X server
setups), but ignoring that flag entirely, you have AC that acts in
very similar ways (system-wide alignment control) that user space
might be using to make sure it doesn't have unaligned accesses.

It's rare, yes. But still - this isn't really limited to just the kernel.

But perhaps more importantly, I suspect using pushf/popf isn't just
semantically the wrong thing to do, it's just plain stupid. It's
likely slower than the obvious 'setcc' model. Agner Fog's table shows
it "popf" as being 25-30 uops on several microarchitectures. Looks
like it's often microcode.

Now, pushf/popf may well be fairly cheap on *some* uarchitectures, but
it really sounds like a bad idea to use it when not absolutely
required. And that is completely independent of the fact that is
screws up the IF bit.

But yeah, for the kernel we at a minimum need a way to disable that
code generation, even if the clang guys might have some insane reason
to keep it for other cases.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible double-free in the usbnet driver

2016-03-07 Thread Linus Torvalds
On Sat, Mar 5, 2016 at 11:53 AM, Bjørn Mork  wrote:
>
>
> Definitely.  The patch is so obviously correct that we can only wonder how it 
> was possible to miss it it the first place :)
>
> Will take a look to see if we could do a better job cleaning up in other 
> places.

What should I do for 4.5? Will there be a pull request for this, or
should I just commit my cdc_ncm_bind() patch directly?

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible double-free in the usbnet driver

2016-03-04 Thread Linus Torvalds
On Fri, Mar 4, 2016 at 2:26 PM, Andrey Konovalov  wrote:
>
> and when I run the vm and connect the device I get:
>
> [   23.672662] cdc_ncm 1-1:1.6: bind() failure
> [   23.673447] usbnet_probe(): freeing netdev: 88006ab48000
> [   23.675822] usbnet_probe(): freeing netdev: 88006ab48000
>
> So this seems to be a double-free (or at least a double free_netdev()
> call), but the object gets freed twice from usbnet_probe() and not
> from usbnet_disconnect(), so you're right that the latter doesn't get
> called. I'm not sure how usbnet_probe() ends up being called twice.

I still don't think it's a double free. I think the probe thing is
called twice, and that's why to different allocations get free'd twice
(and the reason it's the same pointer is that the same allocation got
reused)

But I don't know that driver, really.

>> Which particular usbnet bind routine is this? There are multiple
>> sub-drivers for usbnet that all do different things.
>
> cdc_ncm_bind()

Ok, so that matches my theory.

Does this attached stupid patch make the warning go away? Because from
what I can tell, usbnet_link_change() really will start using that
"dev" that simply isn't valid - and will be free'd - if the bind
fails.

So you have usbnet_defer_kevent() getting triggered, which in turn
ends up using "usbnet->kevent"

But somebody like Oliver is really the right person to check this. For
example, it's entirely possible that we should just instead do

cancel_work_sync(>kevent);

before the "free_netdev(net)" in the "out1" label.

And there might be other things that bind() can have touched than the
kevent workqueue.

   Linus
 drivers/net/usb/cdc_ncm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index dc0212c3cc28..5878b54cc563 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -995,6 +995,8 @@ static int cdc_ncm_bind(struct usbnet *dev, struct 
usb_interface *intf)
 * placed NDP.
 */
ret = cdc_ncm_bind_common(dev, intf, CDC_NCM_DATA_ALTSETTING_NCM, 0);
+   if (ret < 0)
+   return ret;
 
/*
 * We should get an event when network connection is "connected" or
@@ -1003,7 +1005,7 @@ static int cdc_ncm_bind(struct usbnet *dev, struct 
usb_interface *intf)
 * start IPv6 negotiation and more.
 */
usbnet_link_change(dev, 0, 0);
-   return ret;
+   return 0;
 }
 
 static void cdc_ncm_align_tail(struct sk_buff *skb, size_t modulus, size_t 
remainder, size_t max)


Re: Possible double-free in the usbnet driver

2016-03-04 Thread Linus Torvalds
[ Moving this to proper lists ]

On Thu, Mar 3, 2016 at 4:19 PM, Andrey Konovalov  wrote:
>
> I found another double-free, this time in the usbnet driver.

Hmm. It doesn't look like a double free to me, at least from the logs
you attached.

> Whenever the `bind()` function fails (drivers/net/usb/usbnet.c:1676) when
> called from `usbnet_probe()` (and it can fail due to a invalid usb 
> descriptor),
> `free_netdev()` is called for the `net` device 
> (drivers/net/usb/usbnet.c:1772).
> Then, `free_netdev(net)` is called again in `usbnet_disconnect()`
> (drivers/net/usb/usbnet.c:1570) causing a double-free.

The KASAN report says that it's a use-after-free in the kworker
thread: the net device got free'd at the end of usbnet_probe(), but
some work-struct was apparently active at the time.

There might be a double free later that isn't in your report, though.
Do you have the data for that?

But I didn't think we even called the disconnect() function if the
"bind()" failed, so I don't think that one should free it. Greg?

So it *sounds* to me like the usbnet "bind()" routine ended up
returning an error, but doing so *after* it had already activated the
structure somehow.

Which particular usbnet bind routine is this? There are multiple
sub-drivers for usbnet that all do different things.

For example, it *looks* like the cdc_ncm_bind() will have done a
usbnet_link_change() even if the bind fails. So now we've done a
usbnet_defer_kevent() even though we're failing, and then that sets
the ball rolling to later touch the netdev that we're freeing due to
the failure.

But I may be *entirely* misreading this thing.

Anyway, I'm cc'ing the usbnet people who actually know the code (and netdev).

The proper fix may be to just cancel any work that might have been set
up before freeing. Or maybe that netdev *does* get free'd later some
other way properly. Let's see what the experts on the usbnet driver
say.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver fixes for 4.5-rc4

2016-02-14 Thread Linus Torvalds
On Sun, Feb 14, 2016 at 11:01 AM, Greg KH  wrote:
>
> USB and PHY fixes for 4.5-rc4
>
> Here are a number of USB and PHY driver fixes for 4.5-rc4.

Actually, only PHY fixes. The USB fixes you already sent me for rc3.
See merge commit 46df55ceeaf3.

Apparently you haven't updated your upstream tree..

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver fixes for 4.5-rc4

2016-02-14 Thread Linus Torvalds
On Sun, Feb 14, 2016 at 12:39 PM, Greg KH  wrote:
>
> Ugh, you are right, sorry about that, do you need an updated pull
> request, or are you ok with this one?

I took it and updates the merge log appropriately.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: storage: scsiglue: increase transfer size limit

2015-10-29 Thread Linus Torvalds
On Thu, Oct 29, 2015 at 4:44 PM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
>
> That said, I'm pretty sure it's just that there were (and probably
> still are) tons of bad USB sticks with cheap controllers with 8-bit
> sector counts, and you're better off picking something that causes
> fairly even sizes than picking 255.

Side note: it's not necessarily just the cheap usb memory sticks. I
would not be surprised if USB->IDE bridge chipsets etc have a
255-sector limit. The old IDE interfaces contained just a 8-bit sector
count, and while 256 sectors was supposed to work (a sector count of
zero was supposed to mean 256 sectors), there were at least some disks
that got confused and didn't work unless you limited it to 255 sectors
rather than 256.

So I would really suggest strongly that to blindly increase the max
sector field would be rather dangerous. Even if it almost certainly
works fine on any *good* USB storage thing.

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: storage: scsiglue: increase transfer size limit

2015-10-29 Thread Linus Torvalds
On Thu, Oct 29, 2015 at 4:25 PM, Felipe Balbi  wrote:
>
> Fair enough. Linus, do you have any recollection of which device you
> found to be quirky ? Your original commit limitting to 240 sectors
> doesn't make that clear at all ;-)
>
> [1] http://marc.info/?l=git-commits-head=106945975115131=2

Yeah, no, my memory doesn't go that far back, and there's a reason
we've made good commit messages more of a priority than it used to
be..

That said, I'm pretty sure it's just that there were (and probably
still are) tons of bad USB sticks with cheap controllers with 8-bit
sector counts, and you're better off picking something that causes
fairly even sizes than picking 255.

If you really want to increase it and have a good reason to, I suspect
it should be limited to known-good (or known-modern) devices some way.
Or perhaps make it based on size - if you have a 64GB device the
controller is likely higher quality than if you have some el-cheapo
256MB old stick, and you probably have a bigger reason to use large
transfer sizes anyway.

Or perhaps even just on transfer speed...

USB sticks are still sold mainly on price, and while there certainly
are other USB storage models than "cheap USB stick", that is still the
bulk of it by far. And I'm not at all sure quality has improved all
that much at the low end.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: First kernel patch (optimization)

2015-09-22 Thread Linus Torvalds
On Tue, Sep 15, 2015 at 12:53 PM, Eric Curtin  wrote:
> My first kernel patch, hope I did everything correctly! Instead of calling 
> strlen on every iteration of the for loop, just call it once instead and 
> store in a variable.

Heh. Ok, that resulted in a rather long email thread.

Anyway, I'd actually prefer to merge this patch, for two reasons:

 - the "termination calculation is expensive" problem is a real
problem, and while in this case the compiler may be able to notice
that the "strlen()" is constant over the loop and can be hoisted up,
that is not at all necessarily the case most of the time.

So I actually think patches like this are good things. Not because
this particular code site necessarily matters, but because people who
write code with things like "strlen()" in the terminating condition
need to learn that it's *wrong*.

 - I'd much rather see this kind of trivial patch than the usual
trivial patch that is clearly just "let's run some script on the
kernel and fix up warnings it generates mindlessly".

   In contrast to that, *this* trivial patch was about somebody who
thought about code generation and efficiency. And *that* is the kind
of trivial patches we want to encourage, not necessariyl because this
particular case was so important, but because that's the kind of
people and thinking we want to encourage.

So quite frankly, I'd just take this directly, but I'd like more of
real changelog.

But I wonder if Eric is even reading the emails any more ;)

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: [PATCH] net:usb:cdc_ncm: fix that tag Huawei devices as wwan

2015-06-14 Thread Linus Torvalds
Hmm. Oliver is marked as the maintainer of the USB CDC code, but
others have touched it more recently. So I'm just wildly adding people
to the cc to comment on this patch and maybe apply it.
Oliver/David/Ben/Bjørn?

Linus


-- Forwarded message --
From: xiaomao xiaomao0...@hotmail.com
Date: Sun, Jun 14, 2015 at 1:18 PM
Subject: [PATCH] net:usb:cdc_ncm: fix that tag Huawei devices as wwan


Form: Qianni mamaqia...@huawei.com

Huawei devices is the gereric CDC_NCM devices, but not is WWAN
devices. In the patch, we deleted the code about tagging Huawei
devices as WWAN.
---

Signed-off-by:Qianni mamaqia...@huawei.com
--- linux-3.19/drivers/net/usb/cdc_ncm.c.orig   2015-06-15 01:29:52.354238079 
+0800
+++ linux-3.19/drivers/net/usb/cdc_ncm.c2015-06-15 01:31:04.074236246 
+0800
@@ -1573,12 +1573,12 @@ static const struct usb_device_id cdc_de
},
 
/* tag Huawei devices as wwan */
-   { USB_VENDOR_AND_INTERFACE_INFO(0x12d1,
+   /*{ USB_VENDOR_AND_INTERFACE_INFO(0x12d1,
USB_CLASS_COMM,
USB_CDC_SUBCLASS_NCM,
USB_CDC_PROTO_NONE),
  .driver_info = (unsigned long)wwan_info,
-   },
+   },*/
 
/* Infineon(now Intel) HSPA Modem platform */
{ USB_DEVICE_AND_INTERFACE_INFO(0x1519, 0x0443,


Re: [GIT PULL] USB driver patches for 3.19-rc1

2014-12-15 Thread Linus Torvalds
On Mon, Dec 15, 2014 at 3:32 AM, Hans de Goede hdego...@redhat.com wrote:

 Code wise this looks good and I've just given:

 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

 a spin with an uas disk enclosure, and verified that tcq is being used,
 and everything works fine.

Thanks for verifying,

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver patches for 3.19-rc1

2014-12-14 Thread Linus Torvalds
On Sun, Dec 14, 2014 at 2:35 PM, Greg KH gre...@linuxfoundation.org wrote:

 Hans de Goede (1):
   uas: Make uas work with blk-mq

So I got some fairly trivial conflicts on this one (conflicting with
the scsi cleanups mainly by Christoph Hellwig.

I resolved the conflict easily, and it all *looks* fine, but quite
frankly, I'd be a lot happier about it if somebody who has the
hardware were to actually test the end result.   The whole interaction
with .use_blk_tags etc should be verified by somebody who knows the
code.

Hans? Christoph?

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver patches for 3.19-rc1

2014-12-14 Thread Linus Torvalds
On Sun, Dec 14, 2014 at 3:17 PM, Stephen Rothwell s...@canb.auug.org.au wrote:

 Attached is the message I sent about this on Nov 24 to which I received
 no response and so assumed it was ok ...

Looks like the same resolution I got.

I'd still prefer to get an actual positive yup, tested it about it.

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB driver fixes for 3.18-rc7

2014-11-30 Thread Linus Torvalds
Hmm, Greg.

I seem to get this problem possibly more commonly at boot these days:

  usb 1-6: new full-speed USB device number 2 using xhci_hcd
  usb 1-6: device descriptor read/64, error -71
  xhci_hcd :00:14.0: Setup ERROR: setup context command for slot 1.
  usb 1-6: hub failed to enable device, error -22
  usb 1-6: new full-speed USB device number 3 using xhci_hcd
  usb 1-6: device descriptor read/64, error -71
  usb 1-6: hub failed to enable device, error -22
  usb 1-6: new full-speed USB device number 4 using xhci_hcd
  usb 1-6: Device not responding to setup address.
  usb 1-6: Device not responding to setup address.
  usb 1-6: device not accepting address 4, error -71
  usb 1-6: new full-speed USB device number 5 using xhci_hcd
  usb 1-6: Device not responding to setup address.
  usb 1-6: Device not responding to setup address.
  usb 1-6: device not accepting address 5, error -71
 usb usb1-port6: unable to enumerate USB device

and my keyboard doesn't work. I then unplug and re-plug it, and get

  usb 1-6: new full-speed USB device number 9 using xhci_hcd
  usb 1-6: New USB device found, idVendor=2516, idProduct=0020
  usb 1-6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
  usb 1-6: Product: Quickfire Rapid i
  usb 1-6: Manufacturer: CM Storm

Any ideas? Some setup delay that isn't long enough at boot time for a
slightly finicky device? It has happened before, but now I've gotten
it twice within a couple of days:

  Nov 02 12:18:56 i7.lan kernel: usb 1-6: device descriptor read/64, error -71
  Nov 28 16:54:06 i7.lan kernel: usb 1-6: device descriptor read/64, error -71
  Nov 30 11:26:35 i7.lan kernel: usb 1-6: device descriptor read/64, error -71

(I've had this keyboard since mid-September, and looking at the logs
this machine has been rebooted 41 times since, with those three
failures..)

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Linux 3.16-rc6

2014-07-24 Thread Linus Torvalds
On Thu, Jul 24, 2014 at 5:58 AM, Peter Zijlstra pet...@infradead.org wrote:

 So going by the nifty picture rostedt made:

 [   61.454336]CPU0CPU1
 [   61.454336]
 [   61.454336]   lock((p-alloc_lock)-rlock);
 [   61.454336]local_irq_disable();
 [   61.454336]lock(tasklist_lock);
 [   61.454336]lock((p-alloc_lock)-rlock);
 [   61.454336]   Interrupt
 [   61.454336] lock(tasklist_lock);

So this *should* be fine. It always has been in the past, and it was
certainly the *intention* that it should continue to work with
qrwlock, even in the presense of pending writers on other cpu's.

The qrwlock rules are that a read-lock in an interrupt is still going
to be unfair and succeed if there are other readers.

 the fact that CPU1 holds tasklist_lock for reading, does not
 automagically allow CPU0 to acquire tasklist_lock for reading too, for
 example if CPU2 (not in the picture) is waiting to acquire tasklist_lock
 for writing, CPU0's read acquire is made to wait.

No.

That is true for qrwlock in general. But *not* in interrupt context.
In interrupt context, it's unfair. At least that was the _intent_ of
the code, maybe that got screwed up some way.

 The only kind of recursion that's safe is same CPU interrupt.

Any read-lock from an irq should still be unfair, no same CPU rules.
See queue_read_lock_slowpath(), where it will just wait for any actual
write lockers (not *waiting* writers) to go away. So by definition, of
somebody else (not just the current CPU) holds the lock for reading,
taking it for reading is safe on all cpu's in irq context, because we
obviously won't be waiting for any actual write lock holders.

So it sounds to me like the new lockdep rules in tip/master are too
strict and are throwing a false positive.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Linux 3.16-rc6

2014-07-23 Thread Linus Torvalds
On Wed, Jul 23, 2014 at 2:53 AM, Borislav Petkov b...@alien8.de wrote:

 Well, it looks like we f*cked up something after -rc5 since I'm starting
 to see lockdep splats all over the place which I didn't see before. I'm
 running rc6 + tip/master.

 There was one in r8169 yesterday:

 https://lkml.kernel.org/r/20140722081840.ga6...@pd.tnic

 and now I'm seeing the following in a kvm guest. I'm adding some more
 lists to CC which look like might be related, judging from the stack
 traces.

Hmm. I'm not seeing the reason for this.

 [   31.704282] [ INFO: possible irq lock inversion dependency detected ]
 [   31.704282] 3.16.0-rc6+ #1 Not tainted
 [   31.704282] -
 [   31.704282] Xorg/3484 just changed the state of lock:
 [   31.704282]  (tasklist_lock){.?.+..}, at: [81184b19] 
 send_sigio+0x59/0x1b0
 [   31.704282] but this lock took another, HARDIRQ-unsafe lock in the past:
 [   31.704282]  ((p-alloc_lock)-rlock){+.+...}

Ok, so the claim is that there's a 'p-alloc_lock' (ie task_lock())
that is inside the tasklist_lock, which would indeed be wrong. But I'm
not seeing it. The shortest dependencies thing seems to imply
__set_task_comm(), but that only takes task_lock.

Unless there is something in tip/master. Can you check that this is
actually in plain -rc6?

Or maybe I'm just blind. Those lockdep splats are easy to get wrong.
Adding PeterZ and Ingo to the list just because they are my lockdep
go-to people.

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] USB patches for 3.15-rc1

2014-04-01 Thread Linus Torvalds
On Tue, Apr 1, 2014 at 11:49 AM, Greg KH gre...@linuxfoundation.org wrote:

 USB patches for 3.15-rc1

 Here's the big USB pull request for 3.15-rc1.

Hmm. I'm getting this when testing:

  warning: (AHCI_XGENE) selects PHY_XGENE which has unmet direct
dependencies (HAS_IOMEM  OF  (ARM64 || COMPILE_TEST))

which looks like AHCI_XGENE doesn't have the proper dependency on OF
(or alternatively PHY_XGENE has an incorrect dependemcy on OF).

According to google it looks like Fengguang reported this on
kbuild-all, but nowhere else.

The actual build then succeeds. But the Kconfig warning is real and
implies that something is seriously wrong wrt the dependencies for
this thing.

It looks like the select PHY_XGENE came in through the libata
update, but this USB update actually brought in the config PHY_XGENE
and thus this error. Which makes me wonder how this all worked. Why
does that select PHY_XGENE exist when apparently it's not needed?

Regardless, there's something broken somewhere.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-11-26 Thread Linus Torvalds
On Tue, Nov 26, 2013 at 1:29 PM, Josh Hunt joshhun...@gmail.com wrote:

 Both ahci and sata_svw call ata_host_activate(), which call
 ata_host_register() and async_schedule(async_port_probe, ap).

Well, with the modern logic (only wait for async probing if the
module itself did async probing) the ahci and svw modules didn't
really change any behavior.

But other modules did. I wonder, for example, if people insmod the dm
module, and expect all devices to exist afterwards. Which the old
logic of we always wait for all async code regardless of whether we
started it ourselves would do, but the new logic does not.

Something similar might hit the (non-modular) md auto-detect ioctl.

So maybe we should just special-case those two issues, and say let's
just wait for async requests here

Something like the appended (whitespace-damaged) diff. Does that make
a difference to you guys? And if it does, can you check *which* of the
two async_synchronize_full() calls it is that matters for your cases?

 Linus

--- duh, apply by hand --

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 0704c523a76b..7e7a2f743b11 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -351,6 +351,7 @@ static int __init dm_init(void)
goto bad;
}

+   async_synchronize_full();
return 0;

   bad:
diff --git a/drivers/md/md.c b/drivers/md/md.c
index b6b7a2866c9e..1d173dc662fc 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8602,6 +8602,7 @@ static void autostart_arrays(int part)
i_scanned = 0;
i_passed = 0;

+   async_synchronize_full();
printk(KERN_INFO md: Autodetecting RAID arrays.\n);

while (!list_empty(all_detected_devices)  i_scanned  INT_MAX) {
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [3.8-rc3 - 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-11-26 Thread Linus Torvalds
On Tue, Nov 26, 2013 at 2:12 PM, Josh Hunt joshhun...@gmail.com wrote:

 I should have clarified that I'm not using dm/md in my setup. I know
 the modules are getting loaded in the log I attached, but root is not
 a md/dm device.

Hmm. The initcall debugging doesn't actually show any of the wait for
async events, because those debug messages come from
do_one_initcall(), and the waiting happens later. Plus your messages
don't actually show where you are trying to - and failing - to mount
the root filesystem.

Without that kind of information, it's kind of hard to guess. Maybe
you could add a few printk's to your kernel? Add one to
do_init_module() *after* the

if (current-flags  PF_USED_ASYNC)
async_synchronize_full();

thing, and another to fs/namespace.c: do_mount() (just put something like

printk(do_mount: %s at %s\n, dev_name, dir_name);

or whatever, so that we can see when that happens..

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] usb: gadget LLVMLinux: Removing the use of VLAIS from the gadget driver

2013-09-23 Thread Linus Torvalds
On Mon, Sep 23, 2013 at 12:30 PM, Felipe Balbi ba...@ti.com wrote:

 I remember there was a discussion of not dropping variable length array
 support, wasn't there ?

We should definitely drop it. The feature is an abomination. I thought
gcc only allowed them at the end of structs, in the middle of a struct
it's just f*cking insane beyond belief.

That said, for *this* particular case, that USB widget driver already
does a ton of small kmalloc's and then remembers the addresses
independently. People may not care about performance, but it *might*
be a good idea to just do one kmalloc()/kfree(), and then still have
those pointer variables, but just be offsets within that one
allocation.

That's what gcc has to basically do for that thing internally
*anyway*, just hidden behind a horrible construct that should never
have existed.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] USB: remove remaining instances of USB_SUSPEND

2013-05-01 Thread Linus Torvalds
On Wed, May 1, 2013 at 9:13 AM, Alan Stern st...@rowland.harvard.edu wrote:

 This patch (as1677) removes the remaining instances of that symbol.

Btw, what are these as ID's, and what does the noise buy us?

Doing a

  git log | egrep -w 'as[0-9]{3,}'

shows that this has been going on forever, but it still doesn't make
any *sense*. It adds nothing worthwhile. In fact, doing

  git log | grep 'This patch ('

shows that you're pretty the only one doing it, but there's *one*
other one jidong xiao: jx001. I really don't see the advantage of
having this kind of random noise in the kernel logs.

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PATCH] USB patches for 3.10-rc1

2013-04-29 Thread Linus Torvalds
On Mon, Apr 29, 2013 at 9:23 AM, Greg KH gre...@linuxfoundation.org wrote:

   git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ 
 tags/usb-3.10-rc1

This has some odd things in it, but I made the mistake of pushing out
my merge before I noticed, so it's out there now regardless.

See here: commit 84ebc10294a3 (USB: remove CONFIG_USB_SUSPEND
option) claims to remove the USB_SUSPEND config option, but doesn't
actually do so.

And no, it's not my merge that is in error, although there may be
merge errors in addition to the oddities I've found. Doing a

   git grep -w USB_SUSPEND 84ebc10294a3

shows that there remains things that have depends on .. USB_SUSPEND
even in the very commit that claims to remove the option. Later
commits then move them around (eg the whole drivers/usb/otg -
drivers/usb/phy movement), but the USB_SUSPEND stays around.

Hmm? I'm assuming you applied a patch without noticing that in the
meantime other users of USB_SUSPEND had grow up?

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PATCH] USB patches for 3.10-rc1

2013-04-29 Thread Linus Torvalds
On Mon, Apr 29, 2013 at 2:14 PM, Alan Stern st...@rowland.harvard.edu wrote:

 What other things seemed odd about Greg's pull request?

The only other thing I noticed was the new CONFIG_USB_PHY quesiton,
which is not something that I think is sensible to ask from a user,
and the help text doesn't really help anything either.

I think the question may make sense, but the wording does not.

*EVERYBODY* wants a USB PHY. You can't have USB without a physical
layer unless it's a purely virtual device. There's one in a EHCI
controller too. It's like a network chip - without a PHY there's no
point. Why ask about whether you want to support a phy or not? The
question makes no sense.

So I don't think the question should be do you want a USB PHY. The
question should be Do you want a driver for some of the specialized
external USB controllers or something like that. Because as it is
now, anybody who actually reads the question is likely to answer y,
I think, even if he just wants one of the *normal* USB chips that
don't split out the PHY.

Hmm? Or does PHY have some magic other meaning in USB circles? In
which case the wording should still be totally redone to not be so
confusing and misleading.

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] USB PM and PM QoS fixes (Re: gpf in pm_qos_remote_wakeup_show)

2013-03-31 Thread Linus Torvalds
On Sun, Mar 31, 2013 at 3:56 PM, Rafael J. Wysocki r...@sisk.pl wrote:

 So, I have two patches (on top of the Linus' tree) that will follow shortly:

Should I take these directly as patches, or expect them to show up in
a pull soon (ie do you have or expect to have other things pending)?

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PATCH] USB patches for 3.9-rc1

2013-02-22 Thread Linus Torvalds
On Fri, Feb 22, 2013 at 4:19 PM, Rafael J. Wysocki r...@sisk.pl wrote:

 It won't revert, there's more stuff on top of it.  And it is a fix, so
 reverting it is not really a good idea anyway.

Rafael, please don't *ever* write that crap again.

We revert stuff whether it fixed something else or not. The rule is
NO REGRESSIONS. It doesn't matter one whit if something fixes
something else or not - if it breaks an old case, it gets reverted.

Seriously. Why do I even have to mention this? Why do I have to
explain this to somebody pretty much *every* f*cking merge window?

This is not a new rule.

And btw, the *reason* for that rule becoming such a hard rule was
pretty much exactly suspend/resume and ACPI. Exactly because we used
to have those infinite let's fix one thing and break another dances.
So you should be well acquainted with the rule, and I'm surprised to
hear that kind of utter garbage from you in particular.

There is no excuse for regressions, and it is a fix is actually the
_least_ valid of all reasons.

A commit that causes a regression is - by definition - not a fix. So
please don't *ever* say something that stupid again.

Things that used to work are simply a million times more important
than things that historically didn't work.

So this had better get fixed asap, and I need to feel like people are
working on it. Otherwise we start reverting.

And no amount but it's a fix matters one whit. In fact, it just
makes me feel like I need to start reverting early, because the
maintainer doesn't seem to understand how serious a regression is.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PATCH] USB patches for 3.9-rc1

2013-02-22 Thread Linus Torvalds
On Fri, Feb 22, 2013 at 4:48 PM, Rafael J. Wysocki r...@sisk.pl wrote:

 The problem is, though, that even if bisection turns up something, it doesn't
 automatically mean that this particular commit is the one that caused the
 problem to happen in the first place.

No, I agree. I just react *very* strongly when somebody starts arguing
against reverting.

The fact that the commit that was bisected fixes another bug in a
previous commit does in fact just imply that that *previous* commit
should perhaps be reverted. Of course, I see that that is the first in
the whole series, so I understand the pain, but even so..

And looking at that commit that makes power resources be treated
specially, that looks completely bogus. It's not what the old code did
either, and it seems to be a total hack for the breakage that just
happens to fix that one machine for purely random reasons, as far as I
can tell. The ACPI_BUS_TYPE_POWER case always had match=1 before, no?

The thing that looks to have been dropped somewhere is that

  .acpi_op_start = !!context,

which first became

  device-add_type = context ? ACPI_BUS_ADD_START : ACPI_BUS_ADD_MATCH;

and then somehow that wasn't needed any more, so it became just an unconditional

  device-add_type = ACPI_BUS_ADD_START;

for unexplained reasons (the commit message says that the argument
context wasn't used, and renames it not_used to reinforce the point,
but doesn't explain why it can remove the use that was there).

That unconditional add_type = ACPI_BUS_ADD_START then later becomes
flags.match_driver = true, which is where the whole match_driver
thing comes in.

Anyway, I don't see how magically it became about ACPI_BUS_TYPE_POWER.
The code is not exactly obvious, though, so whatever. But that
context is suddenly not used commit (e3863094c6f9: ACPI: Drop the
second argument of acpi_bus_scan()) looks odd.

Maybe context was effectively always non-NULL, but it *looks* like
that series of commits is intended to have no semantic changes, and
that's the one that looked very dubious to me.

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PATCH] USB patches for 3.9-rc1

2013-02-21 Thread Linus Torvalds
On Thu, Feb 21, 2013 at 10:40 AM, Greg KH gre...@linuxfoundation.org wrote:

 USB patches for 3.9-rc1

 Here's the big USB merge for 3.9-rc1

 Nothing major, lots of gadget fixes, and of course, xhci stuff.

Ok, so there were a couple of conflicts with Thierry Reding's series
to convert devm_request_and_ioremap() users into
devm_ioremap_resource(), where some of the old users had been
converted to use other helper functions (eg omap_get_control_dev()).

I left the omap_get_control_dev() users alone, but I do want to note
that omap_control_usb_probe() itself now uses that
devm_request_and_ioremap() function. And I did *not* extend the merge
to do that kind of conversion in the helper function, so I'm assuming
Thierry might want to extend his work. Assuming people care enough..

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] init, block: try to load default elevator module early during boot

2013-01-17 Thread Linus Torvalds
And here I was really hoping that there was a third patch in the
series that added the warning...

We don't currently have a am I an async worker helper function for
the warning to use, which is something very much up your alley.

Linus

On Wed, Jan 16, 2013 at 1:30 PM, Tejun Heo t...@kernel.org wrote:
 This patch adds default module loading and uses it to load the default
 block elevator.  During boot, it's called right after initramfs or
 initrd is made available and right before control is passed to
 userland.  This ensures that as long as the modules are available in
 the usual places in initramfs, initrd or the root filesystem, the
 default modules are loaded as soon as possible.

 This will replace the on-demand elevator module loading from elevator
 init path.

 Signed-off-by: Tejun Heo t...@kernel.org
 Cc: Jens Axboe ax...@kernel.dk
 Cc: Arjan van de Ven ar...@linux.intel.com
 Cc: Linus Torvalds torva...@linux-foundation.org
 Cc: Alex Riesen raa.l...@gmail.com
 ---
  block/elevator.c |   16 
  include/linux/elevator.h |1 +
  include/linux/init.h |1 +
  init/do_mounts_initrd.c  |3 +++
  init/initramfs.c |8 +++-
  init/main.c  |   16 
  6 files changed, 44 insertions(+), 1 deletion(-)

 --- a/block/elevator.c
 +++ b/block/elevator.c
 @@ -136,6 +136,22 @@ static int __init elevator_setup(char *s

  __setup(elevator=, elevator_setup);

 +/* called during boot to load the elevator chosen by the elevator param */
 +void __init load_default_elevator_module(void)
 +{
 +   struct elevator_type *e;
 +
 +   if (!chosen_elevator[0])
 +   return;
 +
 +   spin_lock(elv_list_lock);
 +   e = elevator_find(chosen_elevator);
 +   spin_unlock(elv_list_lock);
 +
 +   if (!e)
 +   request_module(%s-iosched, chosen_elevator);
 +}
 +
  static struct kobj_type elv_ktype;

  static struct elevator_queue *elevator_alloc(struct request_queue *q,
 --- a/include/linux/elevator.h
 +++ b/include/linux/elevator.h
 @@ -138,6 +138,7 @@ extern void elv_drain_elevator(struct re
  /*
   * io scheduler registration
   */
 +extern void __init load_default_elevator_module(void);
  extern int elv_register(struct elevator_type *);
  extern void elv_unregister(struct elevator_type *);

 --- a/include/linux/init.h
 +++ b/include/linux/init.h
 @@ -161,6 +161,7 @@ extern unsigned int reset_devices;
  /* used by init/main.c */
  void setup_arch(char **);
  void prepare_namespace(void);
 +void __init load_default_modules(void);

  extern void (*late_time_init)(void);

 --- a/init/do_mounts_initrd.c
 +++ b/init/do_mounts_initrd.c
 @@ -57,6 +57,9 @@ static void __init handle_initrd(void)
 sys_mkdir(/old, 0700);
 sys_chdir(/old);

 +   /* try loading default modules from initrd */
 +   load_default_modules();
 +
 /*
  * In case that a resume from disk is carried out by linuxrc or one of
  * its children, we need to tell the freezer not to wait for us.
 --- a/init/initramfs.c
 +++ b/init/initramfs.c
 @@ -592,7 +592,7 @@ static int __init populate_rootfs(void)
 initrd_end - initrd_start);
 if (!err) {
 free_initrd();
 -   return 0;
 +   goto done;
 } else {
 clean_rootfs();
 unpack_to_rootfs(__initramfs_start, __initramfs_size);
 @@ -607,6 +607,7 @@ static int __init populate_rootfs(void)
 sys_close(fd);
 free_initrd();
 }
 +   done:
  #else
 printk(KERN_INFO Unpacking initramfs...\n);
 err = unpack_to_rootfs((char *)initrd_start,
 @@ -615,6 +616,11 @@ static int __init populate_rootfs(void)
 printk(KERN_EMERG Initramfs unpacking failed: %s\n, 
 err);
 free_initrd();
  #endif
 +   /*
 +* Try loading default modules from initramfs.  This gives
 +* us a chance to load before device_initcalls.
 +*/
 +   load_default_modules();
 }
 return 0;
  }
 --- a/init/main.c
 +++ b/init/main.c
 @@ -70,6 +70,8 @@
  #include linux/perf_event.h
  #include linux/file.h
  #include linux/ptrace.h
 +#include linux/blkdev.h
 +#include linux/elevator.h

  #include asm/io.h
  #include asm/bugs.h
 @@ -794,6 +796,17 @@ static void __init do_pre_smp_initcalls(
 do_one_initcall(*fn);
  }

 +/*
 + * This function requests modules which should be loaded by default and is
 + * called twice right after initrd is mounted and right before init is
 + * exec'd.  If such modules are on either initrd or rootfs, they will be
 + * loaded before control is passed to userland.
 + */
 +void __init load_default_modules(void)
 +{
 +   load_default_elevator_module();
 +}
 +
  static int

Re: [PATCH] async: fix __lowest_in_progress()

2013-01-17 Thread Linus Torvalds
Tejun, mind explaining this one a bit more to me?

If ordering matters, and we have a running queue and a pending queue,
how could the pending queue *ever* be lower than the running one?

That implies that something was taken off the pending queue and put on
the running queue out of order, right?

And that in turn implies that there isn't much of a lowest ordering
at all, so how could anybody even care about what lowest is? It seems
to be a meaningless measure.

So with that in mind, I don't see what semantics the first part of the
patch fixes. Can you explain more?

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] init, block: try to load default elevator module early during boot

2013-01-17 Thread Linus Torvalds
On Thu, Jan 17, 2013 at 10:59 AM, Tejun Heo t...@kernel.org wrote:

 If you're okay with it, I'll route these two and the patches to add
 warning through a wq branch.  There's already a wq/for-3.9 patch which
 am_i_async() can make use of, so it's gonna be easier this way.

Sounds good to me. Thanks,

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] workqueue, async: implement work/async_current_func()

2013-01-17 Thread Linus Torvalds
On Thu, Jan 17, 2013 at 5:25 PM, Tejun Heo t...@kernel.org wrote:
 Implement work/async_current_func() which query whether the current
 task is a workqueue or async worker respectively and, if so, return
 the current function being executed along with work / async item
 related information.

So why the odd interface? The only user of it calls it with a
NULL/NULL pair of arguments, and in general it's just way too complex
to be an exported function at all. I *suspect* you chose that complex
interface because you feel you may have some use for it inside of the
async code itself, but why isn't that then not totally private to
there?

IOW, why isn't the interface just

   static struct worker *current_worker(void)
   {
  if (current-flags  PF_WQ_WORKER)
 return kthread_data(current);
  return NULL;
   }

   int current_is_async(void)
   {
  struct worker *worker = current_worker(void);
  return worker  worker-current_func == async_run_entry_fn;
   }

and that current_is_async() is enough for the exported interface.

Then, if you actually want to care about the work itself, you can use
that same current_worker() helper function, and look at the
different worker fields more. But why export that kind of logic?

Am I missing something?

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] workqueue, async: implement work/async_current_func()

2013-01-17 Thread Linus Torvalds
On Thu, Jan 17, 2013 at 7:04 PM, Tejun Heo t...@kernel.org wrote:

 Another thing is that it seems like having introspection type
 interface often lead to abuses - work_pending(), work_busy() both
 ended up bringing more unnecessary dependencies and subtle bugginess
 on internal details than actual benefits.  Querying %current is much
 less likely to be harmful in itself but I'm afraid it might encourage
 its users to develop something crazy on top.  It might be a good idea
 to make it only available to async.

I'm not sure I understand what you mean? Do you mean trying to limit
work_current_func() to only be accessible to the async code? You'd
have to make some kind of private header file under kernel/ for that,
but I guess that would work fine. We already do something similar
inside filesystems etc, where they have their own local headers.

I still don't really see what other data the async code could possibly
ever want than the is this an async thread I guess if you want to
keep all these things private to their own C files with no leaving of
the structure definitions (even within just a private kernel/worker.h
file or something) you could make the interface be one like

 - kernel/workqueue.c:

  int current_uses_workfn(work_func_t match)
  {
 if (current-flags  PF_WQ_WORKER) {
struct worker *worker = kthread_data(current);
return worker  match == worker-current_func;
 }
 return 0;
  }

 - kernel/async.c:

  int current_is_async(void)
  {
 return current_uses_workfn(async_run_entry_fn);
  }

but quite frankly, we've generally tried to avoid those kinds of silly
wrappers just because it's very wasteful to do two function calls just
to hide some detail like this. The code generation is atrocious,
jumping around for no good reason just increases cache pressure etc.

Yes, yes, some globally optimizing compiler could sort it all out, but
I'd personally be inclined to just move all the structure definitions
into kernel/worker.h, and make the code be inline functions. The only
actual current *user* would also be in the kernel/ subdirectory, and
we don't know if we'd ever want to really expand it past there.

Hmm?

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-16 Thread Linus Torvalds
On Wed, Jan 16, 2013 at 9:03 AM, Arjan van de Ven ar...@linux.intel.com wrote:

 we can even try twice

 the first time right after we mount the initramfs
 the second time when the initramfs code exits, and before we exec init
 (the initramfs supposedly mounted the real root fs at this point)

Yes. This, together with don't try request_module for the default
elevator, and the warn if somebody does request_module from async
context would, I think, be the right thing to do.

In the meantime, I've applied Tejun's patch. It possibly speeds things
up regardless of this particular deadlock thing, and while it's not
pretty it certainly isn't horribly nasty or very invasive either, so I
don't see any reason to delay it just because there might be a better
solution some day.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
[ Added Tejun to the discussion, since he's the async go-to-guy ]

On Mon, Jan 14, 2013 at 10:23 PM, Ming Lei ming@canonical.com wrote:

 But I have another idea to address the problem, and let module code call
 async_synchronize_full() only if the module requires that explicitly, so how
 about the below draft patch?

No way.

This kind of let's just let drivers tell us when they used async
helpers is basically *asking* for buggy code. In fact, just to prove
how bad it is, YOU SCREWED IT UP YOURSELF.

Because it's not just sd.c that uses async_schedule(), and would need
the async synchronize. It's floppy.c, it's generic scsi scanning (so
scsi tapes etc), and it's libata-core.c.

This kind of let's randomly encourage people to write subtly buggy
code that has magical timing dependencies, so that the developer won't
likely even see it because he has fast disks etc code is totally
unacceptable. And this code was *designed* to be that kind of buggy.

No, if we set a flag like this, then it needs to be set
*automatically*, so that a module cannot screw this up by mistake.

It could be as simple as having a per-thread flag that gets set by the
__async_schedule() function, and gets cleared by fork. Then the module
code could do something like

   /* before calling the module -init function */
   current-used_async = 0;
   ...
   if (current-used_async)
  async_synchronize_full();

or whatever.

Tejun, comments? You can see the whole thread on lkml, but the basic
problem is that the module loading doing the unconditional
async_synchronize_full() has caused problems, because we have

 - load module A
   - module A does per-controller async discovery of its devices (eg
scsi or ata probing)
   - in the async thread, it initializes somethign that needs another
module B (in this case the default IO scheduler module)
  - modprobe for B loads the IO scheduler module successfully
  at the end of the module load, it does
async_synchronize_full() to make sure load_module won't return before
the module is ready
  *DEADLOCK*, because the async_synchronize_full() thing
actually waits for not the module B async code (it didn't have any),
but for the module *A* async code, which is waiting for module B to
finish.

Now, I'll happily argue that we shouldn't have this kind of load
modules from random context behavior in the kernel, and I think the
block layer is to blame for doing the IO scheduler load at an insane
time. So don't do that then would be the best solution. Sadly, we
don't even have a good way to notice that we're doing it, so hacky
workaround that at least doesn't require driver authors to care is
likely the second-best workaround.

But the hacky workaround absolutely needs to be *automatic*. Because
the driver writers need to get this subtle untestable thing right is
*not* acceptable. That's the patch that Ming Lei did, and I refuse to
have that kind of fragile crap in the kernel.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 9:36 AM, Linus Torvalds
torva...@linux-foundation.org wrote:

 This kind of let's randomly encourage people to write subtly buggy
 code that has magical timing dependencies, so that the developer won't
 likely even see it because he has fast disks etc code is totally
 unacceptable. And this code was *designed* to be that kind of buggy.

Btw, we could *possibly* do this the other way around. Wait for all
async work by default, but then have a really hacky way to turn that
off for modules that explicitly don't want it, because they know they
can be loaded in async context, and they don't do any async work
themselves. Then we could make the IO schedulers set that flag (I
know I'm loaded from async space, and I know I'm not myself doing any
async init)

Quite frankly, I'd still much rather prefer the automated approach -
or even better, just avoiding the load modules in async context
entirely. But at least the I can put a huge comment about why I don't
want to be waited on would be much more acceptable than the I need
to explicitly tell the world that it needs to wait on me.

So Ming Lei's patch was easily subtly buggy by mistake (showing that
by the fact that it was indeed buggy), while the opposite model where
you have to explicitly ask people not to wait for you could still be
very buggy, but at least now it needs to explicitly do extra work in
order to be buggy.

So if an interface is fragile, it should aim to be fragile in the
right way - making the fragility explicit, so that people can grep for
it, and people can add comments to the particular code that marks it
fragile. The default behavior should be the robust one.

And if would be lovely to add a warning to the people loaded a module
from async context case, so that we'd *see* this.

Tejun, is there a good way for code to see I'm running in async
context? Then we could do something like

WARN_ON_ONCE(wait  system_state == SYSTEM_RUNNING  in_async_thread());

in kernel/kmod.c (__request_module()). That should at least warn about
this whole issue happening.

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 10:32 AM, Tejun Heo t...@kernel.org wrote:

 I think the root problem here, apart from request_module() from block
 - which is a bit nasty but making that part completely async would too
 be quite nasty albeit in a different way - is that
 async_synchronize_full() is way too indescriminate.  It's something
 only suitable for things like the end of system init.

 I'm wondering whether what we need is a rudimentray nesting like the
 following.

I think that is a good solution if it works, but look out: we need to
synchronize across *all* domains, not just the default one.  The sd.c
code, for example, uses its own scsi_sd_probe_domain for example,
and we *do* want to synchronize with it.

Can you do that with your suggested interface (ie it would have to be
a *global* sequence number).

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 3:50 PM, Tejun Heo t...@kernel.org wrote:

 For now, I'm gonna implement simple I'm not gonna wait for myself
 self-deadlock avoidance.

You can't really do that. Or rather, it won't *help*.

The thing is, the module loading in particular is not necessarily
happening in the same context as what *started* the module loading. A
module loader will request the module from user space, and then later
user space - through possibly a totally unrelated process - will
finish it. So there is no myself. There's not even necessarily any
relationship that the kernel even knows about, because the module
loading request can have gone from usermode_helper over something like
dbus to systemd.

See?

There's a reason I asked for a warning for this. Or the let's flag
the current thread if it ever started anything asynchronous. Because
it's complicated.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 4:36 PM, Linus Torvalds
torva...@linux-foundation.org wrote:

 There's a reason I asked for a warning for this. Or the let's flag
 the current thread if it ever started anything asynchronous. Because
 it's complicated.

Btw, the sequence counter (that is *not* taking anything else into
account) is good enough in practice, exactly because the common case
for module loading is actually that nothing in the module init
sequence is done asynchronously.

Yes, device discovery (particularly for block devices) is often
asynchronous. But the modules it then asks to load usually wouldn't
be. So if we just have the flag did this thread ever even start async
work over the module init sequence, we can just avoid the async
serialization entirely for that case, and it breaks the deadlock chain
nicely in practice.

Only of a block device does async work and then wants to load another
module that does more async work in its init routine would it then
break. But at that point, I'll happily just put my foot down and tell
people they are crazy, and Let's not do that kind of crap.

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 6:52 PM, Tejun Heo t...@kernel.org wrote:

 It makes me feel dirty but makes the problem go away and I can't think
 of anything better, so here is the implementation of used async
 workaround.

Ok, people, can we get a tested-by (or Nope, doesn't work) from the
people who saw this?

That said, maybe we could just make the rule be that you can't pick a
default IO scheduler that is modular.

And I *would* like to see the warning we discussed. Maybe there are
other situations that can trigger this?

Because something like that

WARN_ON_ONCE(wait  i_am_async()  system_state == SYSTEM_RUNNING);

in kernel/kmod.c (__request_module()) still sounds like a good idea to
verify that this is the only thing that triggers it (of course, we'd
need to somehow avoid the warning for the known case with the known
workaround).

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 7:25 PM, Tejun Heo t...@kernel.org wrote:
 Hello, Linus.

 On Tue, Jan 15, 2013 at 07:00:31PM -0800, Linus Torvalds wrote:
 That said, maybe we could just make the rule be that you can't pick a
 default IO scheduler that is modular.

 This is definitely much more preferable but it would affect use case
 where everything is built modular and the elevator is selected via
 kernel param.  This is way outside the usual usage and we can warn
 about the new behavior but it still is an observable behavior change.
 Do you think this would be okay?

I do want the same user-visible semantics, so it's not some one-liner.

The compiled-in elevator would be easy enough to handle in the Kconfig
file (maybe we do already, I didn't even bother to check). The real
problem is the chosen_elevator one, which is dynamic with the kernel
command line. And we could handle that one by just trying to load the
module early (but exactly _when_?) and then instead of looking things
up by name, just keep a pointer to the default elevator around.

But no, it's not just some trivial one-liner. Especially the question
about when to try to load the module that is given on the kernel
command line is not trivial. Do we require that the module is in the
initrd and loadable basically immediate at boot? Do we try again after
switching the root filesystem? Things like that..

 And then this warning can be added without introducing
 request_module_but_dont_warn_about_being_called_from_async().

I do agree that it would be much nicer that way.

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Linus Torvalds
On Sun, Jan 13, 2013 at 11:15 PM, Ming Lei ming@canonical.com wrote:

 The deadlock problem is caused by calling request_module() inside
 async function of do_scan_async(), and it was introduced by Linus's
 below commit:

 commit d6de2c80e9d758d2e36c21699117db6178c0f517
 Author: Linus Torvalds torva...@linux-foundation.org
 Date:   Fri Apr 10 12:17:41 2009 -0700

 async: Fix module loading async-work regression

 IMO, maybe the commit isn't a proper fix, considered the
 below fact:

 - it isn't good to allow async function to be marked as __init

Immaterial. For modules, __init is a non-issue. For non-modules, the
synchronization elsewhere is sufficient.

 - any user mode shouldn't expect that the device is ready just
 after completing of 'insmod'

Bullshit. That expectation is just a fact. People insmod a device
driver, and mount the device immediately in scripts.

We do not say user mode shouldn't. Seriously. EVER. User mode
*does*, and we deal with it. Learn it now, and stop ever saying that
again.

This is really starting to annoy me. Kernel developers who say user
mode should be fixes to not do that should go somewhere else. The
whole and *only* point of a kernel is to hide these kinds of issues
from user mode, and make things just work in user mode. User mode
should not ever worry about oh, doing X can trigger a module load, so
now the device might not be available immediately, so I should delay
and re-try until it is.

That's just f*cking crazy talk.

We have a very simple rule in the kernel: we don't break user space. EVER.

Learn that rule. I don't ever want to hear any user mode shouldn't
expect again. User mode *does* expect. End of discussion.

 - from view of driver, introducing async_synchronize_full() after
 do_one_initcall() inside do_init_module() is like a sync probe
 for drivers built as module, and cause this kind of deadlock easily.

 So could we revert the commit and fix the previous problems just
 case by case? or other better fix?

There's no way in hell we take a fix things one by one approach.
It's not going to work. And your suggestion seems to not do async
discovery of devices in general, which is a *much* worse fix than
anything else. It's just crazy.

But there are other approaches we might take. We might move the call to

async_synchronize_full();

to other places. For example, maybe we're better off doing it at
block/char device open instead?

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: USB device cannot be reconnected and khubd blocked for more than 120 seconds

2013-01-14 Thread Linus Torvalds
On Mon, Jan 14, 2013 at 10:04 AM, Alan Stern st...@rowland.harvard.edu wrote:

 How about skipping that call if the current thread is one of the async
 helpers?  Is it possible to detect when that happens?

 Or maybe such a check should go inside async_synchronize_full() itself.

Do we have some idea of exactly what is waiting for what? Which async
context is causing the module load to happen in the first place?

I think *that* is what we should avoid - it sounds like the block
layer is loading the IO scheduler at the wrong point. I realize that
people like (for testing purposes) to change the IO scheduler at
random, but if that means that any IO can basically result in a
request_module(), then that sounds like a problem.

It seems to be elevator_get(), and I presume the chain is something
like load block driver async, the block driver does
blk_init_allocated_queue, that does request_module() to find the
elevator, the request_module() succeeds, but ends up waiting for async
work, which is the block driver load, which is waiting for the
request_module to finish.

And my gut feel is that blk_init_allocated_queue() probably shouldn't
do that request_module() at all. We migth want to do it when we *open*
the device, but not while loading the module for the device.

So my _feeling_ is that this is just a bug in the block layer, and
that it shouldn't set up block device drivers for this kind of crazy
need to load the elevator module while in the middle of scanning
devices. I think *that* is what we should aim to change.

Hmm?

That said, I think it might indeed be a good idea to make this problem
much easier to see, and that detect when it happens would be a good
thing (and then we should WARN_ON_ONCE() on people trying to do
request_module() calls from async context).

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 00/13] firmware loader: introduce cache/uncache firmware

2012-07-25 Thread Linus Torvalds
On Wed, Jul 25, 2012 at 5:35 AM, Ming Lei ming@canonical.com wrote:

 The below patch should fix the problem above.

Actually, I think we could make this even simpler.

There's nothing wrong with saying user mode is enabled *just* before
we unthaw things, if we also simply guarantee that there is no
sleeping lock or similar that we might get caught on (causing a
deadlock or other untimely waking) before we've actually thawed
everything.

So *if* the only problem wrt the USB hub code comes from this area,
then I think the solution might be as simple as just moving the
usermodehelper_enable() up a few lines, with a comment. Something
like the *untested* and whitespace-damaged thing below..

Rafael? Who has one of those isight things and has seen the warning to test?

   Linus

---
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 19db29f67558..5bf50e488196 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -181,6 +181,12 @@ void thaw_processes(void)
pm_freezing = false;
pm_nosig_freezing = false;

+   /*
+* User mode helper are available again (or will be,
+* modulo scheduling)
+*/
+   usermodehelper_enable();
+
oom_killer_enable();

printk(Restarting tasks ... );
@@ -193,8 +199,6 @@ void thaw_processes(void)
} while_each_thread(g, p);
read_unlock(tasklist_lock);

-   usermodehelper_enable();
-
schedule();
printk(done.\n);
 }
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] firmware load: defer request_firmware during early boot and resume

2012-07-22 Thread Linus Torvalds
On Sun, Jul 22, 2012 at 5:58 AM, Borislav Petkov b...@alien8.de wrote:

 Question: is there any other reason

   [besides maybe embedded people who care about each single Kb of memory
on the system]

 why we don't make this cache/uncache firmware thing *implicit*? That is,
 load it once at driver open time and keep it in memory during the whole
 driver's lifetime. And this all taken care of by the driver core, btw.

So some firmware is a *lot* more than a few kB. We're talking
hundreds of kB, sometimes more. And to make matters worse, we keep it
in memory with vmalloc(), which is a limited resource on 32-bit
systems. So it can actually be worse than just the memory use itself.

Also, as you already mentioned, there's the cache coherency issue.
There are real advantages to reloading the firmware occasionally,
because it might have changed on disk. So..

I do think we might want to consider it, although I do suspect that we
do want to throw it out because of the problems with infinite caches.

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] firmware load: defer request_firmware during early boot and resume

2012-07-21 Thread Linus Torvalds
On Fri, Jul 20, 2012 at 5:33 AM, Ming Lei tom.leim...@gmail.com wrote:
 The RFC patch is just for discussing if the idea of deferring
 request_firmware is doable for addressing the issue of
 request_firmware in resume path, which is caused by driver
 unbind/rebind during resume.

NAK.

It really *has* to be handled some other way.

This whole approach seems to actively try to *silently* fix up broken
drivers. And it's wrong.

There's a reason we warn, and there's a reason we *have* to warn: to
let driver writers know that what they are attempting to do MAY NOT
WORK.

Really.

Sure, for a lot of devices it's fine to load the firmware later. But
some devices may be part of the resume sequence in very critical ways,
and deferring the firmware loading will just mean that the resume will
fail.

This we *need* the WARN_ON() - so that even in the case where it
happens to work, people are very much told that sure, your
suspend/resume may have worked, but it was doing fundamentally wrong
things that may mean that for other people it *won't* work.

For example, maybe it's a USB network dongle, and for *YOU* it is
perfectly fine to defer the firmware loading, so that the network
comes back up a few seconds after the system is up and running.

But in another machine, that exact same USB network dongle may
actually be hardwired on the motherboard (it's fairly common to use
USB as a system bus in some laptop and embedded devices), and maybe
that other machine actually is a thin client that has some tiny rdinit
thing, and then everything else is NFS-mounted, and if you resume
without networking, the machine is simply *dead*.

Ok, so that was a completely made-up example, but we have actually had
situations kind of like that, where a device is just not that
important for lots of people, but in other situations it's critical
for the rest of the suspend/resume to succeed.

This is why I'm so vehemently against silently hiding these issues.

If you have a driver that has problems, make THAT ONE PARTICULAR
driver do the deferral explicitly. Don't make some generic silently
defer if there are problems patch.

See what I'm saying? You're solving things in exactly the wrong place,
and in exactly the wrong way. You're papering things over, and making
the generic code silently just make broken cases work. That's really
really bad, because it makes it *easier* for driver writers to do the
wrong thing without even thinking about it, and without ever seeing
the problem. And then when people say suspend/resume doesn't work,
the driver author says it works for me and ignores the problem.
Because you've systemically made it easy to ignore the problem, and
made it easy to do the wrong thing by default.

So we should make driver writers do the right thing by default, and if
they cannot do the right thing (and the isight camera always comes
up, and f*ck it, just fix that driver) then they should do extra work.

Seriously. People should load their firmware *before* the
suspend/resume cycle. And if that isn't possible, then the system
should ABSOLUTELY NOT silently say whatever and defer it until
later.  We should have that big failure and the big noisy warning, and
drivers that need to defer need to do so themselves, so that we never
*ever* have that silent automatic defer situation.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-usb in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html