Re: Bug#986561: linux: Regression in drivers/hid/hid-dr.c causing horizontal D-pad to malfunction on SNES joystick

2021-04-14 Thread Ioan-Adrian Ratiu

Hi,

On Wed, 14 Apr 2021, Salvatore Bonaccorso  
wrote:
Hi Ioan-Adrian, 

On Wed, Apr 07, 2021 at 02:47:24PM +0200, Alessandro Grassi 
wrote: 
Source: linux Severity: normal Tags: upstream X-Debbugs-Cc: 
alessan...@aggro.it  Greetings,  I am encountering the issue 
described in this thread[1], using a gamepad identified as 
"DragonRise" with USB ID 0079:0011.   The joypad works as 
intended except for the D-pad: up and down are detected in 
jstest (though misinterpreted: the input graph shows the points 
in the left up/down corners instead of the center), the left 
and right buttons are completely ignored.   Running 
'input-events' shows events 0/127 and 255/127 on up and down 
respectively, nothing at all on left and right.   I was able to 
identify that the misbehaviour was caused by this commit[2] on 
the kernel source tree. To determine this I have rebuilt the 
Debian kernel using hid-dr.c from the previous commit[3] and 
loaded hid-dr.ko manually, with which the gamepad worked as 
intended. I have replaced the file again with the one from the 
breaking commit iself ([2]) and the behaviour was again broken. 
Furthermore, to confirm that that was the breaking commit, I 
have commented line 315 (the input mapping one in the struct) 
from the current Debian source tree and rebuilt it, the joypad 
works as it should.   Regards, Alessandro  [1]: 
https://retropie.org.uk/forum/topic/25657/controler-issue-no-left-and-right-not-working-at-all 
[2]: 
https://github.com/torvalds/linux/commit/e15944099870f374ca7efc62f98cf23ba272ef43 
[3]: 
https://github.com/torvalds/linux/commit/313726cad3b68039c8e4dcad5a2840a0d375678c 


A user in Debian reported that e15944099870 ("HID: hid-dr: add 
input mapping for axis selection") introduced a regression, 
described above. 

Does this ring some bell to you? 


Unfortunately no and I do not have the HW to test anymore.

It is possible that change introduced a regression on newer 
"DragonRise" gamepads and maybe that mapping logic needs to be a 
bit more complex, depending on the HW differences.


Sorry I can't be more helpful,
Adrian



Regards,
Salvatore


Re: [RFC][PATCH] Revert "ARM: dts: bcm2837: Fix polarity of wifi reset GPIOs"

2019-01-28 Thread Ioan-Adrian Ratiu

Hi Stefan and thank you for looking into this,

On Sun, 27 Jan 2019, Stefan Wahren  wrote:
Hi Ioan-Adrian, 

Ioan-Adrian Ratiu  hat am 27. Januar 2019 um 
21:28 geschrieben:   This reverts commit 
bea8a160c621d19f7f78b13e14e03f4b8e44cd4b.   Contrary to what 
the commit message says, on my rpi 3 b v1.2 changing the 
polarity causes the exact behaviour this commit intends to fix, 
as described at the referenced link below (wlan0 disapears). 
With reset-gpios = ... GPIO_ACTIVE_HIGH, brcmfmac errors in 
dmesg:  [7.977512] brcmfmac: brcmf_sdio_bus_sleep: error 
while changing bus sleep state -110 [7.977623] brcmfmac: 
brcmf_sdio_txfail: sdio error, abort command and terminate 
frame [7.978007] brcmfmac: brcmf_sdio_txfail: sdio error, 
abort command and terminate frame [7.978377] brcmfmac: 
brcmf_sdio_txfail: sdio error, abort command and terminate 
frame [7.978724] brcmfmac: brcmf_sdio_dpc: failed backplane 
access over SDIO, halting operation [7.978734] brcmfmac: 
brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg failed 
w/status -110 [7.978747] brcmfmac: 
brcmf_cfg80211_get_channel: chanspec failed (-110) [ 
7.982817] brcmfmac: brcmf_sdio_bus_sleep: error while changing 
bus sleep state -110 [7.982880] brcmfmac: 
brcmf_sdio_txfail: sdio error, abort command and terminate 
frame [7.983255] brcmfmac: brcmf_sdio_txfail: sdio error, 
abort command and terminate frame  The only solution I 
currently have is to revert and everything works as expected 
and as before changing the polarity.   Link: 
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911443 
Signed-off-by: Ioan-Adrian Ratiu  --- 


reverting this commit will only workaround the real issue. It 
lies in the sdhci-iproc driver. 

Could please try this instead patch [1]? 


Yes, the patch fixes the bug and it's also in Linus' master branch 
in v5.0-rc3 as commit 2bd44da ("mmc: sdhci-iproc: handle 
mmc_of_parse() errors during prob").


There was just a minor conflict cherry-picking it to 4.19.y 
(sdhci_get_property -> sdhci_get_of_property).


I would really like to have 2bd44da backported to the 4.19 stable 
tree to have wifi working again. Please?


Tested-by: Ioan-Adrian Ratiu  #4.19.18



[1] - https://patchwork.kernel.org/patch/10741809/


Re: [RFC][PATCH] Revert "ARM: dts: bcm2837: Fix polarity of wifi reset GPIOs"

2019-01-27 Thread Ioan-Adrian Ratiu

Link to the full 4.19.18 config I'm using:

https://drive.google.com/open?id=1ZI3MeGB2fkYMsEjzGQYXUk2wqr0h9h7R

On Sun, 27 Jan 2019, Ioan-Adrian Ratiu  wrote:

This reverts commit bea8a160c621d19f7f78b13e14e03f4b8e44cd4b.

Contrary to what the commit message says, on my rpi 3 b v1.2 changing
the polarity causes the exact behaviour this commit intends to fix, as
described at the referenced link below (wlan0 disapears).

With reset-gpios = ... GPIO_ACTIVE_HIGH, brcmfmac errors in dmesg:

[7.977512] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep 
state -110
[7.977623] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978007] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978377] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978724] brcmfmac: brcmf_sdio_dpc: failed backplane access over SDIO, 
halting operation
[7.978734] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg 
failed w/status -110
[7.978747] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[7.982817] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep 
state -110
[7.982880] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.983255] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame

The only solution I currently have is to revert and everything works
as expected and as before changing the polarity.

Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911443
Signed-off-by: Ioan-Adrian Ratiu 
---
 arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts | 2 +-
 arch/arm/boot/dts/bcm2837-rpi-3-b.dts  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts 
b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
index 93762244be7f..4adb85e66be3 100644
--- a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
+++ b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
@@ -31,7 +31,7 @@
 
 	wifi_pwrseq: wifi-pwrseq {

compatible = "mmc-pwrseq-simple";
-   reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
+   reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
};
 };
 
diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b.dts b/arch/arm/boot/dts/bcm2837-rpi-3-b.dts

index 89e6fd547c75..c318bcbc6ba7 100644
--- a/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
+++ b/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
@@ -26,7 +26,7 @@
 
 	wifi_pwrseq: wifi-pwrseq {

compatible = "mmc-pwrseq-simple";
-   reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
+   reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
};
 };
 
--

2.20.1


[RFC][PATCH] Revert "ARM: dts: bcm2837: Fix polarity of wifi reset GPIOs"

2019-01-27 Thread Ioan-Adrian Ratiu
This reverts commit bea8a160c621d19f7f78b13e14e03f4b8e44cd4b.

Contrary to what the commit message says, on my rpi 3 b v1.2 changing
the polarity causes the exact behaviour this commit intends to fix, as
described at the referenced link below (wlan0 disapears).

With reset-gpios = ... GPIO_ACTIVE_HIGH, brcmfmac errors in dmesg:

[7.977512] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep 
state -110
[7.977623] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978007] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978377] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978724] brcmfmac: brcmf_sdio_dpc: failed backplane access over SDIO, 
halting operation
[7.978734] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg 
failed w/status -110
[7.978747] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[7.982817] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep 
state -110
[7.982880] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.983255] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame

The only solution I currently have is to revert and everything works
as expected and as before changing the polarity.

Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911443
Signed-off-by: Ioan-Adrian Ratiu 
---
 arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts | 2 +-
 arch/arm/boot/dts/bcm2837-rpi-3-b.dts  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts 
b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
index 93762244be7f..4adb85e66be3 100644
--- a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
+++ b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
@@ -31,7 +31,7 @@
 
wifi_pwrseq: wifi-pwrseq {
compatible = "mmc-pwrseq-simple";
-   reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
+   reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
};
 };
 
diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b.dts 
b/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
index 89e6fd547c75..c318bcbc6ba7 100644
--- a/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
+++ b/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
@@ -26,7 +26,7 @@
 
wifi_pwrseq: wifi-pwrseq {
compatible = "mmc-pwrseq-simple";
-   reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
+   reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
};
 };
 
-- 
2.20.1



[PATCH] usb: dwc2: detect power supplies to reduce spam

2018-03-03 Thread Ioan-Adrian Ratiu
By statically hardcoding at compile time the number of supplies
("#define DWC2_NUM_SUPPLIES ARRAY_SIZE(dwc2_hsotg_supply_names)"), the
driver assumed that every controller uses supplies and issued a
warning if none were detected via the device tree [1], even though the
vast majority of devices (with 1 exception from Samsung) don't need
nor use them.

So to return to normality and stop warning everyone unconditionally,
detect if there are any supplies and no-op just like the dummy
regulator which got loudly auto-asigned does.

This issue has been previously discussed based on an alternative fix
at [2] a year back but nothing came out of it then.

[1]
dwc2 3f98.usb: 3f98.usb supply vusb_d not found, using dummy regulator
dwc2 3f98.usb: 3f98.usb supply vusb_a not found, using dummy regulator

[2]
https://www.spinics.net/lists/linux-usb/msg153010.html

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/usb/dwc2/core.h |  5 +++--
 drivers/usb/dwc2/platform.c | 46 +++--
 2 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index cd77af3b1565..c50b9fc4a162 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -128,7 +128,7 @@ static const char * const dwc2_hsotg_supply_names[] = {
"vusb_a",   /* analog USB supply, 1.1V */
 };
 
-#define DWC2_NUM_SUPPLIES ARRAY_SIZE(dwc2_hsotg_supply_names)
+#define DWC2_MAX_SUPPLIES ARRAY_SIZE(dwc2_hsotg_supply_names)
 
 /*
  * EP0_MPS_LIMIT
@@ -921,7 +921,8 @@ struct dwc2_hsotg {
struct phy *phy;
struct usb_phy *uphy;
struct dwc2_hsotg_plat *plat;
-   struct regulator_bulk_data supplies[DWC2_NUM_SUPPLIES];
+   struct regulator_bulk_data supplies[DWC2_MAX_SUPPLIES];
+   u8 num_supplies;
u32 phyif;
 
spinlock_t lock;
diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
index 4703478f702f..3acf658af4e9 100644
--- a/drivers/usb/dwc2/platform.c
+++ b/drivers/usb/dwc2/platform.c
@@ -126,10 +126,12 @@ static int __dwc2_lowlevel_hw_enable(struct dwc2_hsotg 
*hsotg)
struct platform_device *pdev = to_platform_device(hsotg->dev);
int ret;
 
-   ret = regulator_bulk_enable(ARRAY_SIZE(hsotg->supplies),
-   hsotg->supplies);
-   if (ret)
-   return ret;
+   if (hsotg->num_supplies) {
+   ret = regulator_bulk_enable(hsotg->num_supplies,
+   hsotg->supplies);
+   if (ret)
+   return ret;
+   }
 
if (hsotg->clk) {
ret = clk_prepare_enable(hsotg->clk);
@@ -186,8 +188,9 @@ static int __dwc2_lowlevel_hw_disable(struct dwc2_hsotg 
*hsotg)
if (hsotg->clk)
clk_disable_unprepare(hsotg->clk);
 
-   ret = regulator_bulk_disable(ARRAY_SIZE(hsotg->supplies),
-hsotg->supplies);
+   if (hsotg->num_supplies)
+   ret = regulator_bulk_disable(hsotg->num_supplies,
+hsotg->supplies);
 
return ret;
 }
@@ -210,6 +213,7 @@ int dwc2_lowlevel_hw_disable(struct dwc2_hsotg *hsotg)
 
 static int dwc2_lowlevel_hw_init(struct dwc2_hsotg *hsotg)
 {
+   struct regulator *reg;
int i, ret;
 
hsotg->reset = devm_reset_control_get_optional(hsotg->dev, "dwc2");
@@ -290,16 +294,30 @@ static int dwc2_lowlevel_hw_init(struct dwc2_hsotg *hsotg)
dev_dbg(hsotg->dev, "cannot get otg clock\n");
}
 
-   /* Regulators */
-   for (i = 0; i < ARRAY_SIZE(hsotg->supplies); i++)
-   hsotg->supplies[i].supply = dwc2_hsotg_supply_names[i];
+   /* Regulators, vast majority of dwc2 devices don't use them at all */
+   for (i = 0; i < DWC2_MAX_SUPPLIES; i++) {
+   reg = regulator_get_optional(hsotg->dev,
+dwc2_hsotg_supply_names[i]);
+
+   /* All or nothing (bulk): regs either are or aren't present */
+   if (IS_ERR(reg)) {
+   while (--i >= 0) {
+   regulator_put(hsotg->supplies[i].consumer);
+   hsotg->supplies[i].consumer = NULL;
+   hsotg->supplies[i].supply = NULL;
+   --hsotg->num_supplies;
+   }
+   break;
+   }
 
-   ret = devm_regulator_bulk_get(hsotg->dev, ARRAY_SIZE(hsotg->supplies),
- hsotg->supplies);
-   if (ret) {
-   dev_err(hsotg->dev, "failed to request supplies: %d\n", ret);
-   return ret;
+   hsotg->supplies[i].consumer =

[PATCH v3 1/2] ALSA: usb-audio: Fix irq/process data synchronization

2017-01-04 Thread Ioan-Adrian Ratiu
Commit 16200948d83 ("ALSA: usb-audio: Fix race at stopping the stream") was
incomplete causing another more severe kernel panic, so it got reverted.
This fixes both the original problem and its fallout kernel race/crash.

The original fix is to move the endpoint member NULL clearing logic inside
wait_clear_urbs() so the irq triggering the urb completion doesn't call
retire_capture/playback_urb() after the NULL clearing and generate a panic.

However this creates a new race between snd_usb_endpoint_start()'s call
to wait_clear_urbs() and the irq urb completion handler which again calls
retire_capture/playback_urb() leading to a new NULL dereference.

We keep the EP deactivation code in snd_usb_endpoint_start() because
removing it will break the EP reference counting (see [1] [2] for info),
however we don't need the "can_sleep" mechanism anymore because a new
function was introduced (snd_usb_endpoint_sync_pending_stop()) which
synchronizes pending stops and gets called inside the pcm prepare callback.

It also makes sense to remove can_sleep because it was also removed from
deactivate_urbs() signature in [3] so we benefit from more simplification.

[1] commit 015618b90 ("ALSA: snd-usb: Fix URB cancellation at stream start")
[2] commit e9ba389c5 ("ALSA: usb-audio: Fix scheduling-while-atomic bug in PCM 
capture stream")
[3] commit ccc1696d5 ("ALSA: usb-audio: simplify endpoint deactivation code")

Fixes: f8114f8583bb ("Revert "ALSA: usb-audio: Fix race at stopping the 
stream"")

Signed-off-by: Ioan-Adrian Ratiu 
---
 sound/usb/endpoint.c | 17 +++--
 sound/usb/endpoint.h |  2 +-
 sound/usb/pcm.c  | 10 +-
 3 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index 15d1d5c63c3c..2f0ea70a998c 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -534,6 +534,11 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
alive, ep->ep_num);
clear_bit(EP_FLAG_STOPPING, &ep->flags);
 
+   ep->data_subs = NULL;
+   ep->sync_slave = NULL;
+   ep->retire_data_urb = NULL;
+   ep->prepare_data_urb = NULL;
+
return 0;
 }
 
@@ -912,9 +917,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
 /**
  * snd_usb_endpoint_start: start an snd_usb_endpoint
  *
- * @ep:the endpoint to start
- * @can_sleep: flag indicating whether the operation is executed in
- * non-atomic context
+ * @ep: the endpoint to start
  *
  * A call to this function will increment the use count of the endpoint.
  * In case it is not already running, the URBs for this endpoint will be
@@ -924,7 +927,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
  *
  * Returns an error if the URB submission failed, 0 in all other cases.
  */
-int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, bool can_sleep)
+int snd_usb_endpoint_start(struct snd_usb_endpoint *ep)
 {
int err;
unsigned int i;
@@ -938,8 +941,6 @@ int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, 
bool can_sleep)
 
/* just to be sure */
deactivate_urbs(ep, false);
-   if (can_sleep)
-   wait_clear_urbs(ep);
 
ep->active_mask = 0;
ep->unlink_mask = 0;
@@ -1020,10 +1021,6 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep)
 
if (--ep->use_count == 0) {
deactivate_urbs(ep, false);
-   ep->data_subs = NULL;
-   ep->sync_slave = NULL;
-   ep->retire_data_urb = NULL;
-   ep->prepare_data_urb = NULL;
set_bit(EP_FLAG_STOPPING, &ep->flags);
}
 }
diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
index 6428392d8f62..584f295d7c77 100644
--- a/sound/usb/endpoint.h
+++ b/sound/usb/endpoint.h
@@ -18,7 +18,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
struct audioformat *fmt,
struct snd_usb_endpoint *sync_ep);
 
-int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep, bool can_sleep);
+int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_sync_pending_stop(struct snd_usb_endpoint *ep);
 int  snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index 34c6d4f2c0b6..9aa5b1855481 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -218,7 +218,7 @@ int snd_usb_init_pitch(struct snd_usb_audio *chip, int 
iface,
}
 }
 
-static int start_endpoints(struct snd_usb_substream *subs, bool can_sleep)
+static int start_endpoints(struct snd_usb_substream *subs)
 {
int err;
 
@@ -231,7 +231,7 @@ static int start_endpoints(struct snd_usb_substream *subs, 
bool can_sleep)
dev_d

[PATCH v3 2/2] ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion

2017-01-04 Thread Ioan-Adrian Ratiu
Testing EP_FLAG_RUNNING in snd_complete_urb() before running the completion
logic allows us to save a few cpu cycles by returning early, skipping the
pending urb in case the stream was stopped; the stop logic handles the urb
and sets the completion callbacks to NULL.

Signed-off-by: Ioan-Adrian Ratiu 
---
 sound/usb/endpoint.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index 2f0ea70a998c..c90607ebe155 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -384,6 +384,9 @@ static void snd_complete_urb(struct urb *urb)
if (unlikely(atomic_read(&ep->chip->shutdown)))
goto exit_clear;
 
+   if (unlikely(!test_bit(EP_FLAG_RUNNING, &ep->flags)))
+   goto exit_clear;
+
if (usb_pipeout(ep->pipe)) {
retire_outbound_urb(ep, ctx);
/* can be stopped during retire callback */
-- 
2.11.0



[PATCH v3 0/2] ALSA: Fix usb-audio races

2017-01-04 Thread Ioan-Adrian Ratiu
Changes since v2:
* Fixed snd_usb_*lock_shutdown imbalance caused by an early return
in snd_usb_pcm_prepare()

Ioan-Adrian Ratiu (2):
  ALSA: usb-audio: Fix irq/process data synchronization
  ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion

 sound/usb/endpoint.c | 20 ++--
 sound/usb/endpoint.h |  2 +-
 sound/usb/pcm.c  | 10 +-
 3 files changed, 16 insertions(+), 16 deletions(-)

-- 
2.11.0



Re: [PATCH v2 1/2] ALSA: usb-audio: Fix irq/process data synchronization

2017-01-04 Thread Ioan-Adrian Ratiu
On Wed, 04 Jan 2017, Takashi Iwai  wrote:
> On Mon, 02 Jan 2017 16:50:30 +0100,
> Ioan-Adrian Ratiu wrote:
>> 
>> --- a/sound/usb/pcm.c
>> +++ b/sound/usb/pcm.c
> (snip)
>> @@ -850,7 +850,7 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream 
>> *substream)
>>  /* for playback, submit the URBs now; otherwise, the first hwptr_done
>>   * updates for all URBs would happen at the same time when starting */
>>  if (subs->direction == SNDRV_PCM_STREAM_PLAYBACK)
>> -ret = start_endpoints(subs, true);
>> +return start_endpoints(subs);
>
> Here you miss the unlock below.
>
>>  
>>   unlock:
>>  snd_usb_unlock_shutdown(subs->stream->chip);
>
> ... and this must be the reason of the hang up at disconnection, where
> the driver ways forever at wait_event() in usb_audio_disconnect().
>
> Could you fix this and resubmit v3?  Other than that, it looks OK.

Sure, I'll resubmit v3 by tonight hopefully.

Thank you for taking a look at this, I didn't have a chance to look it
again since I posted v2.

Ionel

>
>
> thanks,
>
> Takashi


Re: [PATCH v2 0/2] ALSA: Fix usb-audio races

2017-01-03 Thread Ioan-Adrian Ratiu
On Tue, 03 Jan 2017, Ioan-Adrian Ratiu  wrote:
> On Mon, 02 Jan 2017, Ioan-Adrian Ratiu  wrote:
>> Many thanks to Takashi Iwai & Sakamoto for their awesome feedback.
>>
>> Changes since v1:
>> * Rebased my fix on top of tiwai's revert and integrated the changes
>> from the original fix into this.
>> * Dropped the stop_endpoints() call inside snd_usb_pcm_prepare() and
>> kept the previously existing snd_usb_endpoint_sync_pending_stop() call.
>> * Retained the deactivate_urbs() call in snd_usb_pcm_prepare(), I only
>> removed the can_sleep logic.
>> * Split the EP_FLAG_RUNNING check in a separate commit to keep the log
>> clean since this is not part of the race fix.
>>
>> Ioan-Adrian Ratiu (2):
>>   ALSA: usb-audio: Fix irq/process data synchronization
>>   ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion
>>
>>  sound/usb/endpoint.c | 20 ++--
>>  sound/usb/endpoint.h |  2 +-
>>  sound/usb/pcm.c  | 10 +-
>>  3 files changed, 16 insertions(+), 16 deletions(-)
>>
>> -- 
>> 2.11.0
>
> @tiwai
>
> You mentioned in a previous email to the first version I introduced a bug
> by unballancing snd_usb_*lock_shutdown(), I think I managed to reproduce it
> with v2. Is this trace a symptom of the bug you were refering to or is
> this something newly introduced in v2?
>
> I can't for the life of me figure out for sure how snd_usb_*lock_shutdown()
> ballancing is affected by my patch. Any pointers how to fix this are
> greatly appreciated.

A bit more context on the trace: usb_audio_disconnect hangs while
waiting for all pending tasks to finish while the implicit feedback
endpoint sending errors at usb_submit_urb (err -19).

I could figure these out on my own but at the moment my time is very
limited and I don't want to keep you guys waiting after me so maybe
asking about it will be faster.

Thanks,
Ionel

>
> [  492.558350] INFO: task kworker/2:3:133 blocked for more than 120 seconds.
> [  492.558355]   Not tainted 4.10.0-rc2-g0c771e68bbc8 #25
> [  492.558356] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [  492.558359] kworker/2:3 D0   133  2 0x
> [  492.558368] Workqueue: usb_hub_wq hub_event
> [  492.558370] Call Trace:
> [  492.558377]  ? __schedule+0x1dd/0x7f0
> [  492.558380]  ? trace_hardirqs_on+0xd/0x10
> [  492.558384]  schedule+0x3b/0x90
> [  492.558392]  usb_audio_disconnect+0x1bf/0x220 [snd_usb_audio]
> [  492.558394]  ? wake_atomic_t_function+0x50/0x50
> [  492.558397]  usb_unbind_interface+0x7a/0x260
> [  492.558399]  device_release_driver_internal+0x158/0x210
> [  492.558400]  device_release_driver+0xd/0x10
> [  492.558401]  bus_remove_device+0x10f/0x190
> [  492.558411]  device_del+0x208/0x330
> [  492.558413]  ? usb_remove_ep_devs+0x1a/0x30
> [  492.558414]  usb_disable_device+0x99/0x270
> [  492.558416]  usb_disconnect+0x87/0x2a0
> [  492.558417]  hub_event+0x962/0x1530
> [  492.558419]  process_one_work+0x221/0x4b0
> [  492.558420]  ? process_one_work+0x1bb/0x4b0
> [  492.558421]  worker_thread+0x46/0x4f0
> [  492.558423]  kthread+0x102/0x140
> [  492.558424]  ? process_one_work+0x4b0/0x4b0
> [  492.558425]  ? kthread_create_on_node+0x40/0x40
> [  492.558427]  ret_from_fork+0x27/0x40
> [  492.558428] 
>Showing all locks held in the system:
> [  492.558433] 2 locks held by khungtaskd/34:
> [  492.558434]  #0:  (rcu_read_lock){..}, at: [] 
> watchdog+0x9f/0x4a0
> [  492.558438]  #1:  (tasklist_lock){.+.+..}, at: [] 
> debug_show_all_locks+0x3d/0x1a0
> [  492.558449] 6 locks held by kworker/2:3/133:
> [  492.558449]  #0:  ("usb_hub_wq"){.+.+.+}, at: [] 
> process_one_work+0x1bb/0x4b0
> [  492.558452]  #1:  ((&hub->events)){+.+.+.}, at: [] 
> process_one_work+0x1bb/0x4b0
> [  492.558455]  #2:  (&dev->mutex){..}, at: [] 
> hub_event+0x5c/0x1530
> [  492.558459]  #3:  (&dev->mutex){..}, at: [] 
> usb_disconnect+0x4e/0x2a0
> [  492.558461]  #4:  (&dev->mutex){..}, at: [] 
> device_release_driver_internal+0x34/0x210
> [  492.558464]  #5:  (register_mutex#3){+.+.+.}, at: [] 
> usb_audio_disconnect+0x2e/0x220 [snd_usb_audio]
> [  492.558485] 2 locks held by idn/975:
> [  492.558485]  #0:  (&tty->ldisc_sem){++}, at: [] 
> ldsem_down_read+0x2d/0x40
> [  492.558489]  #1:  (&ldata->atomic_read_lock){+.+.+.}, at: 
> [] n_tty_read+0xb0/0x890
> [  492.558494] 2 locks held by idn/1311:
> [  492.558494]  #0:  (&tty->ldisc_sem){++}, at: [] 
> ldsem_down_read+0x2d/0x40
> [  492.558497]  #1:  (&ldata->atomic_read_lock){+.+.+.}, at: 
> [] n_tty_read+0xb0/0x890
>
> [  492.558500] =


Re: [PATCH v2 0/2] ALSA: Fix usb-audio races

2017-01-02 Thread Ioan-Adrian Ratiu
On Mon, 02 Jan 2017, Ioan-Adrian Ratiu  wrote:
> Many thanks to Takashi Iwai & Sakamoto for their awesome feedback.
>
> Changes since v1:
> * Rebased my fix on top of tiwai's revert and integrated the changes
> from the original fix into this.
> * Dropped the stop_endpoints() call inside snd_usb_pcm_prepare() and
> kept the previously existing snd_usb_endpoint_sync_pending_stop() call.
> * Retained the deactivate_urbs() call in snd_usb_pcm_prepare(), I only
> removed the can_sleep logic.
> * Split the EP_FLAG_RUNNING check in a separate commit to keep the log
> clean since this is not part of the race fix.
>
> Ioan-Adrian Ratiu (2):
>   ALSA: usb-audio: Fix irq/process data synchronization
>   ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion
>
>  sound/usb/endpoint.c | 20 ++--
>  sound/usb/endpoint.h |  2 +-
>  sound/usb/pcm.c  | 10 +-
>  3 files changed, 16 insertions(+), 16 deletions(-)
>
> -- 
> 2.11.0

@tiwai

You mentioned in a previous email to the first version I introduced a bug
by unballancing snd_usb_*lock_shutdown(), I think I managed to reproduce it
with v2. Is this trace a symptom of the bug you were refering to or is
this something newly introduced in v2?

I can't for the life of me figure out for sure how snd_usb_*lock_shutdown()
ballancing is affected by my patch. Any pointers how to fix this are
greatly appreciated.

[  492.558350] INFO: task kworker/2:3:133 blocked for more than 120 seconds.
[  492.558355]   Not tainted 4.10.0-rc2-g0c771e68bbc8 #25
[  492.558356] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[  492.558359] kworker/2:3 D0   133  2 0x
[  492.558368] Workqueue: usb_hub_wq hub_event
[  492.558370] Call Trace:
[  492.558377]  ? __schedule+0x1dd/0x7f0
[  492.558380]  ? trace_hardirqs_on+0xd/0x10
[  492.558384]  schedule+0x3b/0x90
[  492.558392]  usb_audio_disconnect+0x1bf/0x220 [snd_usb_audio]
[  492.558394]  ? wake_atomic_t_function+0x50/0x50
[  492.558397]  usb_unbind_interface+0x7a/0x260
[  492.558399]  device_release_driver_internal+0x158/0x210
[  492.558400]  device_release_driver+0xd/0x10
[  492.558401]  bus_remove_device+0x10f/0x190
[  492.558411]  device_del+0x208/0x330
[  492.558413]  ? usb_remove_ep_devs+0x1a/0x30
[  492.558414]  usb_disable_device+0x99/0x270
[  492.558416]  usb_disconnect+0x87/0x2a0
[  492.558417]  hub_event+0x962/0x1530
[  492.558419]  process_one_work+0x221/0x4b0
[  492.558420]  ? process_one_work+0x1bb/0x4b0
[  492.558421]  worker_thread+0x46/0x4f0
[  492.558423]  kthread+0x102/0x140
[  492.558424]  ? process_one_work+0x4b0/0x4b0
[  492.558425]  ? kthread_create_on_node+0x40/0x40
[  492.558427]  ret_from_fork+0x27/0x40
[  492.558428] 
   Showing all locks held in the system:
[  492.558433] 2 locks held by khungtaskd/34:
[  492.558434]  #0:  (rcu_read_lock){..}, at: [] 
watchdog+0x9f/0x4a0
[  492.558438]  #1:  (tasklist_lock){.+.+..}, at: [] 
debug_show_all_locks+0x3d/0x1a0
[  492.558449] 6 locks held by kworker/2:3/133:
[  492.558449]  #0:  ("usb_hub_wq"){.+.+.+}, at: [] 
process_one_work+0x1bb/0x4b0
[  492.558452]  #1:  ((&hub->events)){+.+.+.}, at: [] 
process_one_work+0x1bb/0x4b0
[  492.558455]  #2:  (&dev->mutex){..}, at: [] 
hub_event+0x5c/0x1530
[  492.558459]  #3:  (&dev->mutex){..}, at: [] 
usb_disconnect+0x4e/0x2a0
[  492.558461]  #4:  (&dev->mutex){..}, at: [] 
device_release_driver_internal+0x34/0x210
[  492.558464]  #5:  (register_mutex#3){+.+.+.}, at: [] 
usb_audio_disconnect+0x2e/0x220 [snd_usb_audio]
[  492.558485] 2 locks held by idn/975:
[  492.558485]  #0:  (&tty->ldisc_sem){++}, at: [] 
ldsem_down_read+0x2d/0x40
[  492.558489]  #1:  (&ldata->atomic_read_lock){+.+.+.}, at: 
[] n_tty_read+0xb0/0x890
[  492.558494] 2 locks held by idn/1311:
[  492.558494]  #0:  (&tty->ldisc_sem){++}, at: [] 
ldsem_down_read+0x2d/0x40
[  492.558497]  #1:  (&ldata->atomic_read_lock){+.+.+.}, at: 
[] n_tty_read+0xb0/0x890

[  492.558500] =


[PATCH v2 0/2] ALSA: Fix usb-audio races

2017-01-02 Thread Ioan-Adrian Ratiu
Many thanks to Takashi Iwai & Sakamoto for their awesome feedback.

Changes since v1:
* Rebased my fix on top of tiwai's revert and integrated the changes
from the original fix into this.
* Dropped the stop_endpoints() call inside snd_usb_pcm_prepare() and
kept the previously existing snd_usb_endpoint_sync_pending_stop() call.
* Retained the deactivate_urbs() call in snd_usb_pcm_prepare(), I only
removed the can_sleep logic.
* Split the EP_FLAG_RUNNING check in a separate commit to keep the log
clean since this is not part of the race fix.

Ioan-Adrian Ratiu (2):
  ALSA: usb-audio: Fix irq/process data synchronization
  ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion

 sound/usb/endpoint.c | 20 ++--
 sound/usb/endpoint.h |  2 +-
 sound/usb/pcm.c  | 10 +-
 3 files changed, 16 insertions(+), 16 deletions(-)

-- 
2.11.0



[PATCH v2 2/2] ALSA: usb-audio: test EP_FLAG_RUNNING at urb completion

2017-01-02 Thread Ioan-Adrian Ratiu
Testing EP_FLAG_RUNNING in snd_complete_urb() before running the completion
logic allows us to save a few cpu cycles by returning early, skipping the
pending urb in case the stream was stopped; the stop logic handles the urb
and sets the completion callbacks to NULL.

Signed-off-by: Ioan-Adrian Ratiu 
---
 sound/usb/endpoint.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index 2f0ea70a998c..c90607ebe155 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -384,6 +384,9 @@ static void snd_complete_urb(struct urb *urb)
if (unlikely(atomic_read(&ep->chip->shutdown)))
goto exit_clear;
 
+   if (unlikely(!test_bit(EP_FLAG_RUNNING, &ep->flags)))
+   goto exit_clear;
+
if (usb_pipeout(ep->pipe)) {
retire_outbound_urb(ep, ctx);
/* can be stopped during retire callback */
-- 
2.11.0



[PATCH v2 1/2] ALSA: usb-audio: Fix irq/process data synchronization

2017-01-02 Thread Ioan-Adrian Ratiu
Commit 16200948d83 ("ALSA: usb-audio: Fix race at stopping the stream") was
incomplete causing another more severe kernel panic, so it got reverted.
This fixes both the original problem and its fallout kernel race/crash.

The original fix is to move the endpoint member NULL clearing logic inside
wait_clear_urbs() so the irq triggering the urb completion doesn't call
retire_capture/playback_urb() after the NULL clearing and generate a panic.

However this creates a new race between snd_usb_endpoint_start()'s call
to wait_clear_urbs() and the irq urb completion handler which again calls
retire_capture/playback_urb() leading to a new NULL dereference.

We keep the EP deactivation code in snd_usb_endpoint_start() because
removing it will break the EP reference counting (see [1] [2] for info),
however we don't need the "can_sleep" mechanism anymore because a new
function was introduced (snd_usb_endpoint_sync_pending_stop()) which
synchronizes pending stops and gets called inside the pcm prepare callback.

It also makes sense to remove can_sleep because it was also removed from
deactivate_urbs() signature in [3] so we benefit from more simplification.

[1] commit 015618b90 ("ALSA: snd-usb: Fix URB cancellation at stream start")
[2] commit e9ba389c5 ("ALSA: usb-audio: Fix scheduling-while-atomic bug in PCM 
capture stream")
[3] commit ccc1696d5 ("ALSA: usb-audio: simplify endpoint deactivation code")

Fixes: f8114f8583bb ("Revert "ALSA: usb-audio: Fix race at stopping the 
stream"")

Signed-off-by: Ioan-Adrian Ratiu 
---
 sound/usb/endpoint.c | 17 +++--
 sound/usb/endpoint.h |  2 +-
 sound/usb/pcm.c  | 10 +-
 3 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index 15d1d5c63c3c..2f0ea70a998c 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -534,6 +534,11 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
alive, ep->ep_num);
clear_bit(EP_FLAG_STOPPING, &ep->flags);
 
+   ep->data_subs = NULL;
+   ep->sync_slave = NULL;
+   ep->retire_data_urb = NULL;
+   ep->prepare_data_urb = NULL;
+
return 0;
 }
 
@@ -912,9 +917,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
 /**
  * snd_usb_endpoint_start: start an snd_usb_endpoint
  *
- * @ep:the endpoint to start
- * @can_sleep: flag indicating whether the operation is executed in
- * non-atomic context
+ * @ep: the endpoint to start
  *
  * A call to this function will increment the use count of the endpoint.
  * In case it is not already running, the URBs for this endpoint will be
@@ -924,7 +927,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
  *
  * Returns an error if the URB submission failed, 0 in all other cases.
  */
-int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, bool can_sleep)
+int snd_usb_endpoint_start(struct snd_usb_endpoint *ep)
 {
int err;
unsigned int i;
@@ -938,8 +941,6 @@ int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, 
bool can_sleep)
 
/* just to be sure */
deactivate_urbs(ep, false);
-   if (can_sleep)
-   wait_clear_urbs(ep);
 
ep->active_mask = 0;
ep->unlink_mask = 0;
@@ -1020,10 +1021,6 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep)
 
if (--ep->use_count == 0) {
deactivate_urbs(ep, false);
-   ep->data_subs = NULL;
-   ep->sync_slave = NULL;
-   ep->retire_data_urb = NULL;
-   ep->prepare_data_urb = NULL;
set_bit(EP_FLAG_STOPPING, &ep->flags);
}
 }
diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
index 6428392d8f62..584f295d7c77 100644
--- a/sound/usb/endpoint.h
+++ b/sound/usb/endpoint.h
@@ -18,7 +18,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
struct audioformat *fmt,
struct snd_usb_endpoint *sync_ep);
 
-int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep, bool can_sleep);
+int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_sync_pending_stop(struct snd_usb_endpoint *ep);
 int  snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index 34c6d4f2c0b6..657d54dffd2d 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -218,7 +218,7 @@ int snd_usb_init_pitch(struct snd_usb_audio *chip, int 
iface,
}
 }
 
-static int start_endpoints(struct snd_usb_substream *subs, bool can_sleep)
+static int start_endpoints(struct snd_usb_substream *subs)
 {
int err;
 
@@ -231,7 +231,7 @@ static int start_endpoints(struct snd_usb_substream *subs, 
bool can_sleep)
dev_d

Re: [PATCH] ALSA: snd-usb: fix IRQ triggered NULL pointer dereference

2016-12-30 Thread Ioan-Adrian Ratiu
On Fri, 30 Dec 2016, Takashi Iwai  wrote:
> On Wed, 21 Dec 2016 11:38:54 +0100,
> Ioan-Adrian Ratiu wrote:
>> 
>> >> > Please take the time to fully analyze my patch and let's have a
>> >> > discussion based on it, not reject it outright and replace it with
>> >> > a quick and dirty delay hack.
>> >> 
>> >> OK. I'll deliberately check it again so that I have no overlooks. I
>> >> with this thread also catch the other developers enough helpful to
>> >> you. (I just eventually caught your patch in LKML and not developer
>> >> for this category of devices.)
>> >
>> > Sorry for the late reply, as I've been (still) off and had bad net
>> > connections.
>> >
>> > About your fix: Sakamoto-san is right, it's no good way to fix this
>> > kind or problem.  The easiest option right now is just to revert my
>> > previous fix, as it obviously introduces another regression.  The
>> > correct fix will be taken after that.
>> >
>> > I'm going to prepare a revert patch and ask Linus to take it before
>> > rc1.
>> 
>> I agree with reverting the initial commit decision because my problem
>> disappears with it.
>> 
>> But can you please state a reason for why my patch is "no good way to
>> fix"? Being too intrusive is not a good reason.
>
> "Being too intrusive" is the exact reason why it's not good as a
> "regression fix" like this case.  The logic you've implemented in the
> patch itself looks good (although the code introduces a bug, the
> unbalance of snd_usb_*lock_shutdown()).  The only point I couldn't
> take it is that it's rather a fundamental change, not a quick fix for
> a regression.
>
> What's the worst case scenario in a regression fix?  It's when a fix
> introduces yet another regression.  It'd be much worse if the new
> regression is deeper.  The complicated logical change has a potential
> risk of such.  Thus an intrusive change is not always good as a
> "regression fix".
>
> In general, if the change were trivial, it's obviously OK to take as a
> regression fix -- where the logic is also trivial in most cases, too.
> But when the fix becomes complex, we need to rethink.  Especially when
> the original buggy commit is small, reverting it is often better as a
> clear cut.
>
> Think in that way: you're addressing a deeper problem that was
> revealed accidentally by my previous bad fix.  Writing the change as
> if it were merely a regression fix gives the wrong understanding to
> readers :)

Yes, this makes sense. Thank you.

>
> That said, I'd appreciate if you respin the fix again with a
> combination of my previous fix and brush up the code a bit as a whole,
> and document it not as a regression fix but as a complete fix of the
> existing races.  I can write it in my side quickly, but it'd be almost
> in the same form as you posted, I suppose.

I'll find some time to do a rebase either tomorrow (31 Dec) or until
2 Jan at the latest.

Ionel

>
>
> thanks,
>
> Takashi


Re: [PATCH] ALSA: snd-usb: fix IRQ triggered NULL pointer dereference

2016-12-21 Thread Ioan-Adrian Ratiu
Hi

On Wed, 21 Dec 2016, Takashi Iwai  wrote:
> On Tue, 20 Dec 2016 14:04:56 +0100,
> Takashi Sakamoto wrote:
>> 
>> Hi,
>> 
>> On Dec 20 2016 20:47, Ioan-Adrian Ratiu wrote:
>> > On Tue, 20 Dec 2016, Takashi Sakamoto  wrote:
>> >> ---
>> >>
>> >> Hi,
>> >>
>> >>> Commit 16200948d835 ("ALSA: usb-audio: Fix race at stopping the stream")
>> >>> fixes a race-codition but it also introduces another really nasty data 
>> >>> race
>> >>> regression which makes my usb sound card [1] completely useless, throwing
>> >>> the kernel into a panic if anything from userspace tries to start 
>> >>> playback.
>> >>>
>> >>> The problem is this: ep->data_subs is now set to NULL every time inside
>> >>> wait_clear_urbs(). ep->data_subs is initalized only in one place in
>> >>> start_endpoints(), then it is immediately wiped by the pre-existing call 
>> >>> to
>> >>> wait_clear_urbs() inside snd_usb_endpoint_start().
>> >>>
>> >>> To ilustrate, this is what happens in the non-irq part of the code:
>> >>>
>> >>> Step 1 (inside start_endpoints): ep->data_subs = subs;
>> >>> Step 2 (inside start_endpoints): snd_usb_endpoint_start(ep, can_sleep);
>> >>> Step 3 (inside snd_usb_endpoint_start): wait_clear_urbs(ep);
>> >>> Step 4 (inside wait_clear_urbs): ep->data_subs = NULL;
>> >>>
>> >>> Here's a simplified call stack for the IRQ part (full one at the end):
>> >>>
>> >>> (NULL dereference, param "subs" is passed NULL value of ep->data_subs)
>> >>> retire_playback_urb
>> >>> retire_outbound_urb
>> >>> snd_complete_urb
>> >>> (...)
>> >>> xhci_irq
>> >>>
>> >>> Looking at the git log there seems to be quite a lot of history in this
>> >>> part of the codebase, dating back to 2012 or earlier. My fix is based on
>> >>> 015618b90 ("ALSA: snd-usb: Fix URB cancellation at stream start") and
>> >>> e9ba389c5 ("ALSA: usb-audio: Fix scheduling-while-atomic bug in PCM 
>> >>> capture
>> >>> stream") with a few modifications of my own. My idea is to do the
>> >>> ep->data_subs wiping before endpoint initialization in a manner similar
>> >>> to these older commits by using stop_endpoints() in snd_usb_pcm_prepare()
>> >>> and at the same time keep the ep->data_subs = NULL in wait_clear_urbs() 
>> >>> to
>> >>> not trigger the recently fixed stream stopping race again.
>> >>>
>> >>> Full call stack:
>> >>>
>> >>> [  131.093240] BUG: unable to handle kernel NULL pointer dereference at 
>> >>> 0010
>> >>> [  131.094313] IP: retire_playback_urb+0x1b/0x160 [snd_usb_audio]
>> >>> [  131.095046] PGD 0
>> >>> [  131.095047]
>> >>> [  131.096509] Oops:  [#1] PREEMPT SMP
>> >>> [  131.097255] Modules linked in: fuse snd_usb_audio snd_usbmidi_lib 
>> >>> snd_rawmidi snd_seq_device ctr ccm arc4 ath9k intel_rapl ath9k_common 
>> >>> x86_pkg_temp_thermal ath9k_hw intel_powerclamp coretemp joydev mousedev 
>> >>> ath kvm_intel mac80211 kvm
>> >>>  input_leds hid_generic psmouse irqbypass usbhid hid crct10dif_pclmul 
>> >>> crc32_pclmul serio_raw crc32c_intel atkbd ghash_clmulni_intel 
>> >>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic 
>> >>> snd_hda_intel pcbc aesni_intel libps2 cfg8
>> >>> 0211 aes_x86_64 crypto_simd dell_laptop glue_helper r8169 dell_smbios 
>> >>> snd_hda_codec sr_mod cryptd led_class mii rfkill snd_hwdep cdrom 
>> >>> snd_hda_core ac dcdbas i8042 xhci_pci serio xhci_hcd snd_pcm tpm_tis 
>> >>> battery tpm_tis_core lpc_ich tpm
>> >>> snd_timer evdev shpchp i2c_i801 sch_fq_codel
>> >>> [  131.105551] CPU: 2 PID: 165 Comm: irq/29-xhci_hcd Not tainted 
>> >>> 4.9.0-gd824cdc58ba0 #10
>> >>> [  131.107516] Hardware name: Dell Inc. Inspiron 3521/018DYG, BIOS A14 
>> >>> 07/31/2015
>> >>> [  131.109592] task: 880154a7 task.stack: c9f48000
>> >>> [  131.111746] RIP: 0010:retire_playback_urb+0x1b/0x160 [snd_usb_audio]
>&g

Re: [PATCH] ALSA: snd-usb: fix IRQ triggered NULL pointer dereference

2016-12-20 Thread Ioan-Adrian Ratiu
xhci_hcd]
>> [  131.148736]  xhci_irq+0x13a2/0x1ca0 [xhci_hcd]
>> [  131.150972]  ? trace_hardirqs_on+0xd/0x10
>> [  131.153209]  ? _raw_spin_unlock_irq+0x2c/0x50
>> [  131.155390]  ? irq_thread+0xb5/0x1d0
>> [  131.157771]  xhci_msi_irq+0x11/0x20 [xhci_hcd]
>> [  131.159938]  irq_forced_thread_fn+0x2f/0x70
>> [  131.162072]  ? irq_thread+0xb5/0x1d0
>> [  131.164141]  irq_thread+0x149/0x1d0
>> [  131.166132]  ? irq_finalize_oneshot.part.2+0xe0/0xe0
>> [  131.168142]  ? wake_threads_waitq+0x30/0x30
>> [  131.170149]  kthread+0x10f/0x150
>> [  131.172144]  ? irq_thread_dtor+0xc0/0xc0
>> [  131.174139]  ? kthread_create_on_node+0x40/0x40
>> [  131.176116]  ret_from_fork+0x2a/0x40
>> [  131.178074] Code: 8b 77 64 4c 89 e7 e8 e5 fe ff ff eb c3 0f 1f 00 0f 1f 
>> 44 00 00 55 31 d2 48 89 e5 41 57 41 56 41 55 41 54 53 48 89 fb 48 83 ec 08 
>> <48> 8b 47 10 48 8b 4f 70 48 c7 c7 88 7b 4d a0 4c 8b a0 78 01 00
>> [  131.182382] RIP: retire_playback_urb+0x1b/0x160 [snd_usb_audio] RSP: 
>> c9f4bc10
>> [  131.184562] CR2: 0010
>> 
>> [1] 041e:3232 Creative Technology SoundBlaster X-FI HD
>> [2] 
>> http://mailman.alsa-project.org/pipermail/alsa-devel/2016-December/115425.html
>
> I can regenerate this bug with my EMU 0404 USB. My understanding of this bug 
> is
> quite similar to your perspective. This bug is quite critical. In my
> understanding, we encounter this bug in all of cases in which snd-usb-audio is
> applied for PCM playback substream.
>
> Below workaround can also suppress this bug.
>
> diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> index a2cdf33..9b34f76 100644
> --- a/sound/usb/endpoint.c
> +++ b/sound/usb/endpoint.c
> @@ -537,11 +537,14 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
> alive, ep->ep_num);
> clear_bit(EP_FLAG_STOPPING, &ep->flags);
>  
> -   ep->data_subs = NULL;
> ep->sync_slave = NULL;
> ep->retire_data_urb = NULL;
> ep->prepare_data_urb = NULL;
>  
> +   msleep(200);
> +
> +   ep->data_subs = NULL;
> +
> return 0;
>  }
>
> If initialization of 'struct snd_usb_endpoint.data_subs' can be done after all
> of queued URBs are processed and corresponding complete interrupts are cought,
> we can solve this critical bug.
>
>> Signed-off-by: Ioan-Adrian Ratiu 
>> ---
>>  sound/usb/endpoint.c | 11 ++-
>>  sound/usb/endpoint.h |  2 +-
>>  sound/usb/pcm.c  | 13 ++---
>>  3 files changed, 9 insertions(+), 17 deletions(-)
>> 
>> diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
>> index a2cdf3370afe..4465f324c2c2 100644
>> --- a/sound/usb/endpoint.c
>> +++ b/sound/usb/endpoint.c
>> @@ -920,9 +920,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint 
>> *ep,
>>  /**
>>   * snd_usb_endpoint_start: start an snd_usb_endpoint
>>   *
>> - * @ep: the endpoint to start
>> - * @can_sleep:  flag indicating whether the operation is executed in
>> - *  non-atomic context
>> + * @ep: the endpoint to start
>>   *
>>   * A call to this function will increment the use count of the endpoint.
>>   * In case it is not already running, the URBs for this endpoint will be
>> @@ -932,7 +930,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint 
>> *ep,
>>   *
>>   * Returns an error if the URB submission failed, 0 in all other cases.
>>   */
>> -int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, bool can_sleep)
>> +int snd_usb_endpoint_start(struct snd_usb_endpoint *ep)
>>  {
>>  int err;
>>  unsigned int i;
>> @@ -944,11 +942,6 @@ int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, 
>> bool can_sleep)
>>  if (++ep->use_count != 1)
>>  return 0;
>>  
>> -/* just to be sure */
>> -deactivate_urbs(ep, false);
>> -if (can_sleep)
>> -wait_clear_urbs(ep);
>> -
>>  ep->active_mask = 0;
>>  ep->unlink_mask = 0;
>>  ep->phase = 0;
>> diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
>> index 6428392d8f62..584f295d7c77 100644
>> --- a/sound/usb/endpoint.h
>> +++ b/sound/usb/endpoint.h
>> @@ -18,7 +18,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint 
>> *ep,
>>  struct audioformat *fmt,
>>  struct snd_usb_endpoint *sync_ep);
>>  
>> -int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep, bool ca

[PATCH] ALSA: snd-usb: fix IRQ triggered NULL pointer dereference

2016-12-17 Thread Ioan-Adrian Ratiu
 a0 4c 8b a0 78 01 00
[  131.182382] RIP: retire_playback_urb+0x1b/0x160 [snd_usb_audio] RSP: 
c9f4bc10
[  131.184562] CR2: 0010

[1] 041e:3232 Creative Technology SoundBlaster X-FI HD
[2] 
http://mailman.alsa-project.org/pipermail/alsa-devel/2016-December/115425.html

Signed-off-by: Ioan-Adrian Ratiu 
---
 sound/usb/endpoint.c | 11 ++-
 sound/usb/endpoint.h |  2 +-
 sound/usb/pcm.c  | 13 ++---
 3 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index a2cdf3370afe..4465f324c2c2 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -920,9 +920,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
 /**
  * snd_usb_endpoint_start: start an snd_usb_endpoint
  *
- * @ep:the endpoint to start
- * @can_sleep: flag indicating whether the operation is executed in
- * non-atomic context
+ * @ep: the endpoint to start
  *
  * A call to this function will increment the use count of the endpoint.
  * In case it is not already running, the URBs for this endpoint will be
@@ -932,7 +930,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
  *
  * Returns an error if the URB submission failed, 0 in all other cases.
  */
-int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, bool can_sleep)
+int snd_usb_endpoint_start(struct snd_usb_endpoint *ep)
 {
int err;
unsigned int i;
@@ -944,11 +942,6 @@ int snd_usb_endpoint_start(struct snd_usb_endpoint *ep, 
bool can_sleep)
if (++ep->use_count != 1)
return 0;
 
-   /* just to be sure */
-   deactivate_urbs(ep, false);
-   if (can_sleep)
-   wait_clear_urbs(ep);
-
ep->active_mask = 0;
ep->unlink_mask = 0;
ep->phase = 0;
diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
index 6428392d8f62..584f295d7c77 100644
--- a/sound/usb/endpoint.h
+++ b/sound/usb/endpoint.h
@@ -18,7 +18,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
struct audioformat *fmt,
struct snd_usb_endpoint *sync_ep);
 
-int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep, bool can_sleep);
+int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep);
 void snd_usb_endpoint_sync_pending_stop(struct snd_usb_endpoint *ep);
 int  snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index 34c6d4f2c0b6..db26f767f851 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -218,7 +218,7 @@ int snd_usb_init_pitch(struct snd_usb_audio *chip, int 
iface,
}
 }
 
-static int start_endpoints(struct snd_usb_substream *subs, bool can_sleep)
+static int start_endpoints(struct snd_usb_substream *subs)
 {
int err;
 
@@ -231,7 +231,7 @@ static int start_endpoints(struct snd_usb_substream *subs, 
bool can_sleep)
dev_dbg(&subs->dev->dev, "Starting data EP @%p\n", ep);
 
ep->data_subs = subs;
-   err = snd_usb_endpoint_start(ep, can_sleep);
+   err = snd_usb_endpoint_start(ep);
if (err < 0) {
clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags);
return err;
@@ -260,7 +260,7 @@ static int start_endpoints(struct snd_usb_substream *subs, 
bool can_sleep)
dev_dbg(&subs->dev->dev, "Starting sync EP @%p\n", ep);
 
ep->sync_slave = subs->data_endpoint;
-   err = snd_usb_endpoint_start(ep, can_sleep);
+   err = snd_usb_endpoint_start(ep);
if (err < 0) {
clear_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags);
return err;
@@ -809,8 +809,7 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream 
*substream)
goto unlock;
}
 
-   snd_usb_endpoint_sync_pending_stop(subs->sync_endpoint);
-   snd_usb_endpoint_sync_pending_stop(subs->data_endpoint);
+   stop_endpoints(subs, true);
 
ret = set_format(subs, subs->cur_audiofmt);
if (ret < 0)
@@ -850,7 +849,7 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream 
*substream)
/* for playback, submit the URBs now; otherwise, the first hwptr_done
 * updates for all URBs would happen at the same time when starting */
if (subs->direction == SNDRV_PCM_STREAM_PLAYBACK)
-   ret = start_endpoints(subs, true);
+   return start_endpoints(subs);
 
  unlock:
snd_usb_unlock_shutdown(subs->stream->chip);
@@ -1666,7 +1665,7 @@ static int snd_usb_substream_capture_trigger(struct 
snd_pcm_substream *substream
 
switch (cmd) {
case SNDRV_PCM_TRIGGER_START:
-   err = start_endpoints(subs, false);
+   err = start_endpoints(subs);
if (err < 0)
return err;
 
-- 
2.11.0



Re: [PATCH v2 1/2] Revert "HID: dragonrise: fix HID Descriptor for 0x0006 PID"

2016-10-05 Thread Ioan-Adrian Ratiu
On Tue, 27 Sep 2016, Ioan-Adrian Ratiu  wrote:
> This reverts commit 18339f59c3a6 ("HID: dragonrise: fix HID...") because it
> breaks certain dragonrise 0079:0006 gamepads. While it may fix a breakage
> caused by commit 79346d620e9d ("HID: input: force generic axis to be mapped
> to their user space axis"), it is probable that the manufacturer released
> different hardware with the same PID so this fix works for only a subset
> and breaks the other gamepads sharing the PID.
>
> What is needed is another more generic solution which fixes 79346d620e9d
> ("HID: input: force generic axis ...") breakage for this controller: we
> need to add an exception for this driver to make it keep the old behaviour
> previous to the initial breakage (this is done in patch 2 of this series).
>
> Signed-off-by: Ioan-Adrian Ratiu 

Ping?

Vladislav confirmed that this patch series also fixes the 0079:0011
DragonRise gamepad, so everything is good. Any info about merging?

Thank you,
Ionel

> ---
>  drivers/hid/hid-dr.c | 58 
> 
>  1 file changed, 58 deletions(-)
>
> diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
> index 8fd4bf7..2523f8a 100644
> --- a/drivers/hid/hid-dr.c
> +++ b/drivers/hid/hid-dr.c
> @@ -234,58 +234,6 @@ static __u8 pid0011_rdesc_fixed[] = {
>   0xC0/*  End Collection  */
>  };
>  
> -static __u8 pid0006_rdesc_fixed[] = {
> - 0x05, 0x01,/* Usage Page (Generic Desktop)  */
> - 0x09, 0x04,/* Usage (Joystick)  */
> - 0xA1, 0x01,/* Collection (Application)  */
> - 0xA1, 0x02,/*   Collection (Logical)*/
> - 0x75, 0x08,/* Report Size (8)   */
> - 0x95, 0x05,/* Report Count (5)  */
> - 0x15, 0x00,/* Logical Minimum (0)   */
> - 0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
> - 0x35, 0x00,/* Physical Minimum (0)  */
> - 0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
> - 0x09, 0x30,/* Usage (X) */
> - 0x09, 0x33,/* Usage (Ry)*/
> - 0x09, 0x32,/* Usage (Z) */
> - 0x09, 0x31,/* Usage (Y) */
> - 0x09, 0x34,/* Usage (Ry)*/
> - 0x81, 0x02,/* Input (Variable)  */
> - 0x75, 0x04,/* Report Size (4)   */
> - 0x95, 0x01,/* Report Count (1)  */
> - 0x25, 0x07,/* Logical Maximum (7)   */
> - 0x46, 0x3B, 0x01,  /* Physical Maximum (315)*/
> - 0x65, 0x14,/* Unit (Centimeter) */
> - 0x09, 0x39,/* Usage (Hat switch)*/
> - 0x81, 0x42,/* Input (Variable)  */
> - 0x65, 0x00,/* Unit (None)   */
> - 0x75, 0x01,/* Report Size (1)   */
> - 0x95, 0x0C,/* Report Count (12) */
> - 0x25, 0x01,/* Logical Maximum (1)   */
> - 0x45, 0x01,/* Physical Maximum (1)  */
> - 0x05, 0x09,/* Usage Page (Button)   */
> - 0x19, 0x01,/* Usage Minimum (0x01)  */
> - 0x29, 0x0C,/* Usage Maximum (0x0C)  */
> - 0x81, 0x02,/* Input (Variable)  */
> - 0x06, 0x00, 0xFF,  /* Usage Page (Vendor Defined)   */
> - 0x75, 0x01,/* Report Size (1)   */
> - 0x95, 0x08,/* Report Count (8)  */
> - 0x25, 0x01,/* Logical Maximum (1)   */
> - 0x45, 0x01,/* Physical Maximum (1)  */
> - 0x09, 0x01,/* Usage (0x01)  */
> - 0x81, 0x02,/* Input (Variable)  */
> - 0xC0,  /*   End Collection  */
> - 0xA1, 0x02,/*   Collection (Logical)*/
> - 0x75, 0x08,/* Report Size (8)   */
> - 0x95, 0x07,/* Report Count (7)  */
> - 0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
> - 0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
> - 0x09, 0x02,/* Usage (0x02)  */
> - 0x91, 0x02,/* Output (Variable) */
> - 0xC0,  /*   End Collection  */
> - 0xC0   /* End Collection*/
> -};
>

Re: [PATCH 2/2] hid: input: add HID_QUIRK_REUSE_AXES and fix dragonrise

2016-09-28 Thread Ioan-Adrian Ratiu
On Wed, 28 Sep 2016, Vladislav Naumov  wrote:
> On Tue, Sep 27, 2016 at 10:44 PM, Ioan-Adrian Ratiu  wrote:
>> Can you please wait a little until I post v2 later today and test v2
>> directly? Because the change in it's current form has no effect on
>> 0079:0011 (the current quirk is enabled only for 0006).
>
> Sure thing!
> Just drop me a line when it's ready.

Hi

I posted v2 last night but instead of adding you to CC now I see I've
added Nikolai by mistake.

You can use this branch [1] from github if you want to build the patches
applied directly to linus' master (they are HEAD and HEAD~1 commits).

[1] https://github.com/10ne1/linux/commits/dev/aratiu/fix-dragonrise-gamepad

Cheers,
Adrian


[PATCH v2 2/2] hid: hid-dr: add input mapping for axis selection

2016-09-27 Thread Ioan-Adrian Ratiu
Commit 79346d620e9d ("HID: input: force generic axis to be mapped to their
user space axis") made mapping generic axes to their userspace equivalents
mandatory and some lower end gamepads which were depending on the previous
behaviour suffered severe regressions because they were reusing axes and
expecting hid-input to multiplex their map to the respective userspace axis
by always searching for and using the next available axis.

One solution is to add a hid quirk for this type of "previous" behaviour in
hid-input to bypass the new axes policy in favour of the old one, but since
only one hardware vendor seems to be affected negatively we're better off
making and exception and mapping in the driver for now; if more vendors or
drivers turn out to experience the problem we should reconsider the quirk
solution.

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/hid-dr.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
index 2523f8a..818ea7d9 100644
--- a/drivers/hid/hid-dr.c
+++ b/drivers/hid/hid-dr.c
@@ -248,6 +248,30 @@ static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 
*rdesc,
return rdesc;
 }
 
+#define map_abs(c)  hid_map_usage(hi, usage, bit, max, EV_ABS, (c))
+#define map_rel(c)  hid_map_usage(hi, usage, bit, max, EV_REL, (c))
+
+static int dr_input_mapping(struct hid_device *hdev, struct hid_input *hi,
+   struct hid_field *field, struct hid_usage *usage,
+   unsigned long **bit, int *max)
+{
+   switch (usage->hid) {
+   /*
+* revert to the old hid-input behavior where axes
+* can be randomly assigned when hid->usage is reused.
+*/
+   case HID_GD_X: case HID_GD_Y: case HID_GD_Z:
+   case HID_GD_RX: case HID_GD_RY: case HID_GD_RZ:
+   if (field->flags & HID_MAIN_ITEM_RELATIVE)
+   map_rel(usage->hid & 0xf);
+   else
+   map_abs(usage->hid & 0xf);
+   return 1;
+   }
+
+   return 0;
+}
+
 static int dr_probe(struct hid_device *hdev, const struct hid_device_id *id)
 {
int ret;
@@ -294,6 +318,7 @@ static struct hid_driver dr_driver = {
.id_table = dr_devices,
.report_fixup = dr_report_fixup,
.probe = dr_probe,
+   .input_mapping = dr_input_mapping,
 };
 module_hid_driver(dr_driver);
 
-- 
2.10.0



[PATCH v2 1/2] Revert "HID: dragonrise: fix HID Descriptor for 0x0006 PID"

2016-09-27 Thread Ioan-Adrian Ratiu
This reverts commit 18339f59c3a6 ("HID: dragonrise: fix HID...") because it
breaks certain dragonrise 0079:0006 gamepads. While it may fix a breakage
caused by commit 79346d620e9d ("HID: input: force generic axis to be mapped
to their user space axis"), it is probable that the manufacturer released
different hardware with the same PID so this fix works for only a subset
and breaks the other gamepads sharing the PID.

What is needed is another more generic solution which fixes 79346d620e9d
("HID: input: force generic axis ...") breakage for this controller: we
need to add an exception for this driver to make it keep the old behaviour
previous to the initial breakage (this is done in patch 2 of this series).

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/hid-dr.c | 58 
 1 file changed, 58 deletions(-)

diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
index 8fd4bf7..2523f8a 100644
--- a/drivers/hid/hid-dr.c
+++ b/drivers/hid/hid-dr.c
@@ -234,58 +234,6 @@ static __u8 pid0011_rdesc_fixed[] = {
0xC0/*  End Collection  */
 };
 
-static __u8 pid0006_rdesc_fixed[] = {
-   0x05, 0x01,/* Usage Page (Generic Desktop)  */
-   0x09, 0x04,/* Usage (Joystick)  */
-   0xA1, 0x01,/* Collection (Application)  */
-   0xA1, 0x02,/*   Collection (Logical)*/
-   0x75, 0x08,/* Report Size (8)   */
-   0x95, 0x05,/* Report Count (5)  */
-   0x15, 0x00,/* Logical Minimum (0)   */
-   0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
-   0x35, 0x00,/* Physical Minimum (0)  */
-   0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
-   0x09, 0x30,/* Usage (X) */
-   0x09, 0x33,/* Usage (Ry)*/
-   0x09, 0x32,/* Usage (Z) */
-   0x09, 0x31,/* Usage (Y) */
-   0x09, 0x34,/* Usage (Ry)*/
-   0x81, 0x02,/* Input (Variable)  */
-   0x75, 0x04,/* Report Size (4)   */
-   0x95, 0x01,/* Report Count (1)  */
-   0x25, 0x07,/* Logical Maximum (7)   */
-   0x46, 0x3B, 0x01,  /* Physical Maximum (315)*/
-   0x65, 0x14,/* Unit (Centimeter) */
-   0x09, 0x39,/* Usage (Hat switch)*/
-   0x81, 0x42,/* Input (Variable)  */
-   0x65, 0x00,/* Unit (None)   */
-   0x75, 0x01,/* Report Size (1)   */
-   0x95, 0x0C,/* Report Count (12) */
-   0x25, 0x01,/* Logical Maximum (1)   */
-   0x45, 0x01,/* Physical Maximum (1)  */
-   0x05, 0x09,/* Usage Page (Button)   */
-   0x19, 0x01,/* Usage Minimum (0x01)  */
-   0x29, 0x0C,/* Usage Maximum (0x0C)  */
-   0x81, 0x02,/* Input (Variable)  */
-   0x06, 0x00, 0xFF,  /* Usage Page (Vendor Defined)   */
-   0x75, 0x01,/* Report Size (1)   */
-   0x95, 0x08,/* Report Count (8)  */
-   0x25, 0x01,/* Logical Maximum (1)   */
-   0x45, 0x01,/* Physical Maximum (1)  */
-   0x09, 0x01,/* Usage (0x01)  */
-   0x81, 0x02,/* Input (Variable)  */
-   0xC0,  /*   End Collection  */
-   0xA1, 0x02,/*   Collection (Logical)*/
-   0x75, 0x08,/* Report Size (8)   */
-   0x95, 0x07,/* Report Count (7)  */
-   0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
-   0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
-   0x09, 0x02,/* Usage (0x02)  */
-   0x91, 0x02,/* Output (Variable) */
-   0xC0,  /*   End Collection  */
-   0xC0   /* End Collection*/
-};
-
 static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 *rdesc,
unsigned int *rsize)
 {
@@ -296,12 +244,6 @@ static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 
*rdesc,
*rsize = sizeof(pid0011_rdesc_fixed);
}
break;
-   case 0x0006:
-   if (*rsize == sizeof(pid0006_rdesc_fixed)) {
-   rdesc = pid0006_rdesc_fixed;
-   *rsiz

Re: [PATCH 2/2] hid: input: add HID_QUIRK_REUSE_AXES and fix dragonrise

2016-09-27 Thread Ioan-Adrian Ratiu
On Tue, 27 Sep 2016, Vladislav Naumov  wrote:
> Yes, I still have one of those!
> 0079:0011 DragonRise Inc. Gamepad
> Left shift buttons are broken now, but axis and main buttons are still 
> working.
> Axis is handled properly with 3.16.0-4-686-pae #1 SMP Debian
> 3.16.7-ckt25-2 (2016-04-08) i686 GNU/Linux from debian/stable.
> I can test what you want.

Can you please wait a little until I post v2 later today and test v2
directly? Because the change in it's current form has no effect on
0079:0011 (the current quirk is enabled only for 0006).

When I add the input mapping in the hid-dr driver then it will affect
both 0006 and 0011 so that's the patch really worth testing.

Thanks a lot for taking time to test this,
Ionel

> Should I apply the patch from forwarded message to upstream kernel, or
> I can just pull it from some host with everything applied?
>
> On Mon, Sep 26, 2016 at 4:53 PM, Nikolai Kondrashov  wrote:
>> Hi Benjamin,
>>
>> On 09/26/2016 12:29 PM, Benjamin Tissoires wrote:
>>>
>>> Ideally, we need to have Dragon Rise 0x0011 tested too. Nick, would you
>>> mind checking it if you still have this particular device?
>>
>>
>> I never had it, but perhaps Vladislav still has some.
>>
>> Vladislav, would you be able to test a change to the kernel module for your
>> Dragonrise gamepads?
>>
>> Please see below for context.
>>
>> Thank you.
>>
>> Nick
>>
>> On 09/26/2016 12:29 PM, Benjamin Tissoires wrote:
>>>
>>> On Sep 26 2016 or thereabouts, Ioan-Adrian Ratiu wrote:
>>>>
>>>> Commit 79346d620e9d ("HID: input: force generic axis to be mapped to
>>>> their
>>>> user space axis") made mapping generic axes to their userspace
>>>> equivalents
>>>> mandatory and some lower end gamepads which were depending on the
>>>> previous
>>>> behaviour suffered severe regressions because they were reusing axes and
>>>> expecting hid-input to multiplex their map to the respective userspace
>>>> axis
>>>> by always searching for and using the next available axis.
>>>
>>>
>>> Yes, I apologies for the breakage and the regression, though I must say
>>> that for now, only one hardware maker and one device (or range of devices
>>> from the look of it) has needed to be quirked.
>>>
>>>>
>>>> Now the result is that different device axes appear on a single axis in
>>>> userspace, which is clearly a regression in the hid-input driver because
>>>> it
>>>> needs to continue to handle this hardware as expected before the forcing
>>>> to provide the same interface to userspace.
>>>>
>>>> Since these lower-end gamepads like 0079:0006 are definitely the
>>>> exception,
>>>> create a quirk to fix them.
>>>
>>>
>>> Given that we only have this particular vendor that is an issue, I'd
>>> rather see the fix in hid-dr.c. The reason being that you actually don't
>>> need to have a global quirk and this simplifies the path in hid-input.
>>> Plus for users, they can just upgrade hid-dr without having to recompile
>>> their kernel when hid-core is not compiled as a module.
>>>
>>> The cleanest solution that wouldn't require any quirk in hid-core is to
>>> simply add an .input_mapping() callback in hid-dr.c.
>>>
>>> The code of the callback could be something like (untested):
>>>
>>> static int dr_input_mapping(struct hid_device *hdev, struct hid_input *hi,
>>> struct hid_field *field, struct hid_usage *usage,
>>> unsigned long **bit, int *max)
>>> {
>>> switch (usage->hid) {
>>> /*
>>>  * revert the old hid-input behavior where axes
>>>  * can be randomly assigned when the hid usage is
>>>  * reused.
>>>  */
>>> case HID_GD_X: case HID_GD_Y: case HID_GD_Z:
>>> case HID_GD_RX: case HID_GD_RY: case HID_GD_RZ:
>>> if (field->flags & HID_MAIN_ITEM_RELATIVE)
>>> map_rel(usage->hid & 0xf);
>>> else
>>> map_abs(usage->hid & 0xf);
>>> return 1;
>>> }
>>>
>>> return 0;
>>> }
>>>
>>> Hopefully, something like this should revert the old behavior for all
>>> hid-dr touchpads.
>>&

Re: [PATCH 1/2] Revert "HID: dragonrise: fix HID Descriptor for 0x0006 PID"

2016-09-26 Thread Ioan-Adrian Ratiu
Hi

On Mon, 26 Sep 2016, Benjamin Tissoires  wrote:
> Thanks for the patch series. I am not against it, but I'd rather see the
> commit message of this one amended, and the second patch changed.

Sorry if I came out too aggresive (I'll amend), I'm just annoyed that I
had to spend a weekend night digging through this crap because my fiancee
came crying to me that her gamepad stopped working after a kernel update.

>
> On Sep 26 2016 or thereabouts, Ioan-Adrian Ratiu wrote:
>> This reverts commit 18339f59c3a6 ("HID: dragonrise: fix HID...")
>> because the "fix" is bogus. That report descriptor is different in
>
> I am pretty sure this "fix" works for many. You seem to have a different
> hardware (generation probably) that makes the "fix" to fail for you.
> The issue is more that the manufacturer doesn't bother to reallocate a
> new PID for the new device when they change something in it, so please
> don't blame the author of the fix (which I am not).

We don't actually know that it's a new generation of hardware without a
new PID and I really doubt manufacturers are *that* stupid, they should
at least have increased the device revision number, but... China.

>
>> hardware (see below) and it's the way the hardware works, it can't be
>
> Well, it's kind of hard to compare the lsusb output to the fixup in the
> kernel. I'd like to know what changed, but I can't...
>
>> fixed at this level because it reuses axes by design.
>
> It can be fixed in hid-dr. See my comments on the next patch.

Thanks, I'll try to add an input mapping in the driver. However if
indeed there are different hardware with the same PID and this "fix" is
for another issue than the one I'm having then I'd really rather not
revert this if possible to not break other people's hardware. But I also
can't keep it because it breaks my hardware.

Does anyone have any suggestions what to do in this case?

>
>> 
>> What this change tried to fix is a regression caused by commit 20aef664f139
>
> As mentioned in your next patch in the series, the correct commit id is
> 79346d620e9de87912de73337f6df8b7f9a46888

Thanks, this was a slip-up on my part.

>
>> ("HID: input: force generic axis to be mapped to their user space axis") by
>> working around the problem and trying to change the report descriptor in
>> hid-dr, which obviously can't work and introduces more breakage because it
>> adds another unnecessary layer of multiplexing/indirection, making the
>> dragonrise gamepad practically unusable in userspace.
>
> Not sure who you are blaming here. Is it me (the author of 79346d620e or
> the author of the fix in hid-dr)? If it's me, I agree, the patch was a
> little too aggressive, though I must say only one hardware maker has
> such a crappy device that we need to care of. So this is why I just let
> the patch in place without trying to have a better solution.
>
> If the blame is on the author of the hid-dr, I must say that I find the
> tone of this paragraph quite aggressive for nothing. Your device is
> different than the one that was used for the original fix, so it breaks.
> But I can guarantee you that the fix works for the intended device (I
> happen to have one I tested recently). So please blame the hardware
> maker, not the people involved in the community who are doing their
> best.

I'm not assigning blame because it's counterproductive. Of course, I
agree if we are to find a scapegoat it's the crappy manufacturers
because they make the crappy hardware with crappy or no Linux support
(btw I am in no way affiliated with dragonrise or any other vendor).

The only thing that annoys me is that this known hid-input kernel
regression has been ignored for all this time, leaving users like me
dead in the watter with a broken driver upon kernel update.

>
>> 
>> This needs to be fixed where the regression was initially introduced in
>> hid-input (the next patch does this).
>
> Actually, again, I tend to disagree :)
> I'll go more in details in the next patch.
>
> Cheers,
> Benjamin
>
>> 
>> Here's the descriptor taken directly from the device via lsusb:
>> HID Device Descriptor:
>>   bLength 9
>>   bDescriptorType33
>>   bcdHID   1.10
>>   bCountryCode   33 US
>>   bNumDescriptors 1
>>   bDescriptorType34 Report
>>   wDescriptorLength 101
>>   Report Descriptor: (length is 101)
>> Item(Global): Usage Page, data= [ 0x01 ] 1
>>

[PATCH 1/2] Revert "HID: dragonrise: fix HID Descriptor for 0x0006 PID"

2016-09-25 Thread Ioan-Adrian Ratiu
7
Item(Global): Physical Maximum, data= [ 0xff 0x00 ] 255
Item(Global): Logical Maximum, data= [ 0xff 0x00 ] 255
Item(Local ): Usage, data= [ 0x02 ] 2
(null)
Item(Main  ): Output, data= [ 0x02 ] 2
Data Variable Absolute No_Wrap Linear
Preferred_State No_Null_Position Non_Volatile 
Bitfield
Item(Main  ): End Collection, data=none
Item(Main  ): End Collection, data=none

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/hid-dr.c | 58 
 1 file changed, 58 deletions(-)

diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
index 8fd4bf7..2523f8a 100644
--- a/drivers/hid/hid-dr.c
+++ b/drivers/hid/hid-dr.c
@@ -234,58 +234,6 @@ static __u8 pid0011_rdesc_fixed[] = {
0xC0/*  End Collection  */
 };
 
-static __u8 pid0006_rdesc_fixed[] = {
-   0x05, 0x01,/* Usage Page (Generic Desktop)  */
-   0x09, 0x04,/* Usage (Joystick)  */
-   0xA1, 0x01,/* Collection (Application)  */
-   0xA1, 0x02,/*   Collection (Logical)*/
-   0x75, 0x08,/* Report Size (8)   */
-   0x95, 0x05,/* Report Count (5)  */
-   0x15, 0x00,/* Logical Minimum (0)   */
-   0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
-   0x35, 0x00,/* Physical Minimum (0)  */
-   0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
-   0x09, 0x30,/* Usage (X) */
-   0x09, 0x33,/* Usage (Ry)*/
-   0x09, 0x32,/* Usage (Z) */
-   0x09, 0x31,/* Usage (Y) */
-   0x09, 0x34,/* Usage (Ry)*/
-   0x81, 0x02,/* Input (Variable)  */
-   0x75, 0x04,/* Report Size (4)   */
-   0x95, 0x01,/* Report Count (1)  */
-   0x25, 0x07,/* Logical Maximum (7)   */
-   0x46, 0x3B, 0x01,  /* Physical Maximum (315)*/
-   0x65, 0x14,/* Unit (Centimeter) */
-   0x09, 0x39,/* Usage (Hat switch)*/
-   0x81, 0x42,/* Input (Variable)  */
-   0x65, 0x00,/* Unit (None)   */
-   0x75, 0x01,/* Report Size (1)   */
-   0x95, 0x0C,/* Report Count (12) */
-   0x25, 0x01,/* Logical Maximum (1)   */
-   0x45, 0x01,/* Physical Maximum (1)  */
-   0x05, 0x09,/* Usage Page (Button)   */
-   0x19, 0x01,/* Usage Minimum (0x01)  */
-   0x29, 0x0C,/* Usage Maximum (0x0C)  */
-   0x81, 0x02,/* Input (Variable)  */
-   0x06, 0x00, 0xFF,  /* Usage Page (Vendor Defined)   */
-   0x75, 0x01,/* Report Size (1)   */
-   0x95, 0x08,/* Report Count (8)  */
-   0x25, 0x01,/* Logical Maximum (1)   */
-   0x45, 0x01,/* Physical Maximum (1)  */
-   0x09, 0x01,/* Usage (0x01)  */
-   0x81, 0x02,/* Input (Variable)  */
-   0xC0,  /*   End Collection  */
-   0xA1, 0x02,/*   Collection (Logical)*/
-   0x75, 0x08,/* Report Size (8)   */
-   0x95, 0x07,/* Report Count (7)  */
-   0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
-   0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
-   0x09, 0x02,/* Usage (0x02)  */
-   0x91, 0x02,/* Output (Variable) */
-   0xC0,  /*   End Collection  */
-   0xC0   /* End Collection*/
-};
-
 static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 *rdesc,
unsigned int *rsize)
 {
@@ -296,12 +244,6 @@ static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 
*rdesc,
*rsize = sizeof(pid0011_rdesc_fixed);
}
break;
-   case 0x0006:
-   if (*rsize == sizeof(pid0006_rdesc_fixed)) {
-   rdesc = pid0006_rdesc_fixed;
-   *rsize = sizeof(pid0006_rdesc_fixed);
-   }
-   break;
}
return rdesc;
 }
-- 
2.10.0



[PATCH 2/2] hid: input: add HID_QUIRK_REUSE_AXES and fix dragonrise

2016-09-25 Thread Ioan-Adrian Ratiu
Commit 79346d620e9d ("HID: input: force generic axis to be mapped to their
user space axis") made mapping generic axes to their userspace equivalents
mandatory and some lower end gamepads which were depending on the previous
behaviour suffered severe regressions because they were reusing axes and
expecting hid-input to multiplex their map to the respective userspace axis
by always searching for and using the next available axis.

Now the result is that different device axes appear on a single axis in
userspace, which is clearly a regression in the hid-input driver because it
needs to continue to handle this hardware as expected before the forcing
to provide the same interface to userspace.

Since these lower-end gamepads like 0079:0006 are definitely the exception,
create a quirk to fix them.

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/hid-dr.c|  2 ++
 drivers/hid/hid-input.c | 16 +++-
 include/linux/hid.h |  1 +
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
index 2523f8a..27fc826 100644
--- a/drivers/hid/hid-dr.c
+++ b/drivers/hid/hid-dr.c
@@ -274,6 +274,8 @@ static int dr_probe(struct hid_device *hdev, const struct 
hid_device_id *id)
hid_hw_stop(hdev);
goto err;
}
+   /* has only 5 axes and reuses X, Y */
+   hdev->quirks |= HID_QUIRK_REUSE_AXES;
break;
}
 
diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
index fb9ace1..1cc6fe4 100644
--- a/drivers/hid/hid-input.c
+++ b/drivers/hid/hid-input.c
@@ -633,11 +633,17 @@ static void hidinput_configure_usage(struct hid_input 
*hidinput, struct hid_fiel
/* These usage IDs map directly to the usage codes. */
case HID_GD_X: case HID_GD_Y: case HID_GD_Z:
case HID_GD_RX: case HID_GD_RY: case HID_GD_RZ:
-   if (field->flags & HID_MAIN_ITEM_RELATIVE)
-   map_rel(usage->hid & 0xf);
-   else
-   map_abs_clear(usage->hid & 0xf);
-   break;
+
+   /* if quirk is active don't force the userspace mapping,
+* instead search and use the next available axis.
+*/
+   if (!(device->quirks & HID_QUIRK_REUSE_AXES)) {
+   if (field->flags & HID_MAIN_ITEM_RELATIVE)
+   map_rel(usage->hid & 0xf);
+   else
+   map_abs_clear(usage->hid & 0xf);
+   break;
+   }
 
case HID_GD_SLIDER: case HID_GD_DIAL: case HID_GD_WHEEL:
if (field->flags & HID_MAIN_ITEM_RELATIVE)
diff --git a/include/linux/hid.h b/include/linux/hid.h
index 75b66ec..0979920 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -320,6 +320,7 @@ struct hid_item {
 #define HID_QUIRK_NO_EMPTY_INPUT   0x0100
 #define HID_QUIRK_NO_INIT_INPUT_REPORTS0x0200
 #define HID_QUIRK_ALWAYS_POLL  0x0400
+#define HID_QUIRK_REUSE_AXES   0x0800
 #define HID_QUIRK_SKIP_OUTPUT_REPORTS  0x0001
 #define HID_QUIRK_SKIP_OUTPUT_REPORT_ID0x0002
 #define HID_QUIRK_NO_OUTPUT_REPORTS_ON_INTR_EP 0x0004
-- 
2.10.0



[PATCH] merge_config.sh: fail loudly if make also fails

2016-08-30 Thread Ioan-Adrian Ratiu
There are cases in which the make call to fill missing symbols can fail.
I encountered one such case while building an Yocto/OpenEmbedded based
custom Linux kernel because by default after late 2014 OE poisons the
default compiler sysroot path [1], making it point to a bogus location
like /does/not/exist. This means the "--sysroot" compiler arg needs to
be passed everywhere explicitly.

If the sysroot is poisoned and no sane sysroot is passed to the make
call inside merge_config.sh it can fail silently and the continuing
build process using the borked config which does not know can generate a
kernel which will not boot (this is what happens in Yocto/OE). This was
fixed in OE by passing --sysroot (contained in $TOOLCHAIN_OPTIONS) [2].

Nevertheless, even though OE was fixed to pass a correct --sysroot and
avoid the error, this make call shouldn't fail silently and lead to
unbootable kernels. Make it fail loudly so that other build systems like
OE can pick up the error from merge_config.sh and act accordingly.

[1] 
http://git.openembedded.org/openembedded-core/commit/?id=04b725511a505c582a3abdf63d096967f0320779
[2] 
http://git.yoctoproject.org/cgit/cgit.cgi/poky-contrib/commit/?h=zedd/kernel&id=87a9f321035428d9f4f8859ff99bb9acb70bd60c

Signed-off-by: Ioan-Adrian Ratiu 
---
 scripts/kconfig/merge_config.sh | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/scripts/kconfig/merge_config.sh b/scripts/kconfig/merge_config.sh
index 67d1314..5212f37 100755
--- a/scripts/kconfig/merge_config.sh
+++ b/scripts/kconfig/merge_config.sh
@@ -152,7 +152,10 @@ fi
 # alldefconfig: Fills in any missing symbols with Kconfig default
 # allnoconfig: Fills in any missing symbols with # CONFIG_* is not set
 make KCONFIG_ALLCONFIG=$TMP_FILE $OUTPUT_ARG $ALLTARGET
-
+if [ "$?" -ne 0 ]; then
+echo "Make failed to fill missing config symbols. Exit." >&2
+exit 1
+fi
 
 # Check all specified config values took (might have missed-dependency issues)
 for CFG in $(sed -n "$SED_CONFIG_EXP" $TMP_FILE); do
-- 
2.9.3



Re: ath9k: Fix beacon configuration assertion failure

2016-08-22 Thread Ioan-Adrian Ratiu
On Mon, 22 Aug 2016, Kalle Valo  wrote:
> Benjamin Berg  writes:
>
>> On Fr, 2016-08-19 at 13:03 +0300, Kalle Valo wrote:
>>> Actually, I see two patches which might be related but not identical:
>>> 
>>> ath9k: fix client mode beacon configuration
>>> https://patchwork.kernel.org/patch/9247699/
>>> 
>>> ath9k: Fix beacon configuration assertion failure
>>> https://patchwork.kernel.org/patch/9281191/
>>> 
>>> Felix (CCed) & Benjamin: please take a look and advice which one I
>>> should take.
>>
>> Yes, both patches are designed to fix the same issue in my patch.
>>
>> Felix solution looks entirely correct to me, the second solution seems
>> slightly wrong because it prevents the call to ath9k_beacon_config from
>> happening instead of ensuring the correct parameter value.
>> ath9k_beacon_config needs to be called even if iter_data.beaconsĀ is
>> false as it disables the interrupts.
>
> Ok, I'll the patch from Felix then. Thanks.

Tested Felix patch. Confirm it works great. Thanks!

>
> -- 
> Kalle Valo


Re: ath9k: Fix beacon configuration assertion failure

2016-08-19 Thread Ioan-Adrian Ratiu
Hi

On Fri, 19 Aug 2016, Kalle Valo  wrote:
> Kalle Valo  writes:
>
>> Adi Ratiu  wrote:
>>> commit cfda2d8e2314 ("ath9k: Fix beacon configuration for
>>> addition/removal of interfaces") reworked beacon configs to happen at
>>> IF changes and missed cases when NL80211_IFTYPE_STATION has no beacons
>>> with the corresponding values iter_data.primary_beacon_vif == 0 and
>>> iter_data.nbcnvifs == 0 in ath9k_calculate_summary_state(), thus
>>> calling ath9k_beacon_config() with null and giving the below warning.
>>> 
>>> Fix this by calling beacon config only when a beacon actually exists,
>>> i.e. by checking iter_data.beacons which should be set only inside
>>> ath9k_vif_iter_set_beacon() (the line "iter_data.beacons = true;" in
>>> ath9k_calculate_summary_state() is a bug in above rework commit).
>>> 
>>> [   16.910537] [ cut here ]
>>> [   16.910549] WARNING: CPU: 2 PID: 6 at 
>>> drivers/net/wireless/ath/ath9k/beacon.c:642 ath9k_beacon_config+0x12c/0x130 
>>> [ath9k]
>>> [   16.910551] Modules linked in: intel_rapl x86_pkg_temp_thermal 
>>> intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul 
>>> crc32_pclmul crc32c_intel ghash_clmulni_intel hid_generic aesni_intel 
>>> usbhid hid aes_x86_64 joydev mousedev arc4 lrw ath9k dell_laptop 
>>> ath9k_common ath9k_hw ath mac80211 gf128mul glue_helper ablk_helper 
>>> dell_smbios input_leds cryptd led_class snd_hda_codec_hdmi psmouse cfg80211 
>>> serio_raw atkbd snd_hda_codec_realtek libps2 rfkill r8169 sr_mod 
>>> snd_hda_codec_generic dcdbas cdrom mii snd_hda_intel snd_hda_codec 
>>> snd_hwdep snd_hda_core i8042 snd_pcm snd_timer serio ac xhci_pci xhci_hcd 
>>> battery i2c_i801 tpm_tis pcspkr tpm_tis_core evdev shpchp lpc_ich i2c_smbus 
>>> tpm sch_fq_codel ip_tables x_tables
>>> [   16.910620] CPU: 2 PID: 6 Comm: kworker/u16:0 Not tainted 
>>> 4.8.0-rc1-next-20160815-g118253a #1
>>> [   16.910621] Hardware name: Dell Inc. Inspiron 3521/018DYG, BIOS A14 
>>> 07/31/2015
>>> [   16.910648] Workqueue: phy0 ieee80211_iface_work [mac80211]
>>> [   16.910652]   880159f13630 813140f0 
>>> 
>>> [   16.910657]   880159f13670 8106b22b 
>>> 02820202
>>> [   16.910661]  880156bc1500  880153cc8018 
>>> 880153cc8018
>>> [   16.910666] Call Trace:
>>> [   16.910674]  [] dump_stack+0x63/0x83
>>> [   16.910678]  [] __warn+0xcb/0xf0
>>> [   16.910682]  [] warn_slowpath_null+0x1d/0x20
>>> [   16.910690]  [] ath9k_beacon_config+0x12c/0x130 [ath9k]
>>> [   16.910696]  [] 
>>> ath9k_calculate_summary_state+0xf6/0x350 [ath9k]
>>> [   16.910703]  [] ath9k_bss_info_changed+0x186/0x1a0 
>>> [ath9k]
>>> [   16.910720]  [] 
>>> ieee80211_bss_info_change_notify+0xb1/0x200 [mac80211]
>>> [   16.910737]  [] ieee80211_assoc_success+0x677/0xdeb 
>>> [mac80211]
>>> [   16.910746]  [] ? up+0x32/0x50
>>> [   16.910751]  [] ? wake_up_klogd+0x3b/0x50
>>> [   16.910755]  [] ? console_unlock+0x539/0x5f0
>>> [   16.910760]  [] ? vprintk_emit+0x254/0x490
>>> [   16.910765]  [] ? vprintk_default+0x1f/0x30
>>> [   16.910769]  [] ? printk+0x48/0x50
>>> [   16.910788]  [] 
>>> ieee80211_rx_mgmt_assoc_resp+0x152/0x4c0 [mac80211]
>>> [   16.910807]  [] 
>>> ieee80211_sta_rx_queued_mgmt+0x18f/0x840 [mac80211]
>>> [   16.910813]  [] ? lock_timer_base.isra.2+0x80/0xa0
>>> [   16.910817]  [] ? cpuacct_charge+0x86/0xa0
>>> [   16.910822]  [] ? update_curr+0xb7/0x160
>>> [   16.910827]  [] ? dequeue_entity+0x24c/0xa20
>>> [   16.910831]  [] ? dequeue_task_fair+0x5c3/0x960
>>> [   16.910848]  [] ? ieee80211_iface_work+0xd4/0x410 
>>> [mac80211]
>>> [   16.910865]  [] ieee80211_iface_work+0x295/0x410 
>>> [mac80211]
>>> [   16.910870]  [] ? finish_task_switch+0x77/0x1e0
>>> [   16.910875]  [] process_one_work+0x1e5/0x470
>>> [   16.910880]  [] worker_thread+0x48/0x4e0
>>> [   16.910885]  [] ? process_one_work+0x470/0x470
>>> [   16.910888]  [] kthread+0xc9/0xe0
>>> [   16.910894]  [] ? __switch_to+0x2c3/0x610
>>> [   16.910899]  [] ret_from_fork+0x1f/0x40
>>> [   16.910902]  [] ? kthread_create_on_node+0x40/0x40
>>> [   16.910904] ---[ end trace aa169ad4461f2f18 ]---
>>> 
>>> Signed-off-by: Ioan-Adrian Ratiu 
>>
>> Benjamin, does this look reasonable to you? I'm planning to queue this for 
>> 4.8.
>
> Actually, I see two patches which might be related but not identical:
>
> ath9k: fix client mode beacon configuration
> https://patchwork.kernel.org/patch/9247699/
>
> ath9k: Fix beacon configuration assertion failure
> https://patchwork.kernel.org/patch/9281191/

Definitely we're touching the same logic and I'm leaning in the
direction of dropping my patch and using Felix's.

I can't test the other patch right noum though because I'm on vacantion.
I'll test next week and report back.

>
> Felix (CCed) & Benjamin: please take a look and advice which one I
> should take.
>
> -- 
> Kalle Valo


[PATCH] ath9k: Fix beacon configuration assertion failure

2016-08-15 Thread Ioan-Adrian Ratiu
commit cfda2d8e2314 ("ath9k: Fix beacon configuration for
addition/removal of interfaces") reworked beacon configs to happen at
IF changes and missed cases when NL80211_IFTYPE_STATION has no beacons
with the corresponding values iter_data.primary_beacon_vif == 0 and
iter_data.nbcnvifs == 0 in ath9k_calculate_summary_state(), thus
calling ath9k_beacon_config() with null and giving the below warning.

Fix this by calling beacon config only when a beacon actually exists,
i.e. by checking iter_data.beacons which should be set only inside
ath9k_vif_iter_set_beacon() (the line "iter_data.beacons = true;" in
ath9k_calculate_summary_state() is a bug in above rework commit).

[   16.910537] [ cut here ]
[   16.910549] WARNING: CPU: 2 PID: 6 at 
drivers/net/wireless/ath/ath9k/beacon.c:642 ath9k_beacon_config+0x12c/0x130 
[ath9k]
[   16.910551] Modules linked in: intel_rapl x86_pkg_temp_thermal 
intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul 
crc32c_intel ghash_clmulni_intel hid_generic aesni_intel usbhid hid aes_x86_64 
joydev mousedev arc4 lrw ath9k dell_laptop ath9k_common ath9k_hw ath mac80211 
gf128mul glue_helper ablk_helper dell_smbios input_leds cryptd led_class 
snd_hda_codec_hdmi psmouse cfg80211 serio_raw atkbd snd_hda_codec_realtek 
libps2 rfkill r8169 sr_mod snd_hda_codec_generic dcdbas cdrom mii snd_hda_intel 
snd_hda_codec snd_hwdep snd_hda_core i8042 snd_pcm snd_timer serio ac xhci_pci 
xhci_hcd battery i2c_i801 tpm_tis pcspkr tpm_tis_core evdev shpchp lpc_ich 
i2c_smbus tpm sch_fq_codel ip_tables x_tables
[   16.910620] CPU: 2 PID: 6 Comm: kworker/u16:0 Not tainted 
4.8.0-rc1-next-20160815-g118253a #1
[   16.910621] Hardware name: Dell Inc. Inspiron 3521/018DYG, BIOS A14 
07/31/2015
[   16.910648] Workqueue: phy0 ieee80211_iface_work [mac80211]
[   16.910652]   880159f13630 813140f0 

[   16.910657]   880159f13670 8106b22b 
02820202
[   16.910661]  880156bc1500  880153cc8018 
880153cc8018
[   16.910666] Call Trace:
[   16.910674]  [] dump_stack+0x63/0x83
[   16.910678]  [] __warn+0xcb/0xf0
[   16.910682]  [] warn_slowpath_null+0x1d/0x20
[   16.910690]  [] ath9k_beacon_config+0x12c/0x130 [ath9k]
[   16.910696]  [] ath9k_calculate_summary_state+0xf6/0x350 
[ath9k]
[   16.910703]  [] ath9k_bss_info_changed+0x186/0x1a0 [ath9k]
[   16.910720]  [] 
ieee80211_bss_info_change_notify+0xb1/0x200 [mac80211]
[   16.910737]  [] ieee80211_assoc_success+0x677/0xdeb 
[mac80211]
[   16.910746]  [] ? up+0x32/0x50
[   16.910751]  [] ? wake_up_klogd+0x3b/0x50
[   16.910755]  [] ? console_unlock+0x539/0x5f0
[   16.910760]  [] ? vprintk_emit+0x254/0x490
[   16.910765]  [] ? vprintk_default+0x1f/0x30
[   16.910769]  [] ? printk+0x48/0x50
[   16.910788]  [] ieee80211_rx_mgmt_assoc_resp+0x152/0x4c0 
[mac80211]
[   16.910807]  [] ieee80211_sta_rx_queued_mgmt+0x18f/0x840 
[mac80211]
[   16.910813]  [] ? lock_timer_base.isra.2+0x80/0xa0
[   16.910817]  [] ? cpuacct_charge+0x86/0xa0
[   16.910822]  [] ? update_curr+0xb7/0x160
[   16.910827]  [] ? dequeue_entity+0x24c/0xa20
[   16.910831]  [] ? dequeue_task_fair+0x5c3/0x960
[   16.910848]  [] ? ieee80211_iface_work+0xd4/0x410 
[mac80211]
[   16.910865]  [] ieee80211_iface_work+0x295/0x410 [mac80211]
[   16.910870]  [] ? finish_task_switch+0x77/0x1e0
[   16.910875]  [] process_one_work+0x1e5/0x470
[   16.910880]  [] worker_thread+0x48/0x4e0
[   16.910885]  [] ? process_one_work+0x470/0x470
[   16.910888]  [] kthread+0xc9/0xe0
[   16.910894]  [] ? __switch_to+0x2c3/0x610
[   16.910899]  [] ret_from_fork+0x1f/0x40
[   16.910902]  [] ? kthread_create_on_node+0x40/0x40
[   16.910904] ---[ end trace aa169ad4461f2f18 ]---

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/net/wireless/ath/ath9k/main.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/main.c 
b/drivers/net/wireless/ath/ath9k/main.c
index a394622..3c6f413 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -1154,7 +1154,6 @@ void ath9k_calculate_summary_state(struct ath_softc *sc,
bool changed = (iter_data.primary_sta != ctx->primary_sta);
 
if (iter_data.primary_sta) {
-   iter_data.beacons = true;
ath9k_set_assoc_state(sc, iter_data.primary_sta,
  changed);
ctx->primary_sta = iter_data.primary_sta;
@@ -1166,10 +1165,12 @@ void ath9k_calculate_summary_state(struct ath_softc *sc,
if (ath9k_hw_mci_is_enabled(sc->sc_ah))
ath9k_mci_update_wlan_channels(sc, true);
}
+   } else if (iter_data.beacons) {
+   sc->nbcnvifs = iter_data.nbcnvifs;
+   ath9k_beacon_config(

Re: [PATCH v2] hid: usbhid: hid-core: fix recursive deadlock

2015-11-29 Thread Ioan-Adrian Ratiu
On Fri, 20 Nov 2015 22:19:02 +0200
Ioan-Adrian Ratiu  wrote:

> The critical section protected by usbhid->lock in hid_ctrl() is too
> big and because of this it causes a recursive deadlock. "Too big" means
> the case statement and the call to hid_input_report() do not need to be
> protected by the spinlock (no URB operations are done inside them).
> 
> The deadlock happens because in certain rare cases drivers try to grab
> the lock while handling the ctrl irq which grabs the lock before them
> as described above. For example newer wacom tablets like 056a:033c try
> to reschedule proximity reads from wacom_intuos_schedule_prox_event()
> calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
> which tries to grab the usbhid lock already held by hid_ctrl().
> 
> There are two ways to get out of this deadlock:
> 1. Make the drivers work "around" the ctrl critical region, in the
> wacom case for ex. by delaying the scheduling of the proximity read
> request itself to a workqueue.
> 2. Shrink the critical region so the usbhid lock protects only the
> instructions which modify usbhid state, calling hid_input_report()
> with the spinlock unlocked, allowing the device driver to grab the
> lock first, finish and then grab the lock afterwards in hid_ctrl().
> 
> This patch implements the 2nd solution.
> 
> Signed-off-by: Ioan-Adrian Ratiu 
> ---
>  drivers/hid/usbhid/hid-core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
> index 36712e9..5dd426f 100644
> --- a/drivers/hid/usbhid/hid-core.c
> +++ b/drivers/hid/usbhid/hid-core.c
> @@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb)
>   struct usbhid_device *usbhid = hid->driver_data;
>   int unplug = 0, status = urb->status;
>  
> - spin_lock(&usbhid->lock);
> -
>   switch (status) {
>   case 0: /* success */
>   if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN)
> @@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb)
>   hid_warn(urb->dev, "ctrl urb status %d received\n", status);
>   }
>  
> + spin_lock(&usbhid->lock);
> +
>   if (unplug) {
>   usbhid->ctrltail = usbhid->ctrlhead;
>   } else {

Hello again

Can this please be merged in the 4.4? There are quite a few people who have
their tablets deadlock and don't know how to patch their kernels so are stuck
waiting for a new release.

The severity of this issue is much bigger than I initially thought. Since I've
posted on the wacom mailing list that I have a fix for this deadlock I've
recived lots of email from people complaining of the same problem on a wide
range of tablets.

Some of those people know how to patch a kernel, some found this patch on the
mailing list and tested it and confirmed that it works on the wacom mailing
list (you can verify the deadlock + fix on that mailing list).

Thank you,
Adrian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] tracing: add tracing C helper functions

2015-11-27 Thread Ioan-Adrian Ratiu
Tracing/debugging programatically from kernel C code is quite limited
compared to its tracefs alternative mainly due to lack of interfaces.
linux/kernel.h provides traceing_on/off() and *_snapshot(), but these
only allow access to a limited subset of the tracing functionality.

This patch adds 5 new functions to make tracing from C easier:
tracing_change_tracer - change the live tracer
tracing_clear_all - clear all cpu ringbuffers
tracing_clear_cpu - clear a specific cpu's buffer
tracing_resize_all - set ringbuffer size for all cpus
tracing_resize_cpu - set ringbuffer size for specific cpu

These functions are very helpful when combined with existing ones, for
example to debug the call serial8250_pnp_init() in serial8250_init():

tracing_resize_all(1 << 23);
tracing_change_tracer("function_graph");
tracing_on();

trace_printk("starting pnp init\n");
ret = serial8250_pnp_init();
trace_printk("done pnp init\n");

tracing_off();
tracing_clear_cpu((1 - smp_processor_id());

or conditionally trace driver_register() only for the serial driver:

if (!strncmp(drv->name, "serial", 6)) {
tracing_change_tracer("function_graph");
        tracing_on();
}

Signed-off-by: Ioan-Adrian Ratiu 
---
 include/linux/kernel.h |  5 +
 kernel/trace/trace.c   | 56 ++
 2 files changed, 61 insertions(+)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index ae6adf8..7cd24dd 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -552,6 +552,11 @@ void tracing_off(void);
 int tracing_is_on(void);
 void tracing_snapshot(void);
 void tracing_snapshot_alloc(void);
+void tracing_clear_all(void);
+void tracing_clear_cpu(int cpu);
+void tracing_resize_all(unsigned long size);
+void tracing_resize_cpu(unsigned long size, int cpu);
+void tracing_change_tracer(const char *tracer);
 
 extern void tracing_start(void);
 extern void tracing_stop(void);
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 87fb980..a8ed37e 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -533,6 +533,62 @@ void tracing_on(void)
 EXPORT_SYMBOL_GPL(tracing_on);
 
 /**
+ * tracing_change_tracer - switch live tracing to specified tracer
+ * @tracer: Name of the tracer to change to
+ */
+void tracing_change_tracer(const char *tracer)
+{
+   tracing_set_tracer(&global_trace, tracer);
+}
+EXPORT_SYMBOL_GPL(tracing_change_tracer);
+
+/**
+ * tracing_clear_cpu - reset a cpu live tracing buffer
+ * @cpu: cpu to clear
+ */
+void tracing_clear_cpu(int cpu)
+{
+   mutex_lock(&trace_types_lock);
+   tracing_reset(&global_trace.trace_buffer, cpu);
+   mutex_unlock(&trace_types_lock);
+}
+EXPORT_SYMBOL_GPL(tracing_clear_cpu);
+
+/**
+ * tracing_clear_all - reset all cpu tracing buffers
+ */
+void tracing_clear_all(void)
+{
+   mutex_lock(&trace_types_lock);
+   tracing_reset_online_cpus(&global_trace.trace_buffer);
+   mutex_unlock(&trace_types_lock);
+}
+EXPORT_SYMBOL_GPL(tracing_clear_all);
+
+static ssize_t tracing_resize_ring_buffer(struct trace_array *, unsigned long, 
int);
+
+/**
+ * tracing_resize_cpu - resize a cpu's ring buffer
+ * @size - new size in KB
+ * @cpu - target cpu for buf resize
+ */
+void tracing_resize_cpu(unsigned long size, int cpu)
+{
+   tracing_resize_ring_buffer(&global_trace, size, cpu);
+}
+EXPORT_SYMBOL_GPL(tracing_resize_cpu);
+
+/**
+ * tracing_resize_all - resize all per-cpu ring buffers
+ * @size - new size in KB
+ */
+void tracing_resize_all(unsigned long size)
+{
+   tracing_resize_ring_buffer(&global_trace, size, RING_BUFFER_ALL_CPUS);
+}
+EXPORT_SYMBOL_GPL(tracing_resize_all);
+
+/**
  * __trace_puts - write a constant string into the trace buffer.
  * @ip:   The address of the caller
  * @str:   The constant string to write
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] hid: usbhid: hid-core: fix recursive deadlock

2015-11-20 Thread Ioan-Adrian Ratiu
The critical section protected by usbhid->lock in hid_ctrl() is too
big and because of this it causes a recursive deadlock. "Too big" means
the case statement and the call to hid_input_report() do not need to be
protected by the spinlock (no URB operations are done inside them).

The deadlock happens because in certain rare cases drivers try to grab
the lock while handling the ctrl irq which grabs the lock before them
as described above. For example newer wacom tablets like 056a:033c try
to reschedule proximity reads from wacom_intuos_schedule_prox_event()
calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
which tries to grab the usbhid lock already held by hid_ctrl().

There are two ways to get out of this deadlock:
1. Make the drivers work "around" the ctrl critical region, in the
wacom case for ex. by delaying the scheduling of the proximity read
request itself to a workqueue.
2. Shrink the critical region so the usbhid lock protects only the
instructions which modify usbhid state, calling hid_input_report()
with the spinlock unlocked, allowing the device driver to grab the
lock first, finish and then grab the lock afterwards in hid_ctrl().

This patch implements the 2nd solution.

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/usbhid/hid-core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
index 36712e9..5dd426f 100644
--- a/drivers/hid/usbhid/hid-core.c
+++ b/drivers/hid/usbhid/hid-core.c
@@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb)
struct usbhid_device *usbhid = hid->driver_data;
int unplug = 0, status = urb->status;
 
-   spin_lock(&usbhid->lock);
-
switch (status) {
case 0: /* success */
if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN)
@@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb)
hid_warn(urb->dev, "ctrl urb status %d received\n", status);
}
 
+   spin_lock(&usbhid->lock);
+
if (unplug) {
usbhid->ctrltail = usbhid->ctrlhead;
} else {
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock

2015-11-20 Thread Ioan-Adrian Ratiu
On Thu, 19 Nov 2015 22:34:18 +0100 (CET)
Jiri Kosina  wrote:
> Could you please reformulate the changelog in this respect and resubmit?

Yes, of course, I tried to reformulate the problem and solution as clear and
succint as I could in v2, which I'll send shortly.

Thank you very much for your patience and feedback.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock

2015-11-19 Thread Ioan-Adrian Ratiu
On Thu, 19 Nov 2015 10:10:19 +0100 (CET)
Jiri Kosina  wrote:

> On Thu, 19 Nov 2015, Ioan-Adrian Ratiu wrote:
> 
> > First part of lockdep report:
> > http://imgur.com/clLsCWe
> > 
> > Second part:
> > http://imgur.com/Wa2PzRl
> > 
> > Here are some printk's of mine while reproducing + debugging the issue:
> > http://imgur.com/SETOHT7  
> 
> So the real problem is that Intuos driver is calling hid_hw_request() 
> (which tries to grab the lock in usbhid_submit_report()) while handling 
> the CTRL IRQ (lock gets acquired there).
> 
> So the proper way to fix seems to be delaying the scheduling of the 
> proximity read event in wacom_intuos_inout() to workqueue.
> 
> > I'll continue to research this more in depth, but progress is slow 
> > because I don't have much time, I'm doing this in my spare time because 
> > it's my girlfriend's tablet.  
> 
> Oh, now I understand the level of severity of this bug! :-)
> 
> Thanks,
> 

Yes, exactly, you are beginning to understand! :)  When I've put my 2 variants
above to solve this deadlock, by "removing the call from wacom" at 1) I was
trying to say exactly this, removing it from the irq to a workqueue.

But please understand further my reasoning for submitting this patch. Consider
if this is a bug in the wacom driver or in the usbhid core? IMO
this is a usbhid bug: the critical region in hid_ctrl() is too big, there
is no reason for the call to hid_input_report() to be protected by
usbhid->lock.

The correct way to fix this deadlock is to fix the critical section in
usbhid, not remove the call from the wacom irq. If wacom wants to
reschedule in the irq, it should not deadlock on usbhid. "Fixing" the wacom call
would just work around the critical region bug inside usbhid.

I hope I've made myself clear this time; I really needed to explain this
patch better :( sorry.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock

2015-11-18 Thread Ioan-Adrian Ratiu
On Wed, 18 Nov 2015 17:58:56 -0600
Josh Cartwright  wrote:

> On Wed, Nov 18, 2015 at 11:05:44PM +0200, Ioan-Adrian Ratiu wrote:
> > On Wed, 18 Nov 2015 21:37:42 +0100 (CET)
> > Jiri Kosina  wrote:
> >   
> > > On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote:
> > >   
> > > > The critical section protected by usbhid->lock in hid_ctrl() is too
> > > > big and in rare cases causes a recursive deadlock because of its call
> > > > to hid_input_report().
> > > > 
> > > > This deadlock reproduces on newer wacom tablets like 056a:033c because
> > > > the wacom driver in its irq handler ends up calling hid_hw_request()
> > > > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means
> > > > is that it submits a report to reschedule a proximity read through a
> > > > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb)
> > > > before calling hid_input_report(). When the irq kicks in on the same
> > > > cpu, it also tries to grab the lock resulting in a recursive deadlock.
> > > > 
> > > > The proper fix is to shrink the critical section in hid_ctrl() to
> > > > protect only the instructions which modify usbhid, thus move the lock
> > > > after the hid_input_report() call and the deadlock dissapears.
> > > 
> > > I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), 
> > > isn't it?
> > >   
> > 
> > That was my first attempt, yes, but the deadlock still happens with
> > interrupts disabled. It is very weird, I know.  
> 
> I think your best course of action is to figure out why this is the
> case, instead of continuing with trying to solve the symptoms.  Do you
> have actual callstacks showing the cases where you hit?  That might be
> useful to share (your lockdep picture cuts out the callstacks).
> 
> Also, have you tried without the PREEMPT_RT patch in the picture at all?
> 
>   Josh

Yes, of course I tried it without PREEMPT_RT_FULL :) This happens on vanilla
mainline kernels (only after 4.4-rc1 which introduced support for this kind of
tablets).

I also backported all the wacom patches to 4.1 non-RT and the same deadlock
happens.

I've sent another email with some lockdep traces and printk's on a running
vanilla linux-next, maybe it didn't get through, here are the links again:

First part of lockdep report:
http://imgur.com/clLsCWe

Second part:
http://imgur.com/Wa2PzRl

Here are some printk's of mine while reproducing + debugging the issue:
http://imgur.com/SETOHT7

I'll continue to research this more in depth, but progress is slow because I
don't have much time, I'm doing this in my spare time because it's my
girlfriend's tablet.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock

2015-11-18 Thread Ioan-Adrian Ratiu
On Wed, 18 Nov 2015 21:37:42 +0100 (CET)
Jiri Kosina  wrote:

> On Wed, 18 Nov 2015, Ioan-Adrian Ratiu wrote:
> 
> > The critical section protected by usbhid->lock in hid_ctrl() is too
> > big and in rare cases causes a recursive deadlock because of its call
> > to hid_input_report().
> > 
> > This deadlock reproduces on newer wacom tablets like 056a:033c because
> > the wacom driver in its irq handler ends up calling hid_hw_request()
> > from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means
> > is that it submits a report to reschedule a proximity read through a
> > sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb)
> > before calling hid_input_report(). When the irq kicks in on the same
> > cpu, it also tries to grab the lock resulting in a recursive deadlock.
> > 
> > The proper fix is to shrink the critical section in hid_ctrl() to
> > protect only the instructions which modify usbhid, thus move the lock
> > after the hid_input_report() call and the deadlock dissapears.  
> 
> I think the proper fix actually is to spin_lock_irqsave() in hid_ctrl(), 
> isn't it?
> 

That was my first attempt, yes, but the deadlock still happens with interrupts
disabled. It is very weird, I know. I tried many configurations, like disabling
PREEMPT_RT and other stuff which might affect the call stack in this case, but
the only two methods which actually avoid the deadlock are:

1. don't call wacom_intuos_schedule_prox_event() / hid_hw_request() from the
wacom driver

2. shrink the critical region to not cover hid_input_report() inside hid_ctrl()

I am very open to any ideas on how to better fix this, just to be able to use a
mainline kernel with my device without out of tree patching :)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] hid: usbhid: hid-core: fix recursive deadlock

2015-11-18 Thread Ioan-Adrian Ratiu
Here are some images with more information on this deadlock, might be helpful.

First part of lockdep report:
http://imgur.com/clLsCWe

Second part:
http://imgur.com/Wa2PzRl

Here are some printk's of mine while reproducing + debugging the issue:
http://imgur.com/SETOHT7
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] hid: usbhid: hid-core: fix recursive deadlock

2015-11-18 Thread Ioan-Adrian Ratiu
The critical section protected by usbhid->lock in hid_ctrl() is too
big and in rare cases causes a recursive deadlock because of its call
to hid_input_report().

This deadlock reproduces on newer wacom tablets like 056a:033c because
the wacom driver in its irq handler ends up calling hid_hw_request()
from wacom_intuos_schedule_prox_event() in wacom_wac.c. What this means
is that it submits a report to reschedule a proximity read through a
sync ctrl call which grabs the lock in hid_ctrl(struct urb *urb)
before calling hid_input_report(). When the irq kicks in on the same
cpu, it also tries to grab the lock resulting in a recursive deadlock.

The proper fix is to shrink the critical section in hid_ctrl() to
protect only the instructions which modify usbhid, thus move the lock
after the hid_input_report() call and the deadlock dissapears.

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/usbhid/hid-core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
index 36712e9..5dd426f 100644
--- a/drivers/hid/usbhid/hid-core.c
+++ b/drivers/hid/usbhid/hid-core.c
@@ -477,8 +477,6 @@ static void hid_ctrl(struct urb *urb)
struct usbhid_device *usbhid = hid->driver_data;
int unplug = 0, status = urb->status;
 
-   spin_lock(&usbhid->lock);
-
switch (status) {
case 0: /* success */
if (usbhid->ctrl[usbhid->ctrltail].dir == USB_DIR_IN)
@@ -498,6 +496,8 @@ static void hid_ctrl(struct urb *urb)
hid_warn(urb->dev, "ctrl urb status %d received\n", status);
}
 
+   spin_lock(&usbhid->lock);
+
if (unplug) {
usbhid->ctrltail = usbhid->ctrlhead;
} else {
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/1] drm/i915/dma: enforce pr_ consistency

2015-10-30 Thread Ioan-Adrian Ratiu
One branch of the if clause uses pr_info, the other pr_err; change
the 'false' branch to also use pr_info. This minor oversight has gone
unfixed since the initial vga_switcheroo implementation in 6a9ee8af.

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/gpu/drm/i915/i915_dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 2336af9..b23f465 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -338,7 +338,7 @@ static void i915_switcheroo_set_state(struct pci_dev *pdev, 
enum vga_switcheroo_
i915_resume_switcheroo(dev);
dev->switch_power_state = DRM_SWITCH_POWER_ON;
} else {
-   pr_err("switched off\n");
+   pr_info("switched off\n");
dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
i915_suspend_switcheroo(dev, pmm);
dev->switch_power_state = DRM_SWITCH_POWER_OFF;
-- 
2.6.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] staging: lustre: ptlrpc: add missing include directive

2015-08-15 Thread Ioan-Adrian Ratiu
On Fri, 14 Aug 2015 18:50:24 -0700
Greg KH  wrote:

> On Fri, Aug 14, 2015 at 12:57:06PM +0300, Ioan-Adrian Ratiu wrote:
> > Without including ptlrpc_internal.h, GCC gives prototype warnings
> > "pack_generic.c:642:5: warning: no previous prototype for ..."
> 
> It does?  What version of gcc give you that, I don't see that here.
> 

Yes, but it's a non-default warning (-Wmissing-prototypes)

$ gcc --version
gcc (Gentoo 4.9.3 p1.0, pie-0.6.2) 4.9.3

When testing the above patch I've used Ubuntu Vivid Vervet's gcc
(also 4.9 I think, I don't have access to that machine right now).

To see the warn I'm adding "ccflags-y := -Wmissing-prototypes"
to a Makefile then "make SUBDIRS=..."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] staging: lustre: ptlrpc: add missing include directive

2015-08-14 Thread Ioan-Adrian Ratiu
Without including ptlrpc_internal.h, GCC gives prototype warnings
"pack_generic.c:642:5: warning: no previous prototype for ..."
and sparse also complains "pack_generic.c:642:5: warning: symbol
'lustre_unpack_req_ptlrpc_body' was not declared. ..."

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/staging/lustre/lustre/ptlrpc/pack_generic.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c 
b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 2787bfd..84937ad 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -52,6 +52,8 @@
 #include "../include/obd_cksum.h"
 #include "../include/lustre/ll_fiemap.h"
 
+#include "ptlrpc_internal.h"
+
 static inline int lustre_msg_hdr_size_v2(int count)
 {
return cfs_size_round(offsetof(struct lustre_msg_v2,
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] staging: rtl8192e: rtllib: fix macro style issue

2015-07-23 Thread Ioan-Adrian Ratiu
Remove macro and use explicit case statements. Code is a little
longer but clearer. Checkpatch.pl does not complain anymore.

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/staging/rtl8192e/rtllib_rx.c | 80 +++-
 1 file changed, 52 insertions(+), 28 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtllib_rx.c 
b/drivers/staging/rtl8192e/rtllib_rx.c
index 54dfff6..09f0820 100644
--- a/drivers/staging/rtl8192e/rtllib_rx.c
+++ b/drivers/staging/rtl8192e/rtllib_rx.c
@@ -1744,37 +1744,61 @@ static int rtllib_parse_qos_info_param_IE(struct 
rtllib_device *ieee,
return rc;
 }
 
-#define MFIE_STRING(x) case MFIE_TYPE_ ##x: return #x
-
 static const char *get_info_element_string(u16 id)
 {
switch (id) {
-   MFIE_STRING(SSID);
-   MFIE_STRING(RATES);
-   MFIE_STRING(FH_SET);
-   MFIE_STRING(DS_SET);
-   MFIE_STRING(CF_SET);
-   MFIE_STRING(TIM);
-   MFIE_STRING(IBSS_SET);
-   MFIE_STRING(COUNTRY);
-   MFIE_STRING(HOP_PARAMS);
-   MFIE_STRING(HOP_TABLE);
-   MFIE_STRING(REQUEST);
-   MFIE_STRING(CHALLENGE);
-   MFIE_STRING(POWER_CONSTRAINT);
-   MFIE_STRING(POWER_CAPABILITY);
-   MFIE_STRING(TPC_REQUEST);
-   MFIE_STRING(TPC_REPORT);
-   MFIE_STRING(SUPP_CHANNELS);
-   MFIE_STRING(CSA);
-   MFIE_STRING(MEASURE_REQUEST);
-   MFIE_STRING(MEASURE_REPORT);
-   MFIE_STRING(QUIET);
-   MFIE_STRING(IBSS_DFS);
-   MFIE_STRING(RSN);
-   MFIE_STRING(RATES_EX);
-   MFIE_STRING(GENERIC);
-   MFIE_STRING(QOS_PARAMETER);
+   case MFIE_TYPE_SSID:
+   return "SSID";
+   case MFIE_TYPE_RATES:
+   return "RATES";
+   case MFIE_TYPE_FH_SET:
+   return "FH_SET";
+   case MFIE_TYPE_DS_SET:
+   return "DS_SET";
+   case MFIE_TYPE_CF_SET:
+   return "CF_SET";
+   case MFIE_TYPE_TIM:
+   return "TIM";
+   case MFIE_TYPE_IBSS_SET:
+   return "IBSS_SET";
+   case MFIE_TYPE_COUNTRY:
+   return "COUNTRY";
+   case MFIE_TYPE_HOP_PARAMS:
+   return "HOP_PARAMS";
+   case MFIE_TYPE_HOP_TABLE:
+   return "HOP_TABLE";
+   case MFIE_TYPE_REQUEST:
+   return "REQUEST";
+   case MFIE_TYPE_CHALLENGE:
+   return "CHALLENGE";
+   case MFIE_TYPE_POWER_CONSTRAINT:
+   return "POWER_CONSTRAINT";
+   case MFIE_TYPE_POWER_CAPABILITY:
+   return "POWER_CAPABILITY";
+   case MFIE_TYPE_TPC_REQUEST:
+   return "TPC_REQUEST";
+   case MFIE_TYPE_TPC_REPORT:
+   return "TPC_REPORT";
+   case MFIE_TYPE_SUPP_CHANNELS:
+   return "SUPP_CHANNELS";
+   case MFIE_TYPE_CSA:
+   return "CSA";
+   case MFIE_TYPE_MEASURE_REQUEST:
+   return "MEASURE_REQUEST";
+   case MFIE_TYPE_MEASURE_REPORT:
+   return "MEASURE_REPORT";
+   case MFIE_TYPE_QUIET:
+   return "QUIET";
+   case MFIE_TYPE_IBSS_DFS:
+   return "IBSS_DFS";
+   case MFIE_TYPE_RSN:
+   return "RSN";
+   case MFIE_TYPE_RATES_EX:
+   return "RATES_EX";
+   case MFIE_TYPE_GENERIC:
+   return "GENERIC";
+   case MFIE_TYPE_QOS_PARAMETER:
+   return "QOS_PARAMETER";
default:
return "UNKNOWN";
}
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] staging: rtl8192e: rtllib: fix macro style issue

2015-07-23 Thread Ioan-Adrian Ratiu
Enclose defined macro in paranthesis to avoid checkpatch complaint
"ERROR: Macros with complex values should be enclosed in parentheses"

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/staging/rtl8192e/rtllib_rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/rtl8192e/rtllib_rx.c 
b/drivers/staging/rtl8192e/rtllib_rx.c
index 54dfff6..5f1650c 100644
--- a/drivers/staging/rtl8192e/rtllib_rx.c
+++ b/drivers/staging/rtl8192e/rtllib_rx.c
@@ -1744,7 +1744,7 @@ static int rtllib_parse_qos_info_param_IE(struct 
rtllib_device *ieee,
return rc;
 }
 
-#define MFIE_STRING(x) case MFIE_TYPE_ ##x: return #x
+#define MFIE_STRING(x) (case MFIE_TYPE_ ##x: return #x)
 
 static const char *get_info_element_string(u16 id)
 {
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/