Daniel Mack <dan...@zonque.org> writes:

> On Friday, May 18, 2018 01:28 PM, Kalle Valo wrote:
>> Daniel Mack <dan...@zonque.org> writes:
>>
>>> On Wednesday, May 16, 2018 04:08 PM, Daniel Mack wrote:
>>>> Hence I believe that some sort of firmware internal buffer is overrun if
>>>> too many SMD requests fly in in a short amount of time. The firmware
>>>> does, however, still ack all packets just fine on the SMD channels, and
>>>> also the DXE communication flows are all healthy. No errors are reported
>>>> anywhere, but nothing is being put on the ether anymore.
>>>
>>> And FTR, there is a commit in the prima repository that caught my
>>> attention a while back:
>>>
>>>    https://source.codeaurora.org/external/wlan/prima/commit/?id=93cd8f3c
>>>
>>> What this does (through a remarkable number of indirection layers) is
>>> sending the DUMP_COMMAND_REQ command with args = (274, 0, 0, 0, 0)
>>> when management frames get stuck, which smells pretty much like the
>>> issue I'm seeing. Doing the same with the mainline driver and the
>>> debugfs interface it exposes doesn't have any effect though.
>>>
>>> But even if it did work, I wouldn't see a way to detect the situation
>>> in which this is needed reliably.
>>
>> The firmware version might make a difference so I recommend always
>> mentioning the firmware version as well. For example, what if your
>> firmware does not support that command or parameter?
>
> Sure, that could be the case. FTR - the firmware I'm using is the one
> that came out of the Qualcomm r1034.2.1 BSP. It is recognized by the
> driver as 'WCN v2.0 RadioPhy vRhea_GF_1.12 with 19.2MHz XO'.

Ok, thanks. Please add that to the bug report.

>> Also I would recommend filing a bug at bugzilla.kernel.org so that all
>> the information is in one place and it can be easily updated. Now it's
>> pretty difficult to get the big picture from various emails on the list.
>
> Yes, I agree it's a bit convoluted. However, there's already the bug
> report on 96boards.org that Bjorn opened some time back, and I
> considered that sufficient. IMO, it has all the information needed,
> plus a link to a tool to reproduce the issue.
>
>   https://bugs.96boards.org/show_bug.cgi?id=538

Yeah, bugs.96boards.org is fine. As long as there's one place which
collects all the information about the bug.

But IMHO the bug report does not tell much. All I get is that TX frames
get stuck, and not even that is confirmed. After reading it I have at
least these questions:

* Is it really confirmed that the issue is that TX frames are stuck? For
  example, using a wireless sniffer would confirm that.

* Are only management frames stuck or does it also involve data frames?

* Based on the bug report the TX stuck issue seems to happen during
  authentication, but what happens before that? Does wcn36xx get
  disconnected from AP or what?

* Any wcn36xx logs about the issue (with or without debug logs)? Also
  matching wpa_supplicant logs would help.

* Does this only happen with encryption or also in open mode?

* How long does it take with qconnman-stress to reproduce the issue?

* Does the radio environment make any difference on reproducibility? For
  example, clear environment vs lots of traffic/interference?

-- 
Kalle Valo
