Re: [PATCH v5 0/7] ath10k: firmware crash dump

Ben Greear Sat, 09 Aug 2014 09:53:25 -0700


On 08/08/2014 10:50 PM, Kalle Valo wrote:

Ben Greear <[email protected]> writes:

On 08/08/2014 01:28 PM, Kalle Valo wrote:

Hi,

here's my reworked Ben's patchset adding firmware crash dump support to ath10k.
Unfortunately this crashes when reading the stack dump from the firmware but
time run out for me to fix that and I wanted to send this for comments anyway.

I did quite a lot of changes, basically to simplify the code, remove ifdefs and
so on. Here's some sort of list what I did:

* dump_data->tv_sec and tv_nsec to 64 bits (because long can be 32 bits
   on some platforms)


I did u32 on purpose because I know how to do ntohl, htonl if I need
to flip the machine order in user-space.


On my 32-bit box I got a compiler warning for this and that's why I
changed it to u64.


Casting could have fixed that I imagine.

Is there something similar for 64-bit numbers?


At least I haven't heard anything. But it should be easy to implement on
your own.


Sure...it's not a big deal, likely there will be at most two different apps
that ever decode this anyway (mine and qca's).

* latest crash dump is always stored (instead of the oldest unread)


At least with the kernel, the first crash is normally the most useful,
so I figured the same would be true of the firmware (a firmware crash
is excellent way to find hidden bugs in the driver, so upon crash/reload,
it is more likely that the driver will be screwed up and thus mis-configure
firmware...)


Think of an AP with 3 month uptime where firmware crashes once in June
and second time in August and when you retrieve the crash dump you get
the one from June. I find that very unintuitive.


AP should have gathered the crash the first time, but again, not a big deal,
as I would gather the crash logs as soon as they happen anyway.

* what should ath10k_fw_error_dump_open() do if firmware hasn't
   crashed? check crashed_since_read and return zero len file? or an
   error code? -ENOMSG?


Maybe a mostly empty crash log with just the header, should be easy enough
to figure out it's not really a crash.  If we use udev to gather these,
it is likely to be stupid shell scripts doing a lot of the work, so
being clever on return codes might not be helpful.


But with cp in that case you also get an empty file, right? That way
it's easy to detect that there's no crash dump available.

What is the benefit from providing an empty header to user space? I
don't get that.


An empty file could be any sort of error or mis-configuration
...but an empty header means the infrastructure is working properly.

I think with all of this, it mostly a matter of opinion, and whatever
you choose should be fine with me.  If we find any serious issues, we
can of course fix it later.

I'll go dig through the patches in a bit..have to figure out why systemd
is fcked up on my lab machine first. :P


But wasn't systemd supposed to fix all of our problems? ;)


On Fedora 19, the latest update forgot how to enable/disable programs.  It must
have a zero-regression-test policy as well as it's other issues :P

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc  http://www.candelatech.com

_______________________________________________
ath10k mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/ath10k

Re: [PATCH v5 0/7] ath10k: firmware crash dump

Reply via email to