Hi, I would like to pick up this thread as I have been doing a lot of work with the ATH10K firmware, and we are still no clearer on a workable solution on the Linux 4.14.4 kernel.
We see the firmware crash when kernel memory starts to fragment; we use a dd copy operation to simulate the clean page cache becoming full. We can mitigate (but not stop) the crashes by forcing a cache clear (e.g. drop_caches) at regular intervals, or by forcing kswapd to run more often by setting watermark_scale_factor to its highest value (10%).

The ATH10K crash can be triggered with legacy or MSI interrupts, and occurs with or without SMP active. Furthermore, enabling or disabling the IOMMU does not change the fault symptoms; this effectively switches the DMA ops between SWIOTLB and the hardware IOMMU on the Arm 64-bit core.

There appear to be two types of crash. In the first, we seem to get an assertion in the QCA9984/QCA988X device, and an immediate crash notification via the NAPI callbacks. The second is more mysterious: the firmware just hangs after completing one of the many receive handshakes, and this is eventually picked up as a firmware crash.

To start, it is not clear what can make a remote device fail. Can anyone help explain what might be causing the QCA988X and QCA9984 to fail in the first place?

Best,
Adam

_______________________________________________
ath10k mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/ath10k
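
For anyone wanting to try the mitigation described above, here is a rough Python sketch. It is illustrative only and is not the tooling used in these tests. The helper names, the `proc_root` parameter and the 60-second interval are assumptions made for the example; `/proc/buddyinfo`, `/proc/sys/vm/drop_caches` and `/proc/sys/vm/watermark_scale_factor` are the standard kernel interfaces. Writing to the sysctl files requires root.

```python
import os
import time
from pathlib import Path


def parse_buddyinfo(text):
    """Return {(node, zone): [free block count for order 0, 1, 2, ...]}."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: "Node 0, zone Normal 12 7 3 ..."
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[(node, fields[3])] = [int(n) for n in fields[4:]]
    return zones


def free_blocks_at_or_above(counts, min_order):
    """Count free blocks of order >= min_order; a low number suggests fragmentation."""
    return sum(counts[min_order:])


def set_watermark_scale_factor(value=1000, proc_root="/proc"):
    """Units are 1/10000 of memory, so 1000 corresponds to 10%."""
    Path(proc_root, "sys/vm/watermark_scale_factor").write_text(f"{value}\n")


def drop_caches(mode=3, proc_root="/proc"):
    """mode 1 = page cache, 2 = slab objects, 3 = both."""
    os.sync()  # flush dirty pages first so more of the cache can be dropped
    Path(proc_root, "sys/vm/drop_caches").write_text(f"{mode}\n")


def mitigate_forever(interval_s=60, proc_root="/proc"):
    """Hypothetical loop: raise the watermark once, then drop caches periodically."""
    set_watermark_scale_factor(1000, proc_root)
    while True:
        drop_caches(3, proc_root)
        time.sleep(interval_s)
```

Calling `parse_buddyinfo(Path("/proc/buddyinfo").read_text())` before and after the dd run gives a simple way to watch the higher-order free blocks disappear as memory fragments.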
