On 26-4-2017 9:16, Johannes Berg wrote:
> On Fri, 2017-04-21 at 13:05 +0100, Arend van Spriel wrote:
>> Have proper request id filled in the SCHED_SCAN_RESULTS and
>> SCHED_SCAN_STOPPED notifications toward user-space by having the
>> driver provide it through the api.
>>
>> Reviewed-by: Hante Meuleman <[email protected]>
>> Reviewed-by: Pieter-Paul Giesberts <[email protected]
>> m>
>> Reviewed-by: Franky Lin <[email protected]>
>
> All your reviewers aren't paying attention ;-)
>
>> @@ -1722,6 +1723,7 @@ struct cfg80211_sched_scan_request {
>> struct cfg80211_bss_select_adjust rssi_adjust;
>>
>> /* internal */
>> + struct work_struct results_wk;
>> struct wiphy *wiphy;
>> struct net_device *dev;
>> unsigned long scan_start;
>
> Superficially, this is fine - but you need to think hard about the
> semantics of when you cancel this work.
>
>> +++ b/net/wireless/scan.c
>> @@ -369,15 +369,13 @@ void __cfg80211_sched_scan_results(struct
>> work_struct *wk)
>> struct cfg80211_registered_device *rdev;
>> struct cfg80211_sched_scan_request *request;
>>
>> - rdev = container_of(wk, struct cfg80211_registered_device,
>> - sched_scan_results_wk);
>> + request = container_of(wk, struct
>> cfg80211_sched_scan_request, results_wk);
>> + rdev = wiphy_to_rdev(request->wiphy);
>>
>> rtnl_lock();
>>
>> - request = cfg80211_find_sched_scan_req(rdev, 0);
>> -
>> /* we don't have sched_scan_req anymore if the scan is
>> stopping */
>
> That comment is kinda wrong now, afaict? Also you return
>
>> - if (!IS_ERR(request)) {
>> + if (request) {
>
> This seems wrong - you do return an ERR_PTR() from find. Not that
> there's all that much point in doing so vs. returning NULL, might be
> worth cleaning it up.
Indeed. Not sure if it was me being sloppy/confused or due to rebasing
the patches. Anyway, I will fix this.
>> -void cfg80211_sched_scan_results(struct wiphy *wiphy)
>> +void cfg80211_sched_scan_results(struct wiphy *wiphy, u64 reqid)
>> {
>> - struct cfg80211_registered_device *rdev =
>> wiphy_to_rdev(wiphy);
>> struct cfg80211_sched_scan_request *request;
>>
>> - trace_cfg80211_sched_scan_results(wiphy);
>> + trace_cfg80211_sched_scan_results(wiphy, reqid);
>> /* ignore if we're not scanning */
>>
>> rtnl_lock();
>> - request = cfg80211_find_sched_scan_req(rdev, 0);
>> + request = cfg80211_find_sched_scan_req(wiphy_to_rdev(wiphy),
>> reqid);
>> rtnl_unlock();
>>
>> if (!IS_ERR(request))
>> - queue_work(cfg80211_wq,
>> - &wiphy_to_rdev(wiphy)-
>>> sched_scan_results_wk);
>> + queue_work(cfg80211_wq, &request->results_wk);
>> + else
>> + wiphy_err(wiphy, "reqid %llu not found\n", reqid);
>> }
>
> This seems problematic - you use the request pointer outside the
> locking now. I'd move that rtnl_unlock() to afterwards. The message
> could also mention sched scan, that might be useful. Although I suspect
> that may happen due to races (e.g. netlink socket close vs. firmware
> stop) so maybe it's not all that useful.
When adding the worker in the request I was thinking about what might
happen between the queue_work and the work actually being scheduled.
> Most importantly though, you don't protect against queuing the work
> here, and then deleting the request. In the old case the comment that I
> pointed out above will save us, but in this new case it can't (the
> comment is now wrong), and that's a problem.
>
> I looked briefly at this and I suspect it will be very hard to fix that
> with a per-request work struct. It might be better to have a per-work
> status flag that you set here, then you schedule the global work, and
> that global work will send all the appropriate result messages after
> clearing the status bit again, similar to what I did with the netlink
> destroy now.
I guess it is in you repo as I did not see anything related on the
mailing list. So regarding this rework, do you want me to send the
series again or just this patch?
Regards,
Arend