Re: [PATCH net-next 3/5] iwlegacy: avoid -Wempty-body warning

2021-03-22 Thread Stanislaw Gruszka
On Mon, Mar 22, 2021 at 11:43:33AM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> There are a couple of warnings in this driver when building with W=1:
> 
> drivers/net/wireless/intel/iwlegacy/common.c: In function 'il_power_set_mode':
> drivers/net/wireless/intel/iwlegacy/common.c:1195:60: error: suggest braces 
> around empty body in an 'if' statement [-Werror=empty-body]
>  1195 | il->chain_noise_data.state);
>   |^
> drivers/net/wireless/intel/iwlegacy/common.c: In function 'il_do_scan_abort':
> drivers/net/wireless/intel/iwlegacy/common.c:1343:57: error: suggest braces 
> around empty body in an 'else' statement [-Werror=empty-body]
> 
> Change the empty debug macros to no_printk(), which avoids the
> warnings and adds useful format string checks.
> 
> Signed-off-by: Arnd Bergmann 
Acked-by: Stanislaw Gruszka 


Re: [PATCH] iwlegacy: Add missing check in il4965_commit_rxon

2021-02-28 Thread Stanislaw Gruszka
On Sun, Feb 28, 2021 at 08:25:22PM +0800, Dinghao Liu wrote:
> There is one il_set_tx_power() call in this function without
> return value check. Print error message and return error code
> on failure just like the other il_set_tx_power() call.

We have few calls for il_set_tx_power(), on some cases we
check return on some not. That correct as setting tx power
can be deferred internally if not possible at the moment.

> Signed-off-by: Dinghao Liu 
> ---
>  drivers/net/wireless/intel/iwlegacy/4965.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/intel/iwlegacy/4965.c 
> b/drivers/net/wireless/intel/iwlegacy/4965.c
> index 9fa556486511..3235b8be1894 100644
> --- a/drivers/net/wireless/intel/iwlegacy/4965.c
> +++ b/drivers/net/wireless/intel/iwlegacy/4965.c
> @@ -1361,7 +1361,11 @@ il4965_commit_rxon(struct il_priv *il)
>* We do not commit tx power settings while channel changing,
>* do it now if tx power changed.
>*/
> - il_set_tx_power(il, il->tx_power_next, false);
> + ret = il_set_tx_power(il, il->tx_power_next, false);
> + if (ret) {
> + IL_ERR("Error sending TX power (%d)\n", ret);
> + return ret;
> + 

This is not good change. We do not check return value of
il_commit_rxon(), except when creating interface. So this change might
broke creating interface, what worked otherwise when il_set_tx_power()
returned error.

Stanislaw


Re: [PATCH] iwlegacy: 4965-mac: Simplify the calculation of variables

2021-02-18 Thread Stanislaw Gruszka
On Thu, Feb 18, 2021 at 03:20:14PM +0800, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./drivers/net/wireless/intel/iwlegacy/4965-mac.c:2596:54-56: WARNING !A
> || A && B is equivalent to !A || B.
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 
> ---
>  drivers/net/wireless/intel/iwlegacy/4965-mac.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/intel/iwlegacy/4965-mac.c 
> b/drivers/net/wireless/intel/iwlegacy/4965-mac.c
> index 28675a4..52db532 100644
> --- a/drivers/net/wireless/intel/iwlegacy/4965-mac.c
> +++ b/drivers/net/wireless/intel/iwlegacy/4965-mac.c
> @@ -2593,8 +2593,7 @@ struct il_mod_params il4965_mod_params = {
>*/
>   if (ret != IL_INVALID_STATION &&
>   (!(il->stations[ret].used & IL_STA_UCODE_ACTIVE) ||
> -  ((il->stations[ret].used & IL_STA_UCODE_ACTIVE) &&
> -   (il->stations[ret].used & IL_STA_UCODE_INPROGRESS {
> +   (il->stations[ret].used & IL_STA_UCODE_INPROGRESS))) {
>   IL_ERR("Requested station info for sta %d before ready.\n",
>  ret);
>   ret = IL_INVALID_STATION;

This patch was already applied to wireless-drivers-next.

Stanislaw


Re: [PATCH] iwlegacy: 4965-mac: Simplify the calculation of variables

2021-02-04 Thread Stanislaw Gruszka
On Thu, Feb 04, 2021 at 04:00:08PM +0800, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./drivers/net/wireless/intel/iwlegacy/4965-mac.c:2596:54-56: WARNING !A
> || A && B is equivalent to !A || B.
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 

Acked-by: Stanislaw Gruszka 


Re: "KMSAN: uninit-value in rt2500usb_bbp_read" and "KMSAN: uninit-value in rt2500usb_probe_hw" should be duplicate crash reports

2021-01-21 Thread Stanislaw Gruszka
On Thu, Jan 21, 2021 at 04:47:37PM +0800, 慕冬亮 wrote:
> ## Patch
> 
> I propose to memset reg variable before invoking
> rt2x00usb_vendor_req_buff_lock/rt2x00usb_vendor_request_buff.
> 
> 
> diff --git a/drivers/net/wireless/ralink/rt2x00/rt2500usb.c
> b/drivers/net/wireless/ralink/rt2x00/rt2500usb.c
> index fce05fc88aaf..f6c93a25b18c 100644
> --- a/drivers/net/wireless/ralink/rt2x00/rt2500usb.c
> +++ b/drivers/net/wireless/ralink/rt2x00/rt2500usb.c
> @@ -48,6 +48,7 @@ static u16 rt2500usb_register_read(struct rt2x00_dev
> *rt2x00dev,
>const unsigned int offset)
>  {
> __le16 reg;
> +   memset(, 0, sizeof(reg));

Simpler would be just to initialize like this: __le16 reg = 0; 

Stanislaw


Re: [PATCH] mm/filemap: add static for function __add_to_page_cache_locked

2020-12-09 Thread Stanislaw Gruszka
On Wed, Dec 09, 2020 at 03:08:26PM +, Matthew Wilcox wrote:
> On Wed, Dec 09, 2020 at 03:46:28PM +0100, Stanislaw Gruszka wrote:
> > At this point of release cycle we should probably go with revert,
> > but I think the main problem is that BPF and ERROR_INJECTION use
> > function that is not intended to be used externally. For external users
> > add_to_page_cache_lru() and add_to_page_cache_locked() are exported
> > and I think those should be used (see the patch below).
> 
> FWIW, I intend to do some consolidation/renaming in this area.  I
> trust that will not be a problem?

If it does not break anything, it will be not a problem ;-)

It's possible that __add_to_page_cache_locked() can be a global symbol
with add_to_page_cache_lru() + add_to_page_cache_locked() being just
static/inline wrappers around it.

Stanislaw


Re: [PATCH] mm/filemap: add static for function __add_to_page_cache_locked

2020-12-09 Thread Stanislaw Gruszka
On Mon, Dec 07, 2020 at 05:12:52PM -0800, Alexei Starovoitov wrote:
> > > > > > > >>> -noinline int __add_to_page_cache_locked(struct page *page,
> > > > > > > >>> +static noinline int __add_to_page_cache_locked(struct page 
> > > > > > > >>> *page,
> > > > > > > >>> struct address_space 
> > > > > > > >>> *mapping,
> > > > > > > >>> pgoff_t offset, gfp_t 
> > > > > > > >>> gfp,
> > > > > > > >>> void **shadowp)
> > > > > > > >
> > > > > > > > It's unclear to me whether BTF_ID() requires that the target 
> > > > > > > > symbol be
> > > > > > > > non-static.  It doesn't actually reference the symbol:
> > > > > > > >
> > > > > > > > #define BTF_ID(prefix, name) \
> > > > > > > > __BTF_ID(__ID(__BTF_ID__##prefix##__##name##__))

[snip]

> > > __add_to_page_cache_locked") made the function static which breaks the
> > > build in btfids phase - but it seems to happen only on some
> > > architectures. In our case, ppc64, ppc64le and riscv64 are broken,
> > > x86_64, i586 and s390x succeed. (I made a mistake above, aarch64 did not
> > > fail - but only because it was not built at all.)
> > >
> > > The thread starts with
> > > http://lkml.kernel.org/r/1604661895-5495-1-git-send-email-alex@linux.alibaba.com

I have 5.10-rc7 build failure because of this on x86_64:

  BTFIDS  vmlinux
FAILED unresolved symbol __add_to_page_cache_locked
make: *** [Makefile:1170: vmlinux] Error 255

> > Got it. So the above commit is wrong.
> > The addition of "static" is incorrect here.
> > Regardless of btf_id generation.
> > "static noinline" means that the error injection in that spot is unreliable.
> > Even when bpf is completely compiled out of the kernel.
> 
> I finally realized that the addition of 'static' was pushed into Linus's tree 
> :(
> Please revert commit 3351b16af494 ("mm/filemap: add static for
> function __add_to_page_cache_locked")

At this point of release cycle we should probably go with revert,
but I think the main problem is that BPF and ERROR_INJECTION use
function that is not intended to be used externally. For external users
add_to_page_cache_lru() and add_to_page_cache_locked() are exported
and I think those should be used (see the patch below).

Stanislaw

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1388bf733071..dd6357802504 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11487,7 +11487,8 @@ BTF_SET_START(btf_non_sleepable_error_inject)
 /* Three functions below can be called from sleepable and non-sleepable 
context.
  * Assume non-sleepable from bpf safety point of view.
  */
-BTF_ID(func, __add_to_page_cache_locked)
+BTF_ID(func, add_to_page_cache_locked)
+BTF_ID(func, add_to_page_cache_lru)
 BTF_ID(func, should_fail_alloc_page)
 BTF_ID(func, should_failslab)
 BTF_SET_END(btf_non_sleepable_error_inject)
diff --git a/mm/filemap.c b/mm/filemap.c
index 331f4261d723..168deec64a10 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -827,10 +827,10 @@ int replace_page_cache_page(struct page *old, struct page 
*new, gfp_t gfp_mask)
 }
 EXPORT_SYMBOL_GPL(replace_page_cache_page);
 
-static noinline int __add_to_page_cache_locked(struct page *page,
-   struct address_space *mapping,
-   pgoff_t offset, gfp_t gfp,
-   void **shadowp)
+static int __add_to_page_cache_locked(struct page *page,
+ struct address_space *mapping,
+ pgoff_t offset, gfp_t gfp,
+ void **shadowp)
 {
XA_STATE(xas, >i_pages, offset);
int huge = PageHuge(page);
@@ -907,7 +907,6 @@ static noinline int __add_to_page_cache_locked(struct page 
*page,
put_page(page);
return error;
 }
-ALLOW_ERROR_INJECTION(__add_to_page_cache_locked, ERRNO);
 
 /**
  * add_to_page_cache_locked - add a locked page to the pagecache
@@ -928,6 +927,7 @@ int add_to_page_cache_locked(struct page *page, struct 
address_space *mapping,
  gfp_mask, NULL);
 }
 EXPORT_SYMBOL(add_to_page_cache_locked);
+ALLOW_ERROR_INJECTION(add_to_page_cache_locked, ERRNO);
 
 int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
pgoff_t offset, gfp_t gfp_mask)
@@ -957,6 +957,7 @@ int add_to_page_cache_lru(struct page *page, struct 
address_space *mapping,
return ret;
 }
 EXPORT_SYMBOL_GPL(add_to_page_cache_lru);
+ALLOW_ERROR_INJECTION(add_to_page_cache_lru, ERRNO);
 
 #ifdef CONFIG_NUMA
 struct page *__page_cache_alloc(gfp_t gfp)


Re: [PATCH v2] rt2x00: save survey for every channel visited

2020-11-04 Thread Stanislaw Gruszka
On Mon, Nov 02, 2020 at 06:19:32PM +0200, Kalle Valo wrote:
> Марков Михаил Александрович  writes:
> 
> > rt2800 only gives you survey for current channel.
> >
> > Survey-based ACS algorithms are failing to perform their job when working
> > with rt2800.
> >
> > Make rt2800 save survey for every channel visited and be able to give away
> > that information.
> >
> > There is a bug registered https://dev.archive.openwrt.org/ticket/19081 and
> > this patch solves the issue.
> >
> > Signed-off-by: Markov Mikhail 
> 
> Content-Type: text/plain; charset="koi8-r"
> 
> Just so you know I have no idea how patchwork or my scripts handle
> koi8-r. I recommend using utf-8 when submitting patches, but I can
> always try and see what breaks.

I've downloaded the patch from patchwork (using mbox button) and
it applied cleanly for me by 'git am'. So I think there should be
no problems, but if needed I can repost patch using utf-8.

Thanks
Stanislaw


Re: [PATCH v2] rt2x00: save survey for every channel visited

2020-10-21 Thread Stanislaw Gruszka
On Tue, Oct 20, 2020 at 07:43:12PM +, Марков Михаил Александрович wrote:
> rt2800 only gives you survey for current channel.
> 
> Survey-based ACS algorithms are failing to perform their job when working
> with rt2800.
> 
> Make rt2800 save survey for every channel visited and be able to give away
> that information.
> 
> There is a bug registered https://dev.archive.openwrt.org/ticket/19081 and
> this patch solves the issue.
> 
> Signed-off-by: Markov Mikhail 
>
> ---
> Changes are now aggregated in rt2800lib.c.

Acked-by: Stanislaw Gruszka 

Thanks
Stanislaw


Re: [PATCH] rt2x00: save survey for every channel visited

2020-10-20 Thread Stanislaw Gruszka
On Mon, Oct 19, 2020 at 07:06:47PM +, Марков Михаил Александрович wrote:
> rt2800 only gives you survey for current channel.

>  .watchdog= rt2800_watchdog,
> +.update_survey= rt2800_update_survey,

Since this feature is rt2800 specific, I would do not add new generic
callback. It could be fully done in rt2800* code, i.e ...

> diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c 
> b/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> index 8c6d3099b19d..8eff57132154 100644
> --- a/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> @@ -1026,6 +1026,12 @@ static int rt2x00lib_probe_hw_modes(struct rt2x00_dev 
> *rt2x00dev,
>  if (!rates)
>  goto exit_free_channels;
> 
> +rt2x00dev->chan_survey =
> +kcalloc(spec->num_channels, sizeof(struct rt2x00_chan_survey),
> +GFP_KERNEL);
> +if (!rt2x00dev->chan_survey)
> +goto exit_free_rates;
... this could be placed in rt2800_probe_hw_mode() just after
channel info array allocation ... 

> @@ -316,6 +316,15 @@ int rt2x00mac_config(struct ieee80211_hw *hw, u32 
> changed)
>  if (!test_bit(DEVICE_STATE_PRESENT, >flags))
>  return 0;
> 
> +/*
> + * To provide correct survey data for survey-based ACS algorithm
> + * we have to save survey data for current channel before switching.
> + */
> +if (rt2x00dev->ops->lib->update_survey &&
> +(changed & IEEE80211_CONF_CHANGE_CHANNEL)) {
... and this in rt2800_config()

Thanks
Stanislaw


Re: [PATCH] rtw88: pci: Use general byte arrays as the elements of RX ring

2019-07-30 Thread Stanislaw Gruszka
On Thu, Jul 25, 2019 at 04:09:26PM +0800, Jian-Hong Pan wrote:
> Each skb as the element in RX ring was expected with sized buffer 8216
> (RTK_PCI_RX_BUF_SIZE) bytes. However, the skb buffer's true size is
> 16640 bytes for alignment after allocated, x86_64 for example. And, the

rtw88 advertise IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_11454, so maximum AMSDU
packet can be approximately 12kB. This might be accidental, but having
16kB skb's allow to handle such big AMSDUs. If you shrink buf size,
you can probably override memory after buffer end.

> difference will be enlarged 512 times (RTK_MAX_RX_DESC_NUM).
> To prevent that much wasted memory, this patch follows David's
> suggestion [1] and uses general buffer arrays, instead of skbs as the
> elements in RX ring.
> 
> [1] https://www.spinics.net/lists/linux-wireless/msg187870.html
> 
> Signed-off-by: Jian-Hong Pan 
> Cc: 

This does not fix any serious problem, it actually most likely 
introduce memory corruption problem described above. Should not
be targeted to stable anyway.

> - dev_kfree_skb_any(skb);
> + devm_kfree(rtwdev->dev, buf);

For what this is needed? devm_ allocations are used exactly to avoid
manual freeing.

> + len = pkt_stat.pkt_len + pkt_offset;
> + skb = dev_alloc_skb(len);
> + if (WARN_ONCE(!skb, "rx routine starvation\n"))
>   goto next_rp;
>  
>   /* put the DMA data including rx_desc from phy to new skb */
> - skb_put_data(new, skb->data, new_len);
> + skb_put_data(skb, rx_desc, len);

Coping big packets it quite inefficient. What drivers usually do is 
copy only for small packets and for big ones allocate new rx buf
(drop packet alloc if fail) and pas old buf to network stack via
skb_add_rx_frag(). See iwlmvm as example.

Stanislaw


Re: [PATCH net-next] rt2800usb: Add new rt2800usb device PLANEX GW-USMicroN

2019-07-29 Thread Stanislaw Gruszka
On Sun, Jul 28, 2019 at 11:07:42PM +0900, Masanari Iida wrote:
> This patch add a device ID for PLANEX GW-USMicroN.
> Without this patch, I had to echo the device IDs in order to
> recognize the device.
> 
> # lsusb |grep PLANEX
> Bus 002 Device 005: ID 2019:ed14 PLANEX GW-USMicroN
> 
> Signed-off-by: Masanari Iida 

Acked-by: Stanislaw Gruszka 


Re: [PATCH] sched/cputime: make scale_stime() more precise

2019-07-23 Thread Stanislaw Gruszka
On Mon, Jul 22, 2019 at 10:00:34PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 22, 2019 at 12:52:41PM +0200, Stanislaw Gruszka wrote:
> > On Fri, Jul 19, 2019 at 01:03:49PM +0200, Peter Zijlstra wrote:
> > > > shows the problem even when sum_exec_runtime is not that big: 30 
> > > > secs.
> > > > 
> > > > The new implementation of scale_stime() does the additional 
> > > > div64_u64_rem()
> > > > in a loop but see the comment, as long it is used by cputime_adjust() 
> > > > this
> > > > can happen only once.
> > > 
> > > That only shows something after long long staring :/ There's no words on
> > > what the output actually means or what would've been expected.
> > > 
> > > Also, your example is incomplete; the below is a test for scale_stime();
> > > from this we can see that the division results in too large a number,
> > > but, important for our use-case in cputime_adjust(), it is a step
> > > function (due to loss in precision) and for every plateau we shift
> > > runtime into the wrong bucket.
> > > 
> > > Your proposed function works; but is atrocious, esp. on 32bit. That
> > > said, before we 'fixed' it, it had similar horrible divisions in, see
> > > commit 55eaa7c1f511 ("sched: Avoid cputime scaling overflow").
> > > 
> > > Included below is also an x86_64 implementation in 2 instructions.
> > > 
> > > I'm still trying see if there's anything saner we can do...
> > 
> > I was always proponent of removing scaling and export raw values
> > and sum_exec_runtime. But that has obvious drawback, reintroduce
> > 'top hiding' issue.
> 
> I think (but didn't grep) that we actually export sum_exec_runtime in
> /proc/ *somewhere*.

Yes, it is, in /proc/[PID]/schedstat for CONFIG_SCHEDSTAT=y and in
/proc/[PID]/sched for CONFIG_SCHED_DEBUG=y

> > But maybe we can export raw values in separate file i.e.
> > /proc/[pid]/raw_cpu_times ? So applications that require more precise
> > cputime values for very long-living processes can use this file.
> 
> There are no raw cpu_times, there are system and user samples, and
> samples are, by their very nature, an approximation. We just happen to
> track the samples in TICK_NSEC resolution these days, but they're still
> ticks (except on s390 and maybe other archs, which do time accounting in
> the syscall path).
> 
> But I think you'll find x86 people are quite opposed to doing TSC reads
> in syscall entry and exit :-)

By 'raw' I meant values that are stored in task_struct without
processing by cputime_adjust() -> scale_stime(). Idea is that
user space can make scaling using floating point, so do not have
to drop precision if numbers are big. 

That was discussed on my RFC and PATCH series posts:
https://lore.kernel.org/lkml/1364489605-5443-1-git-send-email-sgrus...@redhat.com/
https://lore.kernel.org/lkml/1365066635-2959-1-git-send-email-sgrus...@redhat.com/

There was objection that even if kernel does not tell what utime/stime
values mean exactly, users already got used to scaled behaviour and
changing this is 'semantic' ABI breakage. And obviously this would make
'top hiding' worm working again for unpatched top.

So perhaps we can add new exports of not-scaled utime/stime and achieve
the same goal without changing the meaning of existing fields. From
other hand, maybe nowadays better tools exist to provide exact cputimes
to user space i.e. 'perf stat' or bpf, so proposed addition is unneeded.

Stanislaw


Re: [PATCH] sched/cputime: make scale_stime() more precise

2019-07-22 Thread Stanislaw Gruszka
On Fri, Jul 19, 2019 at 01:03:49PM +0200, Peter Zijlstra wrote:
> > shows the problem even when sum_exec_runtime is not that big: 30 secs.
> > 
> > The new implementation of scale_stime() does the additional div64_u64_rem()
> > in a loop but see the comment, as long it is used by cputime_adjust() this
> > can happen only once.
> 
> That only shows something after long long staring :/ There's no words on
> what the output actually means or what would've been expected.
> 
> Also, your example is incomplete; the below is a test for scale_stime();
> from this we can see that the division results in too large a number,
> but, important for our use-case in cputime_adjust(), it is a step
> function (due to loss in precision) and for every plateau we shift
> runtime into the wrong bucket.
> 
> Your proposed function works; but is atrocious, esp. on 32bit. That
> said, before we 'fixed' it, it had similar horrible divisions in, see
> commit 55eaa7c1f511 ("sched: Avoid cputime scaling overflow").
> 
> Included below is also an x86_64 implementation in 2 instructions.
> 
> I'm still trying see if there's anything saner we can do...

I was always proponent of removing scaling and export raw values
and sum_exec_runtime. But that has obvious drawback, reintroduce
'top hiding' issue.

But maybe we can export raw values in separate file i.e.
/proc/[pid]/raw_cpu_times ? So applications that require more precise
cputime values for very long-living processes can use this file.

Stanislaw


Re: [PATCH] mac80211: don't warn about CW params when not using them

2019-07-18 Thread Stanislaw Gruszka
On Wed, Jul 17, 2019 at 06:57:12PM -0700, Brian Norris wrote:
> ieee80211_set_wmm_default() normally sets up the initial CW min/max for
> each queue, except that it skips doing this if the driver doesn't
> support ->conf_tx. We still end up calling drv_conf_tx() in some cases
> (e.g., ieee80211_reconfig()), which also still won't do anything
> useful...except it complains here about the invalid CW parameters.
> 
> Let's just skip the WARN if we weren't going to do anything useful with
> the parameters.
> 
> Signed-off-by: Brian Norris 
> ---
> Noticed because rtw88 does not currently implement .conf_tx()
> 
> I think there are several ways to slice this one. I picked one fix,
> which may not be the best one.

Fix looks fine for me. However I think rtw88 should implement
drv_conf_tx() because parameters can be different on different
network setups and maybe more important WMM/AC parameters become
quite recently part of ETSI regulatory.

Stanislaw 


Re: [PATCH v2 2/2] rt2x00usb: remove unnecessary rx flag checks

2019-07-01 Thread Stanislaw Gruszka
On Mon, Jul 01, 2019 at 12:53:14PM +0200, Soeren Moch wrote:
> In contrast to the TX path, there is no need to separately read the transfer
> status from the device after receiving RX data. Consequently, there is no
> real STATUS_PENDING RX processing queue entry state.
> Remove the unnecessary ENTRY_DATA_STATUS_PENDING flag checks from the RX path.
> Also remove the misleading comment about reading RX status from device.
> 
> Suggested-by: Stanislaw Gruszka 
> Signed-off-by: Soeren Moch 

Acked-by: Stanislaw Gruszka 


Re: [PATCH v2 1/2] rt2x00usb: fix rx queue hang

2019-07-01 Thread Stanislaw Gruszka
On Mon, Jul 01, 2019 at 12:53:13PM +0200, Soeren Moch wrote:
> Since commit ed194d136769 ("usb: core: remove local_irq_save() around
>  ->complete() handler") the handler rt2x00usb_interrupt_rxdone() is
> not running with interrupts disabled anymore. So this completion handler
> is not guaranteed to run completely before workqueue processing starts
> for the same queue entry.
> Be sure to set all other flags in the entry correctly before marking
> this entry ready for workqueue processing. This way we cannot miss error
> conditions that need to be signalled from the completion handler to the
> worker thread.
> Note that rt2x00usb_work_rxdone() processes all available entries, not
> only such for which queue_work() was called.
> 
> This patch is similar to what commit df71c9cfceea ("rt2x00: fix order
> of entry flags modification") did for TX processing.
> 
> This fixes a regression on a RT5370 based wifi stick in AP mode, which
> suddenly stopped data transmission after some period of heavy load. Also
> stopping the hanging hostapd resulted in the error message "ieee80211
> phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush".
> Other operation modes are probably affected as well, this just was
> the used testcase.
> 
> Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() 
> handler")
> Cc: sta...@vger.kernel.org # 4.20+
> Signed-off-by: Soeren Moch 

Acked-by: Stanislaw Gruszka 




Re: [PATCH] rt2x00: fix rx queue hang

2019-07-01 Thread Stanislaw Gruszka
On Mon, Jul 01, 2019 at 12:49:50PM +0200, Soeren Moch wrote:
> Hello!
> 
> On 29.06.19 10:50, Stanislaw Gruszka wrote:
> > Hello
> >
> > On Wed, Jun 26, 2019 at 03:28:00PM +0200, Soeren Moch wrote:
> >> Hi Stanislaw,
> >>
> >> the good news is: your patch below also solves the issue for me. But
> >> removing the ENTRY_DATA_STATUS_PENDING check in
> >> rt2x00usb_kick_rx_entry() alone does not help, while removing this check
> >> in rt2x00usb_work_rxdone() alone does the trick.
> >>
> >> So the real race seems to be that the flags set in the completion
> >> handler are not yet visible on the cpu core running the workqueue. And
> >> because the worker is not rescheduled when aborted, the entry can just
> >> wait forever.
> >> Do you think this could make sense?
> > Yes.
> >
> >>> I'm somewhat reluctant to change the order, because TX processing
> >>> might relay on it (we first mark we wait for TX status and
> >>> then mark entry is no longer owned by hardware).
> >> OK, maybe it's just good luck that changing the order solves the rx
> >> problem. Or can memory barriers associated with the spinlock in
> >> rt2x00lib_dmadone() be responsible for that?
> >> (I'm testing on a armv7 system, Cortex-A9 quadcore.)
> > I'm not sure, rt2x00queue_index_inc() also disable/enable interrupts,
> > so maybe that make race not reproducible. 
> I tested some more, the race is between setting ENTRY_DATA_IO_FAILED (if
> needed) and enabling workqueue processing. This enabling was done via
> ENTRY_DATA_STATUS_PENDING in my patch. So setting
> ENTRY_DATA_STATUS_PENDING behind the spinlock in
> rt2x00lib_dmadone()/rt2x00queue_index_inc() moved this very close to
> setting of ENTRY_DATA_IO_FAILED (if needed). While still in the wrong
> order, this made it very unlikely for the race to show up.
> >
> >> While looking at it, why we double-clear ENTRY_OWNER_DEVICE_DATA in
> >> rt2x00usb_interrupt_rxdone() directly and in rt2x00lib_dmadone() again,
> > rt2x00lib_dmadone() is called also on other palaces (error paths)
> > when we have to clear flags.
> Yes, but also clearing ENTRY_OWNER_DEVICE_DATA in
> rt2x00usb_interrupt_rxdone() directly is not necessary and can lead to
> the wrong processing order.
> >>  while not doing the same for tx? 
> > If I remember correctly we have some races on rx (not happened on tx)
> > that was solved by using test_and_clear_bit(ENTRY_OWNER_DEVICE_DATA).
> I searched in the history, it actually was the other way around. You
> changed test_and_clear_bit() to test_bit() in the TX path. I think this
> is also the right way to go in RX.
> >> Would it make more sense to possibly
> >> set ENTRY_DATA_IO_FAILED before clearing ENTRY_OWNER_DEVICE_DATA in
> >> rt2x00usb_interrupt_rxdone() as for tx?
> > I don't think so, ENTRY_DATA_IO_FAILED should be only set on error
> > case.
> 
> Yes of course. But if the error occurs, it should be signalled before
> enabling the workqueue processing, see the race described above.
> 
> After some more testing I'm convinced that this would be the right fix
> for this problem. I will send a v2 of this patch accordingly.

Great, now I understand the problem. Thank you very much!

Stanislaw


Re: [PATCH] rt2x00: fix rx queue hang

2019-06-29 Thread Stanislaw Gruszka
Hello

On Wed, Jun 26, 2019 at 03:28:00PM +0200, Soeren Moch wrote:
> Hi Stanislaw,
> 
> the good news is: your patch below also solves the issue for me. But
> removing the ENTRY_DATA_STATUS_PENDING check in
> rt2x00usb_kick_rx_entry() alone does not help, while removing this check
> in rt2x00usb_work_rxdone() alone does the trick.
> 
> So the real race seems to be that the flags set in the completion
> handler are not yet visible on the cpu core running the workqueue. And
> because the worker is not rescheduled when aborted, the entry can just
> wait forever.
> Do you think this could make sense?

Yes.

> > I'm somewhat reluctant to change the order, because TX processing
> > might relay on it (we first mark we wait for TX status and
> > then mark entry is no longer owned by hardware).
> OK, maybe it's just good luck that changing the order solves the rx
> problem. Or can memory barriers associated with the spinlock in
> rt2x00lib_dmadone() be responsible for that?
> (I'm testing on a armv7 system, Cortex-A9 quadcore.)

I'm not sure, rt2x00queue_index_inc() also disable/enable interrupts,
so maybe that make race not reproducible. 

> While looking at it, why we double-clear ENTRY_OWNER_DEVICE_DATA in
> rt2x00usb_interrupt_rxdone() directly and in rt2x00lib_dmadone() again,

rt2x00lib_dmadone() is called also on other palaces (error paths)
when we have to clear flags.

> while not doing the same for tx? 

If I remember correctly we have some races on rx (not happened on tx)
that was solved by using test_and_clear_bit(ENTRY_OWNER_DEVICE_DATA).

> Would it make more sense to possibly
> set ENTRY_DATA_IO_FAILED before clearing ENTRY_OWNER_DEVICE_DATA in
> rt2x00usb_interrupt_rxdone() as for tx?

I don't think so, ENTRY_DATA_IO_FAILED should be only set on error
case.

> >  However on RX
> > side ENTRY_DATA_STATUS_PENDING bit make no sense as we do not
> > wait for status. We should remove ENTRY_DATA_STATUS_PENDING on
> > RX side and perhaps this also will solve issue you observe.
> I agree that removing the unnecessary checks is a good idea in any case.
> > Could you please check below patch, if it fixes the problem as well?
> At least I could not trigger the problem within transferring 10GB of
> data. But maybe the probability for triggering this bug is just lower
> because ENTRY_OWNER_DEVICE_DATA is cleared some time before
> ENTRY_DATA_STATUS_PENDING is set?

Not sure. Anyway, could you post patch removing not needed checks
with proper description/changelog ?

Stanislaw


Re: [PATCH] rt2x00: fix rx queue hang

2019-06-25 Thread Stanislaw Gruszka
Hello

On Fri, Jun 21, 2019 at 01:30:01PM +0200, Soeren Moch wrote:
> On 18.06.19 11:34, Stanislaw Gruszka wrote:
> > Hi
> >
> > On Mon, Jun 17, 2019 at 11:46:56AM +0200, Soeren Moch wrote:
> >> Since commit ed194d136769 ("usb: core: remove local_irq_save() around
> >>  ->complete() handler") the handlers rt2x00usb_interrupt_rxdone() and
> >> rt2x00usb_interrupt_txdone() are not running with interrupts disabled
> >> anymore. So these handlers are not guaranteed to run completely before
> >> workqueue processing starts. So only mark entries ready for workqueue
> >> processing after proper accounting in the dma done queue.
> > It was always the case on SMP machines that rt2x00usb_interrupt_{tx/rx}done
> > can run concurrently with rt2x00_work_{rx,tx}done, so I do not
> > understand how removing local_irq_save() around complete handler broke
> > things.
> I think because completion handlers can be interrupted now and scheduled
> away
> in the middle of processing.
> > Have you reverted commit ed194d136769 and the revert does solve the problem 
> > ?
> Yes, I already sent a patch for this, see [1]. But this was not considered
> an acceptablesolution. Especially RT folks do not like code running with
> interrupts disabled,particularly when trying to acquire spinlocks then.
> 
> [1] https://lkml.org/lkml/2019/5/31/863
> > Between 4.19 and 4.20 we have some quite big changes in rt2x00 driver:
> >
> > 0240564430c0 rt2800: flush and txstatus rework for rt2800mmio
> > adf26a356f13 rt2x00: use different txstatus timeouts when flushing
> > 5022efb50f62 rt2x00: do not check for txstatus timeout every time on tasklet
> > 0b0d556e0ebb rt2800mmio: use txdone/txstatus routines from lib
> > 5c656c71b1bf rt2800: move usb specific txdone/txstatus routines to rt2800lib
> >
> > so I'm a bit afraid that one of those changes is real cause of
> > the issue not ed194d136769 .
> I tested 4.20 and 5.1 and see the exact same behavior. Reverting this
> usb core patchsolves the problem.
> 4.19.x (before this usb core patch) is running fine.
> >> Note that rt2x00usb_work_rxdone() processes all available entries, not
> >> only such for which queue_work() was called.
> >>
> >> This fixes a regression on a RT5370 based wifi stick in AP mode, which
> >> suddenly stopped data transmission after some period of heavy load. Also
> >> stopping the hanging hostapd resulted in the error message "ieee80211
> >> phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush".
> >> Other operation modes are probably affected as well, this just was
> >> the used testcase.
> > Do you know what actually make the traffic stop,
> > TX queue hung or RX queue hung?
> I think RX queue hang, as stated in the patch title. "Queue 14" means QID_RX
> (rt2x00queue.h, enum data_queue_qid).
> I also tried to re-add local_irq_save() in only one of the handlers. Adding
> this tort2x00usb_interrupt_rxdone() alone solved the issue, while doing so
> for tx alonedid not.
> 
> Note that this doesn't mean there is no problem for tx, that's maybe
> just more
> difficult to trigger.
> >> diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c 
> >> b/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> >> index 1b08b01db27b..9c102a501ee6 100644
> >> --- a/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> >> +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> >> @@ -263,9 +263,9 @@ EXPORT_SYMBOL_GPL(rt2x00lib_dmastart);
> >>
> >>  void rt2x00lib_dmadone(struct queue_entry *entry)
> >>  {
> >> -  set_bit(ENTRY_DATA_STATUS_PENDING, >flags);
> >>clear_bit(ENTRY_OWNER_DEVICE_DATA, >flags);
> >>rt2x00queue_index_inc(entry, Q_INDEX_DMA_DONE);
> >> +  set_bit(ENTRY_DATA_STATUS_PENDING, >flags);
> > Unfortunately I do not understand how this suppose to fix the problem,
> > could you elaborate more about this change?
> >
> Re-adding local_irq_save() around thisrt2x00lib_dmadone()solved
> the issue. So I also tried to reverse the order of these calls.
> It seems totally plausible to me, that the correct sequence is to
> first clear the device assignment, then to set the status to dma_done,
> then to trigger the workqueue processing for this entry. When the handler
> is scheduled away in the middle of this sequence, now there is no
> strange state where the entry can be processed by the workqueue while
> not declared dma_done for it.
> With this changed sequence there is no need anymore to disable interrupts
> for solving the hang issue.

Thanks very 

Re: [PATCH] rt2x00: fix rx queue hang

2019-06-18 Thread Stanislaw Gruszka
Hi

On Mon, Jun 17, 2019 at 11:46:56AM +0200, Soeren Moch wrote:
> Since commit ed194d136769 ("usb: core: remove local_irq_save() around
>  ->complete() handler") the handlers rt2x00usb_interrupt_rxdone() and
> rt2x00usb_interrupt_txdone() are not running with interrupts disabled
> anymore. So these handlers are not guaranteed to run completely before
> workqueue processing starts. So only mark entries ready for workqueue
> processing after proper accounting in the dma done queue.

It was always the case on SMP machines that rt2x00usb_interrupt_{tx/rx}done
can run concurrently with rt2x00_work_{rx,tx}done, so I do not
understand how removing local_irq_save() around complete handler broke
things.

Have you reverted commit ed194d136769 and the revert does solve the problem ?

Between 4.19 and 4.20 we have some quite big changes in rt2x00 driver:

0240564430c0 rt2800: flush and txstatus rework for rt2800mmio
adf26a356f13 rt2x00: use different txstatus timeouts when flushing
5022efb50f62 rt2x00: do not check for txstatus timeout every time on tasklet
0b0d556e0ebb rt2800mmio: use txdone/txstatus routines from lib
5c656c71b1bf rt2800: move usb specific txdone/txstatus routines to rt2800lib

so I'm a bit afraid that one of those changes is real cause of
the issue not ed194d136769 .

> Note that rt2x00usb_work_rxdone() processes all available entries, not
> only such for which queue_work() was called.
> 
> This fixes a regression on a RT5370 based wifi stick in AP mode, which
> suddenly stopped data transmission after some period of heavy load. Also
> stopping the hanging hostapd resulted in the error message "ieee80211
> phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush".
> Other operation modes are probably affected as well, this just was
> the used testcase.

Do you know what actually make the traffic stop,
TX queue hung or RX queue hung?

> diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c 
> b/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> index 1b08b01db27b..9c102a501ee6 100644
> --- a/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00dev.c
> @@ -263,9 +263,9 @@ EXPORT_SYMBOL_GPL(rt2x00lib_dmastart);
> 
>  void rt2x00lib_dmadone(struct queue_entry *entry)
>  {
> - set_bit(ENTRY_DATA_STATUS_PENDING, >flags);
>   clear_bit(ENTRY_OWNER_DEVICE_DATA, >flags);
>   rt2x00queue_index_inc(entry, Q_INDEX_DMA_DONE);
> + set_bit(ENTRY_DATA_STATUS_PENDING, >flags);

Unfortunately I do not understand how this suppose to fix the problem,
could you elaborate more about this change?

Stanislaw


Re: [PATCH] mt76: fix endianness sparse warnings

2019-04-26 Thread Stanislaw Gruszka
On Thu, Apr 25, 2019 at 10:42:01PM +0800, Ryder Lee wrote:
> Fix many warnings with incorrect endian assumptions.
> 
> Reported-by: kbuild test robot 
> Signed-off-by: Ryder Lee 
> ---
>  drivers/net/wireless/mediatek/mt76/mt7603/mac.c |  2 +-
>  drivers/net/wireless/mediatek/mt76/mt7615/mac.c | 12 ++--
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/mac.c 
> b/drivers/net/wireless/mediatek/mt76/mt7603/mac.c
> index 2f2961ee0a92..af5769d05e6f 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt7603/mac.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt7603/mac.c
> @@ -627,7 +627,7 @@ mt7603_mac_fill_rx(struct mt7603_dev *dev, struct sk_buff 
> *skb)
>   status->aggr = unicast &&
>  !ieee80211_is_qos_nullfunc(hdr->frame_control);
>   status->tid = *ieee80211_get_qos_ctl(hdr) & IEEE80211_QOS_CTL_TID_MASK;
> - status->seqno = IEEE80211_SEQ_TO_SN(hdr->seq_ctrl);
> + status->seqno = IEEE80211_SEQ_TO_SN(__le16_to_cpu(hdr->seq_ctrl));
>  
>   return 0;
>  }
> diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c 
> b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
> index 1bf3e7b5f6a7..4b934b0f5a39 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
> @@ -235,7 +235,7 @@ int mt7615_mac_fill_rx(struct mt7615_dev *dev, struct 
> sk_buff *skb)
>   status->aggr = unicast &&
>  !ieee80211_is_qos_nullfunc(hdr->frame_control);
>   status->tid = *ieee80211_get_qos_ctl(hdr) & IEEE80211_QOS_CTL_TID_MASK;
> - status->seqno = IEEE80211_SEQ_TO_SN(hdr->seq_ctrl);
> + status->seqno = IEEE80211_SEQ_TO_SN(__le16_to_cpu(hdr->seq_ctrl));

Is there any reason to use underscored version instead of standard 
le16_to_cpu() ?

Stanislaw


Re: MT76x2U crashes XHCI driver on AMD Ryzen system

2019-03-03 Thread Stanislaw Gruszka
On Thu, Feb 28, 2019 at 02:40:29PM +0100, Joerg Roedel wrote:
> On Thu, Feb 28, 2019 at 01:19:48PM +0100, Stanislaw Gruszka wrote:
> > Nevermind, the patch is wrong, s->dma_address is initalized in 
> > sg_num_pages().
> 
> Yes, it is. In sg_num_pages() the offset into the IOMMU mapping is
> stored in s->dma_address, taking also the segment boundary mask into
> account. map_sg() later only adds the base-address to that.

I have some more info about the issues in
https://bugzilla.kernel.org/show_bug.cgi?id=202673 

We have some bugs in mt76. Apparently we should not use
page_frag_alloc() with size bigger than PAGE_SIZE as page_frag_alloc()
can fallback to single page allocation. And also we should not make
sizes unaligned as pointed in commit:
3bed3cc4156e ("net: Do not allocate page fragments that are not skb aligned"

However after fixing that mt76usb still did not work. To make things
work we had to change rx frag size from 2048 to PAGE_SIZE and change
virt_to_head_page() to virt_to_page() when setting SG's.

I think I understand why first change was needed. If we do 2 separate
dma maps of 2 different buffers in single page i.e (PAGE + off=0
and PAGE + off=2048) it causes problem. So either map_sg() return
error which mt76usb does not handle correctly or there is issue
in AMD IOMMU because two dma maps use the same page.

But I don't understand why the second change was needed. Without
it we have issue with incorrect page->_refcount . It is somehow
related with AMD IOMMU, because on different platforms we do not
have such problems.

Joerg, could you look at this ? Thanks. 

Stanislaw


Re: MT76x2U crashes XHCI driver on AMD Ryzen system

2019-02-28 Thread Stanislaw Gruszka
On Thu, Feb 28, 2019 at 11:42:24AM +0100, Stanislaw Gruszka wrote:
> On Thu, Feb 28, 2019 at 10:04:12AM +0100, Stanislaw Gruszka wrote:
> > On Tue, Feb 26, 2019 at 12:24:08PM +0100, Stanislaw Gruszka wrote:
> > > On Tue, Feb 26, 2019 at 11:44:13AM +0100, Joerg Roedel wrote:
> > > > On Tue, Feb 26, 2019 at 11:34:51AM +0100, Stanislaw Gruszka wrote:
> > > > > On Tue, Feb 26, 2019 at 11:05:36AM +0100, Joerg Roedel wrote:
> > > > > If sg->offset > PAGE_SIZE is fine then most likely we have problem 
> > > > > with
> > > > > alignment.
> > > > 
> > > > The map_sg implementation in the AMD IOMMU driver uses sg_phys() which
> > > > handles the sg->page + sg->offset calculation fine.
> > > > 
> > > > > Note hat issue is with dma_map_sg(), switching to dma_map_single()
> > > > > by using urb->transfer_buffer instead of urb->sg make things work
> > > > > on AMD IOMMU.
> > > > 
> > > > On the other hand this points to a bug in the driver, I'll look further
> > > > if I can spot something there.
> > > 
> > > I think so too. And I have done some changes that avoid strange allocation
> > > scheme and use usb synchronous messages instead of allocating buffers
> > > with unaligned sizes. However things work ok on Intel IOMMU and 
> > > there is no documentation what are dma_map_sg() requirement versus
> > > dma_map_single() which works. I think there are some unwritten
> > > requirements and things can work on some platforms and fails on others
> > > (different IOMMUs, no-IOMMU on some ARCHes)  
> > 
> > For the record: we have another bug report with this issue:
> > https://bugzilla.kernel.org/show_bug.cgi?id=202673
> > 
> > I provided there patch that change alignment for page_frag_alloc() and
> > it did not fixed the problem. So this is not alignment issue.
> > Now I think it could be page->refcount issue ...
> 
> I looked at the map_sg() in amd_iommu.c code and one line looks suspicious
> to me, seems we can use not correctly initialized s->dma_address (should be 0,
> but I think can be non-zero if SG was reused). The code also seems do
> not do correct thing if there is more than one SG with multiple pages
> on individual segments. Something like in below patch seems to be more
> appropriate to me (not tested nor compiled).

Nevermind, the patch is wrong, s->dma_address is initalized in sg_num_pages().

Stanislaw


Re: MT76x2U crashes XHCI driver on AMD Ryzen system

2019-02-28 Thread Stanislaw Gruszka
On Thu, Feb 28, 2019 at 10:04:12AM +0100, Stanislaw Gruszka wrote:
> On Tue, Feb 26, 2019 at 12:24:08PM +0100, Stanislaw Gruszka wrote:
> > On Tue, Feb 26, 2019 at 11:44:13AM +0100, Joerg Roedel wrote:
> > > On Tue, Feb 26, 2019 at 11:34:51AM +0100, Stanislaw Gruszka wrote:
> > > > On Tue, Feb 26, 2019 at 11:05:36AM +0100, Joerg Roedel wrote:
> > > > If sg->offset > PAGE_SIZE is fine then most likely we have problem with
> > > > alignment.
> > > 
> > > The map_sg implementation in the AMD IOMMU driver uses sg_phys() which
> > > handles the sg->page + sg->offset calculation fine.
> > > 
> > > > Note hat issue is with dma_map_sg(), switching to dma_map_single()
> > > > by using urb->transfer_buffer instead of urb->sg make things work
> > > > on AMD IOMMU.
> > > 
> > > On the other hand this points to a bug in the driver, I'll look further
> > > if I can spot something there.
> > 
> > I think so too. And I have done some changes that avoid strange allocation
> > scheme and use usb synchronous messages instead of allocating buffers
> > with unaligned sizes. However things work ok on Intel IOMMU and 
> > there is no documentation what are dma_map_sg() requirement versus
> > dma_map_single() which works. I think there are some unwritten
> > requirements and things can work on some platforms and fails on others
> > (different IOMMUs, no-IOMMU on some ARCHes)  
> 
> For the record: we have another bug report with this issue:
> https://bugzilla.kernel.org/show_bug.cgi?id=202673
> 
> I provided there patch that change alignment for page_frag_alloc() and
> it did not fixed the problem. So this is not alignment issue.
> Now I think it could be page->refcount issue ...

I looked at the map_sg() in amd_iommu.c code and one line looks suspicious
to me, seems we can use not correctly initialized s->dma_address (should be 0,
but I think can be non-zero if SG was reused). The code also seems do
not do correct thing if there is more than one SG with multiple pages
on individual segments. Something like in below patch seems to be more
appropriate to me (not tested nor compiled).

Stanislaw

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 34c9aa76a7bd..9c8887250b82 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2517,6 +2517,7 @@ static int map_sg(struct device *dev, struct scatterlist 
*sglist,
prot = dir2prot(direction);
 
/* Map all sg entries */
+   npages = 0;
for_each_sg(sglist, s, nelems, i) {
int j, pages = iommu_num_pages(sg_phys(s), s->length, 
PAGE_SIZE);
 
@@ -2524,7 +2525,7 @@ static int map_sg(struct device *dev, struct scatterlist 
*sglist,
unsigned long bus_addr, phys_addr;
int ret;
 
-   bus_addr  = address + s->dma_address + (j << 
PAGE_SHIFT);
+   bus_addr  = address + ((npages + j) << PAGE_SHIFT);
phys_addr = (sg_phys(s) & PAGE_MASK) + (j << 
PAGE_SHIFT);
ret = iommu_map_page(domain, bus_addr, phys_addr, 
PAGE_SIZE, prot, GFP_ATOMIC);
if (ret)
@@ -2532,6 +2533,8 @@ static int map_sg(struct device *dev, struct scatterlist 
*sglist,
 
mapped_pages += 1;
}
+
+   npages += mapped_pages;
}
 
/* Everything is mapped - write the right values into s->dma_address */




Re: MT76x2U crashes XHCI driver on AMD Ryzen system

2019-02-28 Thread Stanislaw Gruszka
On Tue, Feb 26, 2019 at 12:24:08PM +0100, Stanislaw Gruszka wrote:
> On Tue, Feb 26, 2019 at 11:44:13AM +0100, Joerg Roedel wrote:
> > On Tue, Feb 26, 2019 at 11:34:51AM +0100, Stanislaw Gruszka wrote:
> > > On Tue, Feb 26, 2019 at 11:05:36AM +0100, Joerg Roedel wrote:
> > > If sg->offset > PAGE_SIZE is fine then most likely we have problem with
> > > alignment.
> > 
> > The map_sg implementation in the AMD IOMMU driver uses sg_phys() which
> > handles the sg->page + sg->offset calculation fine.
> > 
> > > Note hat issue is with dma_map_sg(), switching to dma_map_single()
> > > by using urb->transfer_buffer instead of urb->sg make things work
> > > on AMD IOMMU.
> > 
> > On the other hand this points to a bug in the driver, I'll look further
> > if I can spot something there.
> 
> I think so too. And I have done some changes that avoid strange allocation
> scheme and use usb synchronous messages instead of allocating buffers
> with unaligned sizes. However things work ok on Intel IOMMU and 
> there is no documentation what are dma_map_sg() requirement versus
> dma_map_single() which works. I think there are some unwritten
> requirements and things can work on some platforms and fails on others
> (different IOMMUs, no-IOMMU on some ARCHes)  

For the record: we have another bug report with this issue:
https://bugzilla.kernel.org/show_bug.cgi?id=202673

I provided there patch that change alignment for page_frag_alloc() and
it did not fixed the problem. So this is not alignment issue.
Now I think it could be page->refcount issue ...

Stanislaw


Re: MT76x2U crashes XHCI driver on AMD Ryzen system

2019-02-26 Thread Stanislaw Gruszka
On Tue, Feb 26, 2019 at 11:44:13AM +0100, Joerg Roedel wrote:
> On Tue, Feb 26, 2019 at 11:34:51AM +0100, Stanislaw Gruszka wrote:
> > On Tue, Feb 26, 2019 at 11:05:36AM +0100, Joerg Roedel wrote:
> > If sg->offset > PAGE_SIZE is fine then most likely we have problem with
> > alignment.
> 
> The map_sg implementation in the AMD IOMMU driver uses sg_phys() which
> handles the sg->page + sg->offset calculation fine.
> 
> > Note hat issue is with dma_map_sg(), switching to dma_map_single()
> > by using urb->transfer_buffer instead of urb->sg make things work
> > on AMD IOMMU.
> 
> On the other hand this points to a bug in the driver, I'll look further
> if I can spot something there.

I think so too. And I have done some changes that avoid strange allocation
scheme and use usb synchronous messages instead of allocating buffers
with unaligned sizes. However things work ok on Intel IOMMU and 
there is no documentation what are dma_map_sg() requirement versus
dma_map_single() which works. I think there are some unwritten
requirements and things can work on some platforms and fails on others
(different IOMMUs, no-IOMMU on some ARCHes)  

Stanislaw


Re: MT76x2U crashes XHCI driver on AMD Ryzen system

2019-02-26 Thread Stanislaw Gruszka
On Tue, Feb 26, 2019 at 11:05:36AM +0100, Joerg Roedel wrote:
> On Mon, Feb 18, 2019 at 03:37:48PM +0100, Stanislaw Gruszka wrote:
> > 0001-mt76x02u-use-usb_bulk_msg-to-upload-firmware.patch
> > 0002-mt76usb-do-not-use-compound-head-page-for-SG-I-O.patch
> > 
> > Or problem can be solved by just one of it (either first or second).
> > 
> > Additionally I'm not 100% sure if
> > 
> > 0002-mt76usb-do-not-use-compound-head-page-for-SG-I-O.patch
> > 
> > is correct. So perhaps some IOMMU maintainer could look at it.
> 
> The patch looks good, but I don't understand why it is needed. The AMD
> IOMMU driver should handle sg->offset > PAGE_SIZE just fine. Can you
> verify that this is the problem? I will look into that again if it turns
> out there is bug in the IOMMU driver.

I'm try to get that information from bug reporter, but I can't get it so
far.

If sg->offset > PAGE_SIZE is fine then most likely we have problem with
alignment. We use page_frag_alloc() in mt76usb for buffer allocation
in scheme like this

page_frag_alloc(max_payload);   // something like 14434
page_frag_alloc(1024);
page_frag_alloc(2048)
page_frag_alloc(2048)
page_frag_alloc(2048)
...

page_frag_alloc works smart and fast way internally by allocating
fragments just but changing internal offset:

offset = nc->offset - fragsz;
if (unlikely(offset < 0)) {
page = virt_to_page(nc->va);
.
.
.

}

nc->offset = offset;
return nc->va + offset;

but unlike other allocators like kmalloc that make effort to provide
ARCH_DMA_MINALIGN buffers, it does not care about alignment. Above
scheme of allocation in mt76usb breaks it. 

Note hat issue is with dma_map_sg(), switching to dma_map_single()
by using urb->transfer_buffer instead of urb->sg make things work
on AMD IOMMU.

Stanislaw



Re: MT76x2U crashes XHCI driver on AMD Ryzen system

2019-02-19 Thread Stanislaw Gruszka
On Mon, Feb 18, 2019 at 05:01:59PM +, Robin Murphy wrote:
> On 18/02/2019 14:37, Stanislaw Gruszka wrote:
> [...]
> >Another issue is that dma_map_sg() & dma_map_page() may require some
> >constraints. I'm not sure about that and I want to clarify that with
> >CCed mm maintainers. I think DMA drivers may expect sg->offset < PAGE_SIZE
> >for both dma_map_sg() and dma_map_page(). Additionally dma_map_page()
> >maight expect that offset & length specify buffer within one page.
> 
> Luckily, this came up a while back[1] and we seemed to reach a
> consensus that sg->offset >= PAGE_SIZE for dma_map_sg() was weird
> but valid. IIRC it was only the Intel IOMMU code which failed to
> handle that case appropriately (and which I fixed) - the AMD IOMMU
> code always looked like it should be OK, but I'm not sure I've ever
> seen definitive test results (and I don't have hardware to do so
> myself).

Funny that we have problems on AMD IOMMU and not with Intel IOMMU.

> For dma_map_page(), length >= PAGE_SIZE should be perfectly valid
> and handled correctly. The offset >= PAGE_SIZE case is a bit harder
> to justify, but at the same time has less scope for the DMA API
> backend to get it wrong, so either way is likely to be OK in
> practice (in particular the AMD IOMMU code looks like it won't have
> a problem, since its map_page() implementation converts page and
> offset to a plain physical address before doing anything else).

Thanks for clarify this. So my patch which do:

-   page = virt_to_head_page(data);
+   page = virt_to_page(data);
offset = data - page_address(page);
sg_set_page(>sg[i], page, sglen, offset);

should not be necessary as IOMMU driver do exactly the same internally.

Are there any alignment requirement for offset for dma_map_{page,sg} ?
It will work with let say sg->offset=113 or we have make sure it is
aligned to some boundary. If so, what boundary ?

Stanislaw


Re: MT76x2U crashes XHCI driver on AMD Ryzen system

2019-02-18 Thread Stanislaw Gruszka
(cc: IOMMU & page_frag_alloc maintainers)

On Tue, Jan 15, 2019 at 10:04:01AM +0100, Lorenzo Bianconi wrote:
> > On Mon, Jan 14, 2019 at 1:18 AM Lorenzo Bianconi
> >  wrote:
> > >
> > > > On Sun, Jan 13, 2019 at 11:00 AM Lorenzo Bianconi
> > > >  wrote:
> > > > >
> > > > > >
> > > > > >
> > > > > > On Sun, Jan 13, 2019 at 5:33 AM, Lorenzo Bianconi 
> > > > > >  wrote:
> > > > > >
> > > > > > Direct. No VM used. This is the only peripheral causing this issue.
> > > > > >
> > > > > > Is the device connected to a usb3.0 port? If so, could you please 
> > > > > > try to connect the dongle to a 2.0 one?
> > > > > >
> > > > > > I tried through a USB 2.0 port. Shouldn't make a difference as they 
> > > > > > both use the xhci driver.
> > > > > >
> > > > >
> > > > > mt76x2u supports scatter-gather on usb 3.0 (not on 2.0)
> > > > Tried a USB 3 port. Same result.
> > > > >
> > > > > > Could you please double check if IOMMU is enabled?
> > > > > >
> > > > >
> > > > > Have you tried to disable it? Does it make any difference?
> > > > No idea how. UEFI doesn't seem to show anything similar.
> > > >
> > > > Similar bug report: https://bugzilla.kernel.org/show_bug.cgi?id=202241

FWIW: I provided some patches in the bugzilla, which were reported to
solve the problem. But I looking for confirmation if both are needed:

0001-mt76x02u-use-usb_bulk_msg-to-upload-firmware.patch
0002-mt76usb-do-not-use-compound-head-page-for-SG-I-O.patch

Or problem can be solved by just one of it (either first or second).

Additionally I'm not 100% sure if

0002-mt76usb-do-not-use-compound-head-page-for-SG-I-O.patch

is correct. So perhaps some IOMMU maintainer could look at it.

> > > You should be able to disable iommu using GRUB_CMDLINE_LINUX in
> > > /etc/default/grub (I guess setting iommu=off and reinstalling grub)
> > > https://wiki.gentoo.org/wiki/IOMMU_SWIOTLB
> > Yep. Working great now. I wonder what mt76 is doing to cause the crash 
> > though...
> 
> Thanks for bisecting the issue. 

Lorenzo, what you mean by 'bisecting' here ? Someone did 'git bisect'
on this issue?

> I think amd iommu does not support well usb scatter-gather
> (used by default in mt76u). I am working on a series in order to add the 
> possibility to
> disable it.

Even if that true that AMD IOMMU does not support 'well' SG (what I think
is not true) disabling SG in mt76 driver is not right solution. Right
solution would be propagate the issue to AMD IOMMU maintainers
(already CCed).

One problem in mt76 is page_frag_alloc() usage with different sizes.
page_frag_alloc() unlike like other allocators do not assure alignment
and relay on callers to provide buffers sizes that are aligned.
Unaligned buffer might then not be appropriate for DMA.

Another issue is that dma_map_sg() & dma_map_page() may require some 
constraints. I'm not sure about that and I want to clarify that with 
CCed mm maintainers. I think DMA drivers may expect sg->offset < PAGE_SIZE
for both dma_map_sg() and dma_map_page(). Additionally dma_map_page()
maight expect that offset & length specify buffer within one page.

Stanislaw


Re: [PATCH] lib/div64: off by one in shift

2019-01-29 Thread Stanislaw Gruszka
On Mon, Jan 28, 2019 at 09:45:03AM -0800, Andrew Morton wrote:
> On Mon, 28 Jan 2019 15:49:04 +0100 Stanislaw Gruszka  
> wrote:
> 
> > fls counts bits starting from 1 to 32 (returns 0 for zero argument).
> > If we add 1 we shift right one bit more and loose precision from
> > divisor, what cause function incorect results with some numbers.
> > 
> > Corrected code was tested in user-space, see bugzilla:
> > https://bugzilla.kernel.org/show_bug.cgi?id=202391
> 
> What are the usersoace-visible runtime effects of this change?

The bug is rather theoretical and for most cases divisor is within
32 bits, so problem is not visible. Moreover the bug is only for
32-bit systems and various users of it like btrfs are unlikely
to be run on such systems. So I do not consider this as very
important fix.

Thanks
Stanislaw


[PATCH] lib/div64: off by one in shift

2019-01-28 Thread Stanislaw Gruszka
fls counts bits starting from 1 to 32 (returns 0 for zero argument).
If we add 1 we shift right one bit more and loose precision from
divisor, what cause function incorect results with some numbers.

Corrected code was tested in user-space, see bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=202391

Fixes: 658716d19f8f ("div64_u64(): improve precision on 32bit platforms")
Reported-and-tested-by: Siarhei Volkau 
Signed-off-by: Stanislaw Gruszka 
---
 lib/div64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/div64.c b/lib/div64.c
index 01c8602bb6ff..ee146bb4c558 100644
--- a/lib/div64.c
+++ b/lib/div64.c
@@ -109,7 +109,7 @@ u64 div64_u64_rem(u64 dividend, u64 divisor, u64 *remainder)
quot = div_u64_rem(dividend, divisor, );
*remainder = rem32;
} else {
-   int n = 1 + fls(high);
+   int n = fls(high);
quot = div_u64(dividend >> n, divisor >> n);
 
if (quot != 0)
@@ -147,7 +147,7 @@ u64 div64_u64(u64 dividend, u64 divisor)
if (high == 0) {
quot = div_u64(dividend, divisor);
} else {
-   int n = 1 + fls(high);
+   int n = fls(high);
quot = div_u64(dividend >> n, divisor >> n);
 
if (quot != 0)
-- 
1.9.3



Re: [PATCH] rt2x00: no need to check return value of debugfs_create functions

2019-01-23 Thread Stanislaw Gruszka
On Tue, Jan 22, 2019 at 04:21:34PM +0100, Greg Kroah-Hartman wrote:
> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
> 
> Cc: Stanislaw Gruszka 
> Cc: Helmut Schaa 
> Cc: Kalle Valo 
> Cc: linux-wirel...@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman 
Acked-by: Stanislaw Gruszka 


Re: [PATCH] iwlegacy: no need to check return value of debugfs_create functions

2019-01-23 Thread Stanislaw Gruszka
On Tue, Jan 22, 2019 at 04:21:18PM +0100, Greg Kroah-Hartman wrote:
> When calling debugfs functions, there is no need to ever check the
> return value.  The function can work or not, but the code logic should
> never do something different based on this.
> 
> Cc: Stanislaw Gruszka 
> Cc: Kalle Valo 
> Cc: linux-wirel...@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman 

Acked-by: Stanislaw Gruszka 


Re: [PATCH] rt61pci: Work around a firmware bug with shared keys

2019-01-16 Thread Stanislaw Gruszka
On Tue, Jan 15, 2019 at 02:01:29PM +, Bernd Edlinger wrote:
> Apparently the rt2x61 firmware fails temporarily to decode
> broadcast packets if the shared keys are not assigned
> in the "correct" sequence. At the same time unicast
> packets work fine, since they are encrypted with the
> pairwise key.
> 
> At least with WPA2 CCMP mode the shared keys are
> set in the following sequence: keyidx=1, 2, 1, 2.
> After a while only keyidx 2 gets decrypted, and
> keyidx 1 is ignored, probably because there is never
> a keyidx 3.
> 
> Symptoms are arping -b works for 10 minutes, since
> keyidx=2 is used for broadcast, and then it stops
> working for 10 minutes, because keyidx=1 is used.
> That failure mode repeats forever.
> 
> Note, the firmware does not even know which keyidx
> corresponds to which hw_key_idx so the firmware is
> trying to be smarter than the driver, which is bound
> to fail.
> 
> As workaround the function rt61pci_config_shared_key
> requests software decryption of the shared keys,
> by returning EOPNOTSUPP. However, pairwise keys are
> still handled by hardware which works just fine.
> 
> Signed-off-by: Bernd Edlinger 

Acked-by: Stanislaw Gruszka 


Re: [PATCH] net: wireless: iwlegacy: Fix possible data races in il4965_send_rxon_assoc()

2018-10-05 Thread Stanislaw Gruszka
On Thu, Oct 04, 2018 at 04:52:19PM +0800, Jia-Ju Bai wrote:
> On 2018/10/4 15:59, Stanislaw Gruszka wrote:
> >On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote:
> >>These possible races are detected by a runtime testing.
> >>To fix these races, the mutex lock is used in il4965_send_rxon_assoc()
> >>to protect the data.
> >Really ? I'm surprised by that, see below.
> 
> My runtime testing shows that il4965_send_rxon_assoc() and
> il4965_configure_filter() are concurrently executed.
> But after seeing your reply, I need to carefully check whether my
> runtime testing is right, because I think you are right.
> In fact, I only monitored the iwl4965 driver, but did not monitor
> the iwlegacy driver, so I will do the testing again with monitoring
> the lwlegacy driver.

> >So I wonder how this patch did not cause the deadlock ?
> 
> Oh, sorry, anyway, my patch will cause double locks...

So how those runtime test were performend such you didn't
notice this ?

> >Anyway what can be done is adding:
> >
> >lockdep_assert_held(>mutex);
> >
> >il4965_commit_rxon() to check if we hold the mutex.
> 
> I agree.

Care to post a patch ?

Thanks
Stanislaw


Re: [PATCH] net: wireless: iwlegacy: Fix possible data races in il4965_send_rxon_assoc()

2018-10-05 Thread Stanislaw Gruszka
On Thu, Oct 04, 2018 at 04:52:19PM +0800, Jia-Ju Bai wrote:
> On 2018/10/4 15:59, Stanislaw Gruszka wrote:
> >On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote:
> >>These possible races are detected by a runtime testing.
> >>To fix these races, the mutex lock is used in il4965_send_rxon_assoc()
> >>to protect the data.
> >Really ? I'm surprised by that, see below.
> 
> My runtime testing shows that il4965_send_rxon_assoc() and
> il4965_configure_filter() are concurrently executed.
> But after seeing your reply, I need to carefully check whether my
> runtime testing is right, because I think you are right.
> In fact, I only monitored the iwl4965 driver, but did not monitor
> the iwlegacy driver, so I will do the testing again with monitoring
> the lwlegacy driver.

> >So I wonder how this patch did not cause the deadlock ?
> 
> Oh, sorry, anyway, my patch will cause double locks...

So how those runtime test were performend such you didn't
notice this ?

> >Anyway what can be done is adding:
> >
> >lockdep_assert_held(>mutex);
> >
> >il4965_commit_rxon() to check if we hold the mutex.
> 
> I agree.

Care to post a patch ?

Thanks
Stanislaw


Re: [PATCH] rt2x00: fix spelling mistake in various macros, UKNOWN -> UNKNOWN

2018-04-18 Thread Stanislaw Gruszka
On Wed, Apr 18, 2018 at 12:47:50PM +0100, Colin King wrote:
> From: Colin Ian King <colin.k...@canonical.com>
> 
> Rename several macros that contain mispellings of UNKNOWN
> 
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>
Acked-by: Stanislaw Gruszka <sgrus...@redhat.com>


Re: [PATCH] rt2x00: fix spelling mistake in various macros, UKNOWN -> UNKNOWN

2018-04-18 Thread Stanislaw Gruszka
On Wed, Apr 18, 2018 at 12:47:50PM +0100, Colin King wrote:
> From: Colin Ian King 
> 
> Rename several macros that contain mispellings of UNKNOWN
> 
> Signed-off-by: Colin Ian King 
Acked-by: Stanislaw Gruszka 


Re: [PATCH] [linux-next] rt2x00: Fix a typo in printk

2018-01-05 Thread Stanislaw Gruszka
On Fri, Jan 05, 2018 at 12:08:00AM +0900, Masanari Iida wrote:
> This patch fix a typo in rt2800lib.c
> 
> Signed-off-by: Masanari Iida <standby2...@gmail.com>

Acked-by: Stanislaw Gruszka <sgrus...@redhat.com>


Re: [PATCH] [linux-next] rt2x00: Fix a typo in printk

2018-01-05 Thread Stanislaw Gruszka
On Fri, Jan 05, 2018 at 12:08:00AM +0900, Masanari Iida wrote:
> This patch fix a typo in rt2800lib.c
> 
> Signed-off-by: Masanari Iida 

Acked-by: Stanislaw Gruszka 


Re: [PATCH] rt2x00: Delete an error message for a failed memory allocation in rt2x00queue_allocate()

2018-01-03 Thread Stanislaw Gruszka
On Fri, Dec 29, 2017 at 10:18:14PM +0100, SF Markus Elfring wrote:
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Fri, 29 Dec 2017 22:11:42 +0100
> 
> Omit an extra message for a memory allocation failure in this function.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring <elfr...@users.sourceforge.net>

Acked-by: Stanislaw Gruszka <sgrus...@redhat.com>


Re: [PATCH] rt2x00: Delete an error message for a failed memory allocation in rt2x00queue_allocate()

2018-01-03 Thread Stanislaw Gruszka
On Fri, Dec 29, 2017 at 10:18:14PM +0100, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Fri, 29 Dec 2017 22:11:42 +0100
> 
> Omit an extra message for a memory allocation failure in this function.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 

Acked-by: Stanislaw Gruszka 


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-09 Thread Stanislaw Gruszka
On Wed, Nov 08, 2017 at 12:07:15PM +0100, Richard Genoud wrote:
> (maybe if there was more than one CPU on the board, I could do
> something, but that's not the case)

I reproduced the problem with nosmp option. I think we need
to check also for ENOENT error (not only for ENODEV).
Will post the fix in a second.

Thanks for reporting and debugging
Stanislaw 


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-09 Thread Stanislaw Gruszka
On Wed, Nov 08, 2017 at 12:07:15PM +0100, Richard Genoud wrote:
> (maybe if there was more than one CPU on the board, I could do
> something, but that's not the case)

I reproduced the problem with nosmp option. I think we need
to check also for ENOENT error (not only for ENODEV).
Will post the fix in a second.

Thanks for reporting and debugging
Stanislaw 


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-08 Thread Stanislaw Gruszka
On Wed, Nov 08, 2017 at 12:07:15PM +0100, Richard Genoud wrote:
> > No, that not I wanted you to do. Please remove those options and just
> > do
> > below on tracing directory.
> > 
> > echo 0 > tracing_on 
> > cat trace >  /dev/null
> > echo "function_graph" > current_tracer 
> > echo "rt2*" > set_ftrace_filter 
> > echo 1 > tracing_on
> > echo 1-2.2 > /sys/bus/usb/drivers/usb/unbind
> > echo 0 > tracing_on
> > cat trace > ~/trace.txt
> 
> Well, there's clearly a misunderstanding here :
> After the command "echo 1-2.2 > /sys/bus/usb/drivers/usb/unbind"
> I can't type *anything*
> The only thing I can do is plug off the board.
> This command never returns, so I can't stop the tracing...
> 
> Or I missed something ?
> 
> (maybe if there was more than one CPU on the board, I could do
> something, but that's not the case)

I was not aware that soft-lock up can hung the system.

Below patch should prevent/help with the issue. This is
not right fix, it's for futher debugging.

Regards
Stanislaw 

diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c 
b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
index e2f4f5778267..d76ca608c722 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
@@ -321,8 +321,8 @@ static bool rt2x00usb_kick_tx_entry(struct queue_entry 
*entry, void *data)
 
status = usb_submit_urb(entry_priv->urb, GFP_ATOMIC);
if (status) {
-   if (status == -ENODEV)
-   clear_bit(DEVICE_STATE_PRESENT, >flags);
+   printk("submit TX urb ERROR %d\n", status);
+   clear_bit(DEVICE_STATE_PRESENT, >flags);
set_bit(ENTRY_DATA_IO_FAILED, >flags);
rt2x00lib_dmadone(entry);
}
@@ -410,8 +410,8 @@ static bool rt2x00usb_kick_rx_entry(struct queue_entry 
*entry, void *data)
 
status = usb_submit_urb(entry_priv->urb, GFP_ATOMIC);
if (status) {
-   if (status == -ENODEV)
-   clear_bit(DEVICE_STATE_PRESENT, >flags);
+   printk("submit RX urb ERROR %d\n", status);
+   clear_bit(DEVICE_STATE_PRESENT, >flags);
set_bit(ENTRY_DATA_IO_FAILED, >flags);
rt2x00lib_dmadone(entry);
}


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-08 Thread Stanislaw Gruszka
On Wed, Nov 08, 2017 at 12:07:15PM +0100, Richard Genoud wrote:
> > No, that not I wanted you to do. Please remove those options and just
> > do
> > below on tracing directory.
> > 
> > echo 0 > tracing_on 
> > cat trace >  /dev/null
> > echo "function_graph" > current_tracer 
> > echo "rt2*" > set_ftrace_filter 
> > echo 1 > tracing_on
> > echo 1-2.2 > /sys/bus/usb/drivers/usb/unbind
> > echo 0 > tracing_on
> > cat trace > ~/trace.txt
> 
> Well, there's clearly a misunderstanding here :
> After the command "echo 1-2.2 > /sys/bus/usb/drivers/usb/unbind"
> I can't type *anything*
> The only thing I can do is plug off the board.
> This command never returns, so I can't stop the tracing...
> 
> Or I missed something ?
> 
> (maybe if there was more than one CPU on the board, I could do
> something, but that's not the case)

I was not aware that soft-lock up can hung the system.

Below patch should prevent/help with the issue. This is
not right fix, it's for futher debugging.

Regards
Stanislaw 

diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c 
b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
index e2f4f5778267..d76ca608c722 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
@@ -321,8 +321,8 @@ static bool rt2x00usb_kick_tx_entry(struct queue_entry 
*entry, void *data)
 
status = usb_submit_urb(entry_priv->urb, GFP_ATOMIC);
if (status) {
-   if (status == -ENODEV)
-   clear_bit(DEVICE_STATE_PRESENT, >flags);
+   printk("submit TX urb ERROR %d\n", status);
+   clear_bit(DEVICE_STATE_PRESENT, >flags);
set_bit(ENTRY_DATA_IO_FAILED, >flags);
rt2x00lib_dmadone(entry);
}
@@ -410,8 +410,8 @@ static bool rt2x00usb_kick_rx_entry(struct queue_entry 
*entry, void *data)
 
status = usb_submit_urb(entry_priv->urb, GFP_ATOMIC);
if (status) {
-   if (status == -ENODEV)
-   clear_bit(DEVICE_STATE_PRESENT, >flags);
+   printk("submit RX urb ERROR %d\n", status);
+   clear_bit(DEVICE_STATE_PRESENT, >flags);
set_bit(ENTRY_DATA_IO_FAILED, >flags);
rt2x00lib_dmadone(entry);
}


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-08 Thread Stanislaw Gruszka
On Tue, Nov 07, 2017 at 12:13:47PM +0100, Richard Genoud wrote:
> Le mardi 07 novembre 2017 à 11:13 +0100, Stanislaw Gruszka a écrit :
> > On Tue, Nov 07, 2017 at 11:06:39AM +0100, Richard Genoud wrote:
> > > > 3 short articles how to configure and use ftrace are here:
> > > > https://lwn.net/Articles/365835/
> > > > https://lwn.net/Articles/366796/
> > > > https://lwn.net/Articles/370423/
> > > > 
> > > 
> > > I tried with ftrace, but I don't think there's a way to dump the
> > > trace
> > > when there's a soft lock-up
> > > (I can't do anything after the unbind, even the heartbeat led
> > > stopped blinking).
> > > I saw the /proc/sys/kernel/ftrace_dump_on_oops file, but there's no
> > > /proc/sys/kernel/ftrace_dump_on_soft_lock-up file :)
> > 
> > You should configure function trace with rt2x* functions. After that
> > start tracing, unbind the device, then stop tracing and provide trace
> > output.
> Here is another trace, with rt2* as function filter.
> (sorry for the noise)
> 
> Dumping ftrace buffer:
> -
> CPU:0 [LOST 3606923 EVENTS]
>  0)   0.000 us|  } /* rt2x00usb_clear_entry */
>  0)   0.000 us|} /* rt2x00lib_rxdone */
>  0)   0.000 us|rt2x00queue_get_entry();
>  0)   |rt2x00lib_rxdone() {
>  0)   0.000 us|  rt2x00queue_index_inc();
>  0)   |  rt2x00usb_clear_entry() {
>  0)   |rt2x00usb_kick_rx_entry() {

We do that only if: 

if (test_bit(DEVICE_STATE_PRESENT, >flags) &&
test_bit(DEVICE_STATE_ENABLED_RADIO, >flags))
rt2x00dev->ops->lib->clear_entry(entry);

so looks like DEVICE_STATE_PRESENT is not cleared. That mean
usb driver do not call disconnect callback on unbind. That seems
to be usb driver bug. Anyway please provide requested traces,
so I will see what happen.

Stanislaw


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-08 Thread Stanislaw Gruszka
On Tue, Nov 07, 2017 at 12:13:47PM +0100, Richard Genoud wrote:
> Le mardi 07 novembre 2017 à 11:13 +0100, Stanislaw Gruszka a écrit :
> > On Tue, Nov 07, 2017 at 11:06:39AM +0100, Richard Genoud wrote:
> > > > 3 short articles how to configure and use ftrace are here:
> > > > https://lwn.net/Articles/365835/
> > > > https://lwn.net/Articles/366796/
> > > > https://lwn.net/Articles/370423/
> > > > 
> > > 
> > > I tried with ftrace, but I don't think there's a way to dump the
> > > trace
> > > when there's a soft lock-up
> > > (I can't do anything after the unbind, even the heartbeat led
> > > stopped blinking).
> > > I saw the /proc/sys/kernel/ftrace_dump_on_oops file, but there's no
> > > /proc/sys/kernel/ftrace_dump_on_soft_lock-up file :)
> > 
> > You should configure function trace with rt2x* functions. After that
> > start tracing, unbind the device, then stop tracing and provide trace
> > output.
> Here is another trace, with rt2* as function filter.
> (sorry for the noise)
> 
> Dumping ftrace buffer:
> -
> CPU:0 [LOST 3606923 EVENTS]
>  0)   0.000 us|  } /* rt2x00usb_clear_entry */
>  0)   0.000 us|} /* rt2x00lib_rxdone */
>  0)   0.000 us|rt2x00queue_get_entry();
>  0)   |rt2x00lib_rxdone() {
>  0)   0.000 us|  rt2x00queue_index_inc();
>  0)   |  rt2x00usb_clear_entry() {
>  0)   |rt2x00usb_kick_rx_entry() {

We do that only if: 

if (test_bit(DEVICE_STATE_PRESENT, >flags) &&
test_bit(DEVICE_STATE_ENABLED_RADIO, >flags))
rt2x00dev->ops->lib->clear_entry(entry);

so looks like DEVICE_STATE_PRESENT is not cleared. That mean
usb driver do not call disconnect callback on unbind. That seems
to be usb driver bug. Anyway please provide requested traces,
so I will see what happen.

Stanislaw


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-08 Thread Stanislaw Gruszka
On Tue, Nov 07, 2017 at 12:01:23PM +0100, Richard Genoud wrote:
> Le mardi 07 novembre 2017 à 11:13 +0100, Stanislaw Gruszka a écrit :
> > On Tue, Nov 07, 2017 at 11:06:39AM +0100, Richard Genoud wrote:
> > > > 3 short articles how to configure and use ftrace are here:
> > > > https://lwn.net/Articles/365835/
> > > > https://lwn.net/Articles/366796/
> > > > https://lwn.net/Articles/370423/
> > > > 
> > > 
> > > I tried with ftrace, but I don't think there's a way to dump the
> > > trace
> > > when there's a soft lock-up
> > > (I can't do anything after the unbind, even the heartbeat led
> > > stopped blinking).
> > > I saw the /proc/sys/kernel/ftrace_dump_on_oops file, but there's no
> > > /proc/sys/kernel/ftrace_dump_on_soft_lock-up file :)
> > 
> > You should configure function trace with rt2x* functions. After that
> > start tracing, unbind the device, then stop tracing and provide trace
> > output.
> Ok, I found a way to display the trace (after the unbind, the board is
> frozen and I can't type anything).
> Adding
> CONFIG_SOFTLOCKUP_DETECTOR=y
> CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> along with echo 1 > /proc/sys/kernel/ftrace_dump_on_oops does the trick

No, that not I wanted you to do. Please remove those options and just do
below on tracing directory.

echo 0 > tracing_on 
cat trace >  /dev/null
echo "function_graph" > current_tracer 
echo "rt2*" > set_ftrace_filter 
echo 1 > tracing_on
echo 1-2.2 > /sys/bus/usb/drivers/usb/unbind
echo 0 > tracing_on
cat trace > ~/trace.txt

and provide trace.txt to me (can be in private email if big).

Thanks
Stanislaw


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-08 Thread Stanislaw Gruszka
On Tue, Nov 07, 2017 at 12:01:23PM +0100, Richard Genoud wrote:
> Le mardi 07 novembre 2017 à 11:13 +0100, Stanislaw Gruszka a écrit :
> > On Tue, Nov 07, 2017 at 11:06:39AM +0100, Richard Genoud wrote:
> > > > 3 short articles how to configure and use ftrace are here:
> > > > https://lwn.net/Articles/365835/
> > > > https://lwn.net/Articles/366796/
> > > > https://lwn.net/Articles/370423/
> > > > 
> > > 
> > > I tried with ftrace, but I don't think there's a way to dump the
> > > trace
> > > when there's a soft lock-up
> > > (I can't do anything after the unbind, even the heartbeat led
> > > stopped blinking).
> > > I saw the /proc/sys/kernel/ftrace_dump_on_oops file, but there's no
> > > /proc/sys/kernel/ftrace_dump_on_soft_lock-up file :)
> > 
> > You should configure function trace with rt2x* functions. After that
> > start tracing, unbind the device, then stop tracing and provide trace
> > output.
> Ok, I found a way to display the trace (after the unbind, the board is
> frozen and I can't type anything).
> Adding
> CONFIG_SOFTLOCKUP_DETECTOR=y
> CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> along with echo 1 > /proc/sys/kernel/ftrace_dump_on_oops does the trick

No, that not I wanted you to do. Please remove those options and just do
below on tracing directory.

echo 0 > tracing_on 
cat trace >  /dev/null
echo "function_graph" > current_tracer 
echo "rt2*" > set_ftrace_filter 
echo 1 > tracing_on
echo 1-2.2 > /sys/bus/usb/drivers/usb/unbind
echo 0 > tracing_on
cat trace > ~/trace.txt

and provide trace.txt to me (can be in private email if big).

Thanks
Stanislaw


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-07 Thread Stanislaw Gruszka

On Tue, Nov 07, 2017 at 11:06:39AM +0100, Richard Genoud wrote:
> > 3 short articles how to configure and use ftrace are here:
> > https://lwn.net/Articles/365835/
> > https://lwn.net/Articles/366796/
> > https://lwn.net/Articles/370423/
> >
> I tried with ftrace, but I don't think there's a way to dump the trace
> when there's a soft lock-up
> (I can't do anything after the unbind, even the heartbeat led stopped 
> blinking).
> I saw the /proc/sys/kernel/ftrace_dump_on_oops file, but there's no
> /proc/sys/kernel/ftrace_dump_on_soft_lock-up file :)

You should configure function trace with rt2x* functions. After that
start tracing, unbind the device, then stop tracing and provide trace
output.

Thanks
Stanislaw


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-07 Thread Stanislaw Gruszka

On Tue, Nov 07, 2017 at 11:06:39AM +0100, Richard Genoud wrote:
> > 3 short articles how to configure and use ftrace are here:
> > https://lwn.net/Articles/365835/
> > https://lwn.net/Articles/366796/
> > https://lwn.net/Articles/370423/
> >
> I tried with ftrace, but I don't think there's a way to dump the trace
> when there's a soft lock-up
> (I can't do anything after the unbind, even the heartbeat led stopped 
> blinking).
> I saw the /proc/sys/kernel/ftrace_dump_on_oops file, but there's no
> /proc/sys/kernel/ftrace_dump_on_soft_lock-up file :)

You should configure function trace with rt2x* functions. After that
start tracing, unbind the device, then stop tracing and provide trace
output.

Thanks
Stanislaw


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-07 Thread Stanislaw Gruszka
Hi

On Mon, Nov 06, 2017 at 04:57:09PM +0100, Richard Genoud wrote:
> I get a soft lock-up while unbinding the USB driver on a TP-Link TL-WN727Nv3 
> (chipset 5370):
> 
> # echo 1-2.2 > /sys/bus/usb/drivers/usb/unbind
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u2:3:308]
...
> I can trigger this each time.

I can not reproduce this on my system (I'm using 4.14.0-rc6, but I don't
think it's an issue). I think the problem may be caused by usb
host controler driver, which can be different on your system.

Does ftrace work on your platform ? If so could you use ftrace
to provide rt2x00 functions trace when the  probllem happen ? 

3 short articles how to configure and use ftrace are here:
https://lwn.net/Articles/365835/
https://lwn.net/Articles/366796/
https://lwn.net/Articles/370423/

Thanks
Stanislaw


Re: Soft lockup in rt2x00usb_work_rxdone()

2017-11-07 Thread Stanislaw Gruszka
Hi

On Mon, Nov 06, 2017 at 04:57:09PM +0100, Richard Genoud wrote:
> I get a soft lock-up while unbinding the USB driver on a TP-Link TL-WN727Nv3 
> (chipset 5370):
> 
> # echo 1-2.2 > /sys/bus/usb/drivers/usb/unbind
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u2:3:308]
...
> I can trigger this each time.

I can not reproduce this on my system (I'm using 4.14.0-rc6, but I don't
think it's an issue). I think the problem may be caused by usb
host controler driver, which can be different on your system.

Does ftrace work on your platform ? If so could you use ftrace
to provide rt2x00 functions trace when the  probllem happen ? 

3 short articles how to configure and use ftrace are here:
https://lwn.net/Articles/365835/
https://lwn.net/Articles/366796/
https://lwn.net/Articles/370423/

Thanks
Stanislaw


Re: [PATCH] wireless: iwlegacy: Convert timers to use timer_setup()

2017-10-19 Thread Stanislaw Gruszka
On Mon, Oct 16, 2017 at 04:37:44PM -0700, Kees Cook wrote:
> In preparation for unconditionally passing the struct timer_list pointer to
> all timer callbacks, switch to using the new timer_setup() and from_timer()
> to pass the timer pointer explicitly.
> 
> Cc: Stanislaw Gruszka <sgrus...@redhat.com>
> Cc: Kalle Valo <kv...@codeaurora.org>
> Cc: linux-wirel...@vger.kernel.org
> Cc: net...@vger.kernel.org
> Signed-off-by: Kees Cook <keesc...@chromium.org>

Acked-by: Stanislaw Gruszka <sgrus...@redhat.com>



Re: [PATCH] wireless: iwlegacy: Convert timers to use timer_setup()

2017-10-19 Thread Stanislaw Gruszka
On Mon, Oct 16, 2017 at 04:37:44PM -0700, Kees Cook wrote:
> In preparation for unconditionally passing the struct timer_list pointer to
> all timer callbacks, switch to using the new timer_setup() and from_timer()
> to pass the timer pointer explicitly.
> 
> Cc: Stanislaw Gruszka 
> Cc: Kalle Valo 
> Cc: linux-wirel...@vger.kernel.org
> Cc: net...@vger.kernel.org
> Signed-off-by: Kees Cook 

Acked-by: Stanislaw Gruszka 



Re: rc2-next-20170929: wireless down, won't come up

2017-10-16 Thread Stanislaw Gruszka
On Mon, Oct 16, 2017 at 02:24:56PM +0200, Pavel Machek wrote:
> So far it happened once... so I don't know about reproducibility. And
> I'm not even sure if I'm using iwlegacy driver. Am I?

iwl3945 (and iwl4965) are iwlegacy drivers.

Regards
Stanislaw


Re: rc2-next-20170929: wireless down, won't come up

2017-10-16 Thread Stanislaw Gruszka
On Mon, Oct 16, 2017 at 02:24:56PM +0200, Pavel Machek wrote:
> So far it happened once... so I don't know about reproducibility. And
> I'm not even sure if I'm using iwlegacy driver. Am I?

iwl3945 (and iwl4965) are iwlegacy drivers.

Regards
Stanislaw


Re: rc2-next-20170929: wireless down, won't come up

2017-10-16 Thread Stanislaw Gruszka
Hi

Site note: Intel folks do not support iwlegacy, I removed them from CC.

On Mon, Oct 16, 2017 at 12:27:45PM +0200, Pavel Machek wrote:
> > In -next... after few days of usage with suspend and resume cycles,
> > wifi failed on thinkpad x60. I have not seen this in years...

> > Any ideas?

We do not have any recent iwlegacy changes except cosmetic
ones, so this problem most likely is mac80211, pci or other
subsystem issue

> Suspend+resume fixed the problem.

It is not reproducible? If so, I think it will not be 
easy to identify the issue.

Thanks
Stanislaw




Re: rc2-next-20170929: wireless down, won't come up

2017-10-16 Thread Stanislaw Gruszka
Hi

Site note: Intel folks do not support iwlegacy, I removed them from CC.

On Mon, Oct 16, 2017 at 12:27:45PM +0200, Pavel Machek wrote:
> > In -next... after few days of usage with suspend and resume cycles,
> > wifi failed on thinkpad x60. I have not seen this in years...

> > Any ideas?

We do not have any recent iwlegacy changes except cosmetic
ones, so this problem most likely is mac80211, pci or other
subsystem issue

> Suspend+resume fixed the problem.

It is not reproducible? If so, I think it will not be 
easy to identify the issue.

Thanks
Stanislaw




Re: usb/net/rt2x00: warning in rt2800_eeprom_word_index

2017-10-16 Thread Stanislaw Gruszka
Hi Dmitry

On Sat, Oct 14, 2017 at 04:38:03PM +0200, Dmitry Vyukov wrote:
> On Thu, Oct 12, 2017 at 9:25 AM, Stanislaw Gruszka <sgrus...@redhat.com> 
> wrote:
> > Hi
> >
> > On Mon, Oct 09, 2017 at 07:50:53PM +0200, Andrey Konovalov wrote:
> >> I've got the following report while fuzzing the kernel with syzkaller.
> >>
> >> On commit 8a5776a5f49812d29fe4b2d0a2d71675c3facf3f (4.14-rc4).
> >>
> >> I'm not sure whether this is a bug in the driver, or just a way to
> >> report misbehaving device. In the latter case this shouldn't be a
> >> WARN() call, since WARN() means bug in the kernel.
> >
> > This is about wrong EEPROM, which reported 3 tx streams on
> > non 3 antenna device. I think WARN() is justified and thanks
> > to the call trace I was actually able to to understand what
> > happened.
> >
> > In general I do not think WARN() only means a kernel bug, it
> > can be F/W or H/W bug too.
> 
> Hi Stanislaw,
> 
> Printing messages is fine. Printing stacks is fine. Just please make
> them distinguishable from kernel bugs and don't kill the whole
> possibility of automated Linux kernel testing. That's an important
> capability.

We do not distinguish between bugs and other problems when WARN() is
used in (wireless) drivers, what I think is correct, taking comment from
include/asm-generic/bug.h :

/*
 * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
 * significant issues that need prompt attention if they should ever
 * appear at runtime.  Use the versions with printk format strings
 * to provide better diagnostics.
 */

Historically we have BUG() to mark the bugs, but usage if it is not
recommended as it can kill the system, so for anything that can
be recovered in runtime - WARN() is recommended.

Perhaps we can introduce another helper like PROBLEM() for marking
situations when something is wrong, but it is not a bug. However I'm
not even sure at what extent it can be used, since for many cases
if not the most, driver author can not tell apriori if the problem
is a bug in the driver or HW/FW misbehaviour (or maybe particular
issue can happen because of both).

Thanks
Stanislaw


Re: usb/net/rt2x00: warning in rt2800_eeprom_word_index

2017-10-16 Thread Stanislaw Gruszka
Hi Dmitry

On Sat, Oct 14, 2017 at 04:38:03PM +0200, Dmitry Vyukov wrote:
> On Thu, Oct 12, 2017 at 9:25 AM, Stanislaw Gruszka  
> wrote:
> > Hi
> >
> > On Mon, Oct 09, 2017 at 07:50:53PM +0200, Andrey Konovalov wrote:
> >> I've got the following report while fuzzing the kernel with syzkaller.
> >>
> >> On commit 8a5776a5f49812d29fe4b2d0a2d71675c3facf3f (4.14-rc4).
> >>
> >> I'm not sure whether this is a bug in the driver, or just a way to
> >> report misbehaving device. In the latter case this shouldn't be a
> >> WARN() call, since WARN() means bug in the kernel.
> >
> > This is about wrong EEPROM, which reported 3 tx streams on
> > non 3 antenna device. I think WARN() is justified and thanks
> > to the call trace I was actually able to to understand what
> > happened.
> >
> > In general I do not think WARN() only means a kernel bug, it
> > can be F/W or H/W bug too.
> 
> Hi Stanislaw,
> 
> Printing messages is fine. Printing stacks is fine. Just please make
> them distinguishable from kernel bugs and don't kill the whole
> possibility of automated Linux kernel testing. That's an important
> capability.

We do not distinguish between bugs and other problems when WARN() is
used in (wireless) drivers, what I think is correct, taking comment from
include/asm-generic/bug.h :

/*
 * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
 * significant issues that need prompt attention if they should ever
 * appear at runtime.  Use the versions with printk format strings
 * to provide better diagnostics.
 */

Historically we have BUG() to mark the bugs, but usage if it is not
recommended as it can kill the system, so for anything that can
be recovered in runtime - WARN() is recommended.

Perhaps we can introduce another helper like PROBLEM() for marking
situations when something is wrong, but it is not a bug. However I'm
not even sure at what extent it can be used, since for many cases
if not the most, driver author can not tell apriori if the problem
is a bug in the driver or HW/FW misbehaviour (or maybe particular
issue can happen because of both).

Thanks
Stanislaw


Re: Crash in bm_evict_inode() during shutdown.

2017-10-16 Thread Stanislaw Gruszka
On Fri, Oct 13, 2017 at 04:47:45PM +0200, Oleg Nesterov wrote:
> Yes, sorry, my fault. Fixed by the patch from Eryu, already in -mm tree.

No worries, I just used power button to "help" the system to shutdown :-)

Now is fixed in mainline, thanks.
Stanislaw


Re: Crash in bm_evict_inode() during shutdown.

2017-10-16 Thread Stanislaw Gruszka
On Fri, Oct 13, 2017 at 04:47:45PM +0200, Oleg Nesterov wrote:
> Yes, sorry, my fault. Fixed by the patch from Eryu, already in -mm tree.

No worries, I just used power button to "help" the system to shutdown :-)

Now is fixed in mainline, thanks.
Stanislaw


Crash in bm_evict_inode() during shutdown.

2017-10-13 Thread Stanislaw Gruszka
On 4.14-rc4 updated today, I have OOPS on bm_evict_inode(). It happens
every time during shutdown of the system.

Seems e = inode->i_private is NULL and e->flags crashes.

Picture with the OOPS call trace is here:
http://people.redhat.com/sgruszka/DSC_0733.JPG

Thanks
Stanislaw


Crash in bm_evict_inode() during shutdown.

2017-10-13 Thread Stanislaw Gruszka
On 4.14-rc4 updated today, I have OOPS on bm_evict_inode(). It happens
every time during shutdown of the system.

Seems e = inode->i_private is NULL and e->flags crashes.

Picture with the OOPS call trace is here:
http://people.redhat.com/sgruszka/DSC_0733.JPG

Thanks
Stanislaw


Re: usb/net/rt2x00: warning in rt2800_eeprom_word_index

2017-10-12 Thread Stanislaw Gruszka
Hi

On Mon, Oct 09, 2017 at 07:50:53PM +0200, Andrey Konovalov wrote:
> I've got the following report while fuzzing the kernel with syzkaller.
> 
> On commit 8a5776a5f49812d29fe4b2d0a2d71675c3facf3f (4.14-rc4).
> 
> I'm not sure whether this is a bug in the driver, or just a way to
> report misbehaving device. In the latter case this shouldn't be a
> WARN() call, since WARN() means bug in the kernel.

This is about wrong EEPROM, which reported 3 tx streams on
non 3 antenna device. I think WARN() is justified and thanks
to the call trace I was actually able to to understand what
happened.

In general I do not think WARN() only means a kernel bug, it 
can be F/W or H/W bug too.

Thanks
Stanislaw


Re: usb/net/rt2x00: warning in rt2800_eeprom_word_index

2017-10-12 Thread Stanislaw Gruszka
Hi

On Mon, Oct 09, 2017 at 07:50:53PM +0200, Andrey Konovalov wrote:
> I've got the following report while fuzzing the kernel with syzkaller.
> 
> On commit 8a5776a5f49812d29fe4b2d0a2d71675c3facf3f (4.14-rc4).
> 
> I'm not sure whether this is a bug in the driver, or just a way to
> report misbehaving device. In the latter case this shouldn't be a
> WARN() call, since WARN() means bug in the kernel.

This is about wrong EEPROM, which reported 3 tx streams on
non 3 antenna device. I think WARN() is justified and thanks
to the call trace I was actually able to to understand what
happened.

In general I do not think WARN() only means a kernel bug, it 
can be F/W or H/W bug too.

Thanks
Stanislaw


Re: [PATCH 1/1] sched/cputime: do not decrease steal time after live migration on xen

2017-10-10 Thread Stanislaw Gruszka
On Tue, Oct 10, 2017 at 12:59:26PM +0200, Ingo Molnar wrote:
> 
> (Cc:-ed more gents involved in kernel/sched/cputime.c work. Full patch quoted 
> below.)
> 
> * Dongli Zhang  wrote:
> 
> > After guest live migration on xen, steal time in /proc/stat
> > (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
> > paravirt_steal_clock() might be less than this_rq()->prev_steal_time.
> > 
> > For instance, steal time of each vcpu is 335 before live migration.
> > 
> > cpu  198 0 368 200064 1962 0 0 1340 0 0
> > cpu0 38 0 81 50063 492 0 0 335 0 0
> > cpu1 65 0 97 49763 634 0 0 335 0 0
> > cpu2 38 0 81 50098 462 0 0 335 0 0
> > cpu3 56 0 107 50138 374 0 0 335 0 0
> > 
> > After live migration, steal time is reduced to 312.
> > 
> > cpu  200 0 370 200330 1971 0 0 1248 0 0
> > cpu0 38 0 82 50123 500 0 0 312 0 0
> > cpu1 65 0 97 49832 634 0 0 312 0 0
> > cpu2 39 0 82 50167 462 0 0 312 0 0
> > cpu3 56 0 107 50207 374 0 0 312 0 0
> > 
> > The code in this patch is borrowed from do_stolen_accounting() which has
> > already been removed from linux source code since commit ecb23dc6 ("xen:
> > add steal_clock support on x86").
> > 
> > Similar and more severe issue would impact prior linux 4.8-4.10 as
> > discussed by Michael Las at
> > https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest.
> > Unlike the issue discussed by Michael Las which would overflow steal time
> > and lead to 100% st usage in top command for linux 4.8-4.10, the issue for
> > linux 4.11+ would only decrease but not overflow steal time after live
> > migration.
> > 
> > References: 
> > https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
> > Signed-off-by: Dongli Zhang 
> > ---
> >  kernel/sched/cputime.c | 13 ++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index 14d2dbf..57d09cab 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -238,10 +238,17 @@ static __always_inline u64 
> > steal_account_process_time(u64 maxtime)
> >  {
> >  #ifdef CONFIG_PARAVIRT
> > if (static_key_false(_steal_enabled)) {
> > -   u64 steal;
> > +   u64 steal, steal_time;
> > +   s64 steal_delta;
> > +
> > +   steal_time = paravirt_steal_clock(smp_processor_id());
> > +   steal = steal_delta = steal_time - this_rq()->prev_steal_time;
> > +
> > +   if (unlikely(steal_delta < 0)) {
> > +   this_rq()->prev_steal_time = steal_time;

I don't think setting prev_steal_time to smaller value is right
thing to do. 

Beside, I don't think we need to check for overflow condition for
cputime variables (it will happen after 279 years :-). So instead
of introducing signed steal_delta variable I would just add
below check, which should be sufficient to fix the problem:

if (unlikely(steal <= this_rq()->prev_steal_time))
return 0;

Thanks
Stanislaw


Re: [PATCH 1/1] sched/cputime: do not decrease steal time after live migration on xen

2017-10-10 Thread Stanislaw Gruszka
On Tue, Oct 10, 2017 at 12:59:26PM +0200, Ingo Molnar wrote:
> 
> (Cc:-ed more gents involved in kernel/sched/cputime.c work. Full patch quoted 
> below.)
> 
> * Dongli Zhang  wrote:
> 
> > After guest live migration on xen, steal time in /proc/stat
> > (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
> > paravirt_steal_clock() might be less than this_rq()->prev_steal_time.
> > 
> > For instance, steal time of each vcpu is 335 before live migration.
> > 
> > cpu  198 0 368 200064 1962 0 0 1340 0 0
> > cpu0 38 0 81 50063 492 0 0 335 0 0
> > cpu1 65 0 97 49763 634 0 0 335 0 0
> > cpu2 38 0 81 50098 462 0 0 335 0 0
> > cpu3 56 0 107 50138 374 0 0 335 0 0
> > 
> > After live migration, steal time is reduced to 312.
> > 
> > cpu  200 0 370 200330 1971 0 0 1248 0 0
> > cpu0 38 0 82 50123 500 0 0 312 0 0
> > cpu1 65 0 97 49832 634 0 0 312 0 0
> > cpu2 39 0 82 50167 462 0 0 312 0 0
> > cpu3 56 0 107 50207 374 0 0 312 0 0
> > 
> > The code in this patch is borrowed from do_stolen_accounting() which has
> > already been removed from linux source code since commit ecb23dc6 ("xen:
> > add steal_clock support on x86").
> > 
> > Similar and more severe issue would impact prior linux 4.8-4.10 as
> > discussed by Michael Las at
> > https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest.
> > Unlike the issue discussed by Michael Las which would overflow steal time
> > and lead to 100% st usage in top command for linux 4.8-4.10, the issue for
> > linux 4.11+ would only decrease but not overflow steal time after live
> > migration.
> > 
> > References: 
> > https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
> > Signed-off-by: Dongli Zhang 
> > ---
> >  kernel/sched/cputime.c | 13 ++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index 14d2dbf..57d09cab 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -238,10 +238,17 @@ static __always_inline u64 
> > steal_account_process_time(u64 maxtime)
> >  {
> >  #ifdef CONFIG_PARAVIRT
> > if (static_key_false(_steal_enabled)) {
> > -   u64 steal;
> > +   u64 steal, steal_time;
> > +   s64 steal_delta;
> > +
> > +   steal_time = paravirt_steal_clock(smp_processor_id());
> > +   steal = steal_delta = steal_time - this_rq()->prev_steal_time;
> > +
> > +   if (unlikely(steal_delta < 0)) {
> > +   this_rq()->prev_steal_time = steal_time;

I don't think setting prev_steal_time to smaller value is right
thing to do. 

Beside, I don't think we need to check for overflow condition for
cputime variables (it will happen after 279 years :-). So instead
of introducing signed steal_delta variable I would just add
below check, which should be sufficient to fix the problem:

if (unlikely(steal <= this_rq()->prev_steal_time))
return 0;

Thanks
Stanislaw


Re: [PATCH] wireless: iwlegacy: make const array static to shink object code size Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit

2017-09-22 Thread Stanislaw Gruszka
On Thu, Sep 21, 2017 at 11:56:30PM +0100, Colin King wrote:
> From: Colin Ian King <colin.k...@canonical.com>
> 
> Don't populate const array ac_to_fifo on the stack in an inlined
> function, instead make it static.  Makes the object code smaller
> by over 800 bytes:
> 
>text  data bss dec hex filename
>  159029 331541216  193399   2f377 4965-mac.o
> 
>text  data bss dec hex filename
>  158122 332501216  192588   2f04c 4965-mac.o
> 
> (gcc version 7.2.0 x86_64)
> 
> Signed-off-by: Colin Ian King <colin.k...@canonical.com>

Content type information was added at the end of the topic, but
I think Kalle can fix that when he will be committing the patch.

Acked-by: Stanislaw Gruszka <sgrus...@redhat.com>


Re: [PATCH] wireless: iwlegacy: make const array static to shink object code size Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit

2017-09-22 Thread Stanislaw Gruszka
On Thu, Sep 21, 2017 at 11:56:30PM +0100, Colin King wrote:
> From: Colin Ian King 
> 
> Don't populate const array ac_to_fifo on the stack in an inlined
> function, instead make it static.  Makes the object code smaller
> by over 800 bytes:
> 
>text  data bss dec hex filename
>  159029 331541216  193399   2f377 4965-mac.o
> 
>text  data bss dec hex filename
>  158122 332501216  192588   2f04c 4965-mac.o
> 
> (gcc version 7.2.0 x86_64)
> 
> Signed-off-by: Colin Ian King 

Content type information was added at the end of the topic, but
I think Kalle can fix that when he will be committing the patch.

Acked-by: Stanislaw Gruszka 


Re: [tip:sched/core] sched/cputime: Refactor the cputime_adjust() code

2017-06-30 Thread Stanislaw Gruszka
On Fri, Jun 30, 2017 at 06:10:35AM -0700, tip-bot for Gustavo A. R. Silva wrote:
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index aea3135..67c70e2 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -615,19 +615,13 @@ static void cputime_adjust(struct task_cputime *curr,
>* userspace. Once a task gets some ticks, the monotonicy code at
>* 'update' will ensure things converge to the observed ratio.
^^

>*/
> - if (stime == 0) {
> - utime = rtime;
> - goto update;
> + if (stime != 0) {
> + if (utime == 0)
> + stime = rtime;
> + else
> + stime = scale_stime(stime, rtime, stime + utime);
>   }
>  
> - if (utime == 0) {
> - stime = rtime;
> - goto update;
> - }
> -
> - stime = scale_stime(stime, rtime, stime + utime);
> -
> -update:
Since 'update' label is removed, I think above comment should be
corrected too. Eventually patch could just remove 'utime = rtime;'
line to shut up coverity.

Stanislaw


Re: [tip:sched/core] sched/cputime: Refactor the cputime_adjust() code

2017-06-30 Thread Stanislaw Gruszka
On Fri, Jun 30, 2017 at 06:10:35AM -0700, tip-bot for Gustavo A. R. Silva wrote:
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index aea3135..67c70e2 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -615,19 +615,13 @@ static void cputime_adjust(struct task_cputime *curr,
>* userspace. Once a task gets some ticks, the monotonicy code at
>* 'update' will ensure things converge to the observed ratio.
^^

>*/
> - if (stime == 0) {
> - utime = rtime;
> - goto update;
> + if (stime != 0) {
> + if (utime == 0)
> + stime = rtime;
> + else
> + stime = scale_stime(stime, rtime, stime + utime);
>   }
>  
> - if (utime == 0) {
> - stime = rtime;
> - goto update;
> - }
> -
> - stime = scale_stime(stime, rtime, stime + utime);
> -
> -update:
Since 'update' label is removed, I think above comment should be
corrected too. Eventually patch could just remove 'utime = rtime;'
line to shut up coverity.

Stanislaw


Re: [PATCH] rt2x00: improve calling conventions for register accessors

2017-05-16 Thread Stanislaw Gruszka
On Tue, May 16, 2017 at 01:55:17PM +0200, Stanislaw Gruszka wrote:
> On Mon, May 15, 2017 at 10:39:51AM -0400, David Miller wrote:
> > From: Stanislaw Gruszka <sgrus...@redhat.com>
> > Date: Mon, 15 May 2017 16:28:01 +0200
> > 
> > > On Mon, May 15, 2017 at 03:46:55PM +0200, Arnd Bergmann wrote:
> > >> With CONFIG_KASAN enabled and gcc-7, we get a warning about rather high
> > >> stack usage (with a private patch set I have to turn on this warning,
> > >> which I intend to get into the next kernel release):
> > >> 
> > >> wireless/ralink/rt2x00/rt2800lib.c: In function 
> > >> 'rt2800_bw_filter_calibration':
> > >> wireless/ralink/rt2x00/rt2800lib.c:7990:1: error: the frame size of 2144 
> > >> bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
> > >> 
> > >> The problem is that KASAN inserts a redzone around each local variable 
> > >> that
> > >> gets passed by reference, and the newly added function has a lot of them.
> > >> We can easily avoid that here by changing the calling convention to have
> > >> the output as the return value of the function. This should also results 
> > >> in
> > >> smaller object code, saving around 4KB in .text with KASAN, or 2KB 
> > >> without
> > >> KASAN.
> > >> 
> > >> Fixes: 41977e86c984 ("rt2x00: add support for MT7620")
> > >> Signed-off-by: Arnd Bergmann <a...@arndb.de>
> > >> ---
> > >>  drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 319 
> > >> +
> > >>  1 file changed, 164 insertions(+), 155 deletions(-)
> > > 
> > > We have read(, ) calling convention since forever in rt2x00 and that
> > > was never a problem. I dislike to change that now to make some tools
> > > happy, I think problem should be fixed in the tools instead.
> > 
> > Passing return values by reference is and always has been a really
> > poor way to achieve what these functions are doing.
> > 
> > And frankly, whilst the tool could see what's going on here better, we
> > should be making code easier rather than more difficult to audit.
> > 
> > I am therefore very much in favor of Arnd's change.
> > 
> > This isn't even a situation where there are multiple return values,
> > such as needing to signal an error and return an unsigned value at the
> > same time.
> > 
> > These functions return _one_ value, and therefore they should be
> > returned as a true return value.
> 
> In rt2x00 driver we use poor convention in other kind of registers
> accessors like bbp, mac, eeprom. I dislike to changing only rfcsr
> accessors and leaving others in the old way. And changing all accessors
> would be massive and error prone change, which I'm not prefer either.
> 
> Arnd, could this be fixed by refactoring rt2800_bw_filter_calibration()
> function (which is enormous and definitely should be split into smaller
> subroutines) ? If not, I would accept this patch.

Does below patch make things better with KASAN compilation ? 

diff --git a/drivers/net/wireless/ralink/rt2x00/rt2800lib.c 
b/drivers/net/wireless/ralink/rt2x00/rt2800lib.c
index d11c7b2..4b85ef3 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2800lib.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2800lib.c
@@ -7740,6 +7740,39 @@ static char rt2800_lp_tx_filter_bw_cal(struct rt2x00_dev 
*rt2x00dev)
return cal_val;
 }
 
+#define RF_SAVE_NUM 24
+#define BBP_SAVE_NUM 3
+static const int rf_save_ids[RF_SAVE_NUM] = {0, 1, 3, 4, 5, 6, 7, 8,
+   17, 18, 19, 20, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 58, 59};
+
+static void rt2800_phy_save(struct rt2x00_dev *rt2x00dev,
+   u8 *const bbp_regs, u8 *const rf_regs)
+{
+   int i;
+
+   rt2800_bbp_read(rt2x00dev, 23, _regs[0]);
+
+   rt2800_bbp_dcoc_read(rt2x00dev, 0, _regs[1]);
+   rt2800_bbp_dcoc_read(rt2x00dev, 2, _regs[2]);
+
+   for (i = 0; i < RF_SAVE_NUM; i++)
+   rt2800_rfcsr_read_bank(rt2x00dev, 5, rf_save_ids[i], 
_regs[i]);
+}
+
+static void rt2800_phy_restore(struct rt2x00_dev *rt2x00dev,
+  const u8 *const bbp_regs, const u8 *const 
rf_regs)
+{
+   int i;
+
+   for (i = 0; i < RF_SAVE_NUM; i++)
+   rt2800_rfcsr_write_bank(rt2x00dev, 5, rf_save_ids[i], 
rf_regs[i]);
+
+   rt2800_bbp_write(rt2x00dev, 23, bbp_regs[0]);
+
+   rt2800_bbp_dcoc_write(rt2x00dev, 0, bbp_regs[1]);
+   rt2800_bbp_dcoc_write(rt2x00dev, 2, bbp_regs[2]);
+}
+
 static void rt2800_bw_filter_calibration(struct rt2x00_dev *rt2x00dev,
   

Re: [PATCH] rt2x00: improve calling conventions for register accessors

2017-05-16 Thread Stanislaw Gruszka
On Tue, May 16, 2017 at 01:55:17PM +0200, Stanislaw Gruszka wrote:
> On Mon, May 15, 2017 at 10:39:51AM -0400, David Miller wrote:
> > From: Stanislaw Gruszka 
> > Date: Mon, 15 May 2017 16:28:01 +0200
> > 
> > > On Mon, May 15, 2017 at 03:46:55PM +0200, Arnd Bergmann wrote:
> > >> With CONFIG_KASAN enabled and gcc-7, we get a warning about rather high
> > >> stack usage (with a private patch set I have to turn on this warning,
> > >> which I intend to get into the next kernel release):
> > >> 
> > >> wireless/ralink/rt2x00/rt2800lib.c: In function 
> > >> 'rt2800_bw_filter_calibration':
> > >> wireless/ralink/rt2x00/rt2800lib.c:7990:1: error: the frame size of 2144 
> > >> bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
> > >> 
> > >> The problem is that KASAN inserts a redzone around each local variable 
> > >> that
> > >> gets passed by reference, and the newly added function has a lot of them.
> > >> We can easily avoid that here by changing the calling convention to have
> > >> the output as the return value of the function. This should also results 
> > >> in
> > >> smaller object code, saving around 4KB in .text with KASAN, or 2KB 
> > >> without
> > >> KASAN.
> > >> 
> > >> Fixes: 41977e86c984 ("rt2x00: add support for MT7620")
> > >> Signed-off-by: Arnd Bergmann 
> > >> ---
> > >>  drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 319 
> > >> +
> > >>  1 file changed, 164 insertions(+), 155 deletions(-)
> > > 
> > > We have read(, ) calling convention since forever in rt2x00 and that
> > > was never a problem. I dislike to change that now to make some tools
> > > happy, I think problem should be fixed in the tools instead.
> > 
> > Passing return values by reference is and always has been a really
> > poor way to achieve what these functions are doing.
> > 
> > And frankly, whilst the tool could see what's going on here better, we
> > should be making code easier rather than more difficult to audit.
> > 
> > I am therefore very much in favor of Arnd's change.
> > 
> > This isn't even a situation where there are multiple return values,
> > such as needing to signal an error and return an unsigned value at the
> > same time.
> > 
> > These functions return _one_ value, and therefore they should be
> > returned as a true return value.
> 
> In rt2x00 driver we use poor convention in other kind of registers
> accessors like bbp, mac, eeprom. I dislike to changing only rfcsr
> accessors and leaving others in the old way. And changing all accessors
> would be massive and error prone change, which I'm not prefer either.
> 
> Arnd, could this be fixed by refactoring rt2800_bw_filter_calibration()
> function (which is enormous and definitely should be split into smaller
> subroutines) ? If not, I would accept this patch.

Does below patch make things better with KASAN compilation ? 

diff --git a/drivers/net/wireless/ralink/rt2x00/rt2800lib.c 
b/drivers/net/wireless/ralink/rt2x00/rt2800lib.c
index d11c7b2..4b85ef3 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2800lib.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2800lib.c
@@ -7740,6 +7740,39 @@ static char rt2800_lp_tx_filter_bw_cal(struct rt2x00_dev 
*rt2x00dev)
return cal_val;
 }
 
+#define RF_SAVE_NUM 24
+#define BBP_SAVE_NUM 3
+static const int rf_save_ids[RF_SAVE_NUM] = {0, 1, 3, 4, 5, 6, 7, 8,
+   17, 18, 19, 20, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 58, 59};
+
+static void rt2800_phy_save(struct rt2x00_dev *rt2x00dev,
+   u8 *const bbp_regs, u8 *const rf_regs)
+{
+   int i;
+
+   rt2800_bbp_read(rt2x00dev, 23, _regs[0]);
+
+   rt2800_bbp_dcoc_read(rt2x00dev, 0, _regs[1]);
+   rt2800_bbp_dcoc_read(rt2x00dev, 2, _regs[2]);
+
+   for (i = 0; i < RF_SAVE_NUM; i++)
+   rt2800_rfcsr_read_bank(rt2x00dev, 5, rf_save_ids[i], 
_regs[i]);
+}
+
+static void rt2800_phy_restore(struct rt2x00_dev *rt2x00dev,
+  const u8 *const bbp_regs, const u8 *const 
rf_regs)
+{
+   int i;
+
+   for (i = 0; i < RF_SAVE_NUM; i++)
+   rt2800_rfcsr_write_bank(rt2x00dev, 5, rf_save_ids[i], 
rf_regs[i]);
+
+   rt2800_bbp_write(rt2x00dev, 23, bbp_regs[0]);
+
+   rt2800_bbp_dcoc_write(rt2x00dev, 0, bbp_regs[1]);
+   rt2800_bbp_dcoc_write(rt2x00dev, 2, bbp_regs[2]);
+}
+
 static void rt2800_bw_filter_calibration(struct rt2x00_dev *rt2x00dev,
 bool btxcal)
 {
@@ -7751,52 +7784,15 @@ 

Re: [PATCH] rt2x00: improve calling conventions for register accessors

2017-05-16 Thread Stanislaw Gruszka
On Tue, May 16, 2017 at 01:58:56PM +0200, Johannes Berg wrote:
> On Tue, 2017-05-16 at 13:55 +0200, Stanislaw Gruszka wrote:
> > 
> > In rt2x00 driver we use poor convention in other kind of registers
> > accessors like bbp, mac, eeprom. I dislike to changing only rfcsr
> > accessors and leaving others in the old way. And changing all
> > accessors would be massive and error prone change, which I'm not
> > prefer either.
> 
> That's a stupid argument, but for the sake of it - the conversion can
> easily be done with coccinelle/spatch without being "error prone".

Sure, but still I think it would be preferable to fix newly added
rt2800_bw_filter_calibration() function, instead of ancient rfcsr
accessors.

Stanislaw  


Re: [PATCH] rt2x00: improve calling conventions for register accessors

2017-05-16 Thread Stanislaw Gruszka
On Tue, May 16, 2017 at 01:58:56PM +0200, Johannes Berg wrote:
> On Tue, 2017-05-16 at 13:55 +0200, Stanislaw Gruszka wrote:
> > 
> > In rt2x00 driver we use poor convention in other kind of registers
> > accessors like bbp, mac, eeprom. I dislike to changing only rfcsr
> > accessors and leaving others in the old way. And changing all
> > accessors would be massive and error prone change, which I'm not
> > prefer either.
> 
> That's a stupid argument, but for the sake of it - the conversion can
> easily be done with coccinelle/spatch without being "error prone".

Sure, but still I think it would be preferable to fix newly added
rt2800_bw_filter_calibration() function, instead of ancient rfcsr
accessors.

Stanislaw  


Re: [PATCH] rt2x00: improve calling conventions for register accessors

2017-05-16 Thread Stanislaw Gruszka
On Mon, May 15, 2017 at 10:39:51AM -0400, David Miller wrote:
> From: Stanislaw Gruszka <sgrus...@redhat.com>
> Date: Mon, 15 May 2017 16:28:01 +0200
> 
> > On Mon, May 15, 2017 at 03:46:55PM +0200, Arnd Bergmann wrote:
> >> With CONFIG_KASAN enabled and gcc-7, we get a warning about rather high
> >> stack usage (with a private patch set I have to turn on this warning,
> >> which I intend to get into the next kernel release):
> >> 
> >> wireless/ralink/rt2x00/rt2800lib.c: In function 
> >> 'rt2800_bw_filter_calibration':
> >> wireless/ralink/rt2x00/rt2800lib.c:7990:1: error: the frame size of 2144 
> >> bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
> >> 
> >> The problem is that KASAN inserts a redzone around each local variable that
> >> gets passed by reference, and the newly added function has a lot of them.
> >> We can easily avoid that here by changing the calling convention to have
> >> the output as the return value of the function. This should also results in
> >> smaller object code, saving around 4KB in .text with KASAN, or 2KB without
> >> KASAN.
> >> 
> >> Fixes: 41977e86c984 ("rt2x00: add support for MT7620")
> >> Signed-off-by: Arnd Bergmann <a...@arndb.de>
> >> ---
> >>  drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 319 
> >> +
> >>  1 file changed, 164 insertions(+), 155 deletions(-)
> > 
> > We have read(, ) calling convention since forever in rt2x00 and that
> > was never a problem. I dislike to change that now to make some tools
> > happy, I think problem should be fixed in the tools instead.
> 
> Passing return values by reference is and always has been a really
> poor way to achieve what these functions are doing.
> 
> And frankly, whilst the tool could see what's going on here better, we
> should be making code easier rather than more difficult to audit.
> 
> I am therefore very much in favor of Arnd's change.
> 
> This isn't even a situation where there are multiple return values,
> such as needing to signal an error and return an unsigned value at the
> same time.
> 
> These functions return _one_ value, and therefore they should be
> returned as a true return value.

In rt2x00 driver we use poor convention in other kind of registers
accessors like bbp, mac, eeprom. I dislike to changing only rfcsr
accessors and leaving others in the old way. And changing all accessors
would be massive and error prone change, which I'm not prefer either.

Arnd, could this be fixed by refactoring rt2800_bw_filter_calibration()
function (which is enormous and definitely should be split into smaller
subroutines) ? If not, I would accept this patch.

Thanks
Stanislaw


Re: [PATCH] rt2x00: improve calling conventions for register accessors

2017-05-16 Thread Stanislaw Gruszka
On Mon, May 15, 2017 at 10:39:51AM -0400, David Miller wrote:
> From: Stanislaw Gruszka 
> Date: Mon, 15 May 2017 16:28:01 +0200
> 
> > On Mon, May 15, 2017 at 03:46:55PM +0200, Arnd Bergmann wrote:
> >> With CONFIG_KASAN enabled and gcc-7, we get a warning about rather high
> >> stack usage (with a private patch set I have to turn on this warning,
> >> which I intend to get into the next kernel release):
> >> 
> >> wireless/ralink/rt2x00/rt2800lib.c: In function 
> >> 'rt2800_bw_filter_calibration':
> >> wireless/ralink/rt2x00/rt2800lib.c:7990:1: error: the frame size of 2144 
> >> bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
> >> 
> >> The problem is that KASAN inserts a redzone around each local variable that
> >> gets passed by reference, and the newly added function has a lot of them.
> >> We can easily avoid that here by changing the calling convention to have
> >> the output as the return value of the function. This should also results in
> >> smaller object code, saving around 4KB in .text with KASAN, or 2KB without
> >> KASAN.
> >> 
> >> Fixes: 41977e86c984 ("rt2x00: add support for MT7620")
> >> Signed-off-by: Arnd Bergmann 
> >> ---
> >>  drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 319 
> >> +
> >>  1 file changed, 164 insertions(+), 155 deletions(-)
> > 
> > We have read(, ) calling convention since forever in rt2x00 and that
> > was never a problem. I dislike to change that now to make some tools
> > happy, I think problem should be fixed in the tools instead.
> 
> Passing return values by reference is and always has been a really
> poor way to achieve what these functions are doing.
> 
> And frankly, whilst the tool could see what's going on here better, we
> should be making code easier rather than more difficult to audit.
> 
> I am therefore very much in favor of Arnd's change.
> 
> This isn't even a situation where there are multiple return values,
> such as needing to signal an error and return an unsigned value at the
> same time.
> 
> These functions return _one_ value, and therefore they should be
> returned as a true return value.

In rt2x00 driver we use poor convention in other kind of registers
accessors like bbp, mac, eeprom. I dislike to changing only rfcsr
accessors and leaving others in the old way. And changing all accessors
would be massive and error prone change, which I'm not prefer either.

Arnd, could this be fixed by refactoring rt2800_bw_filter_calibration()
function (which is enormous and definitely should be split into smaller
subroutines) ? If not, I would accept this patch.

Thanks
Stanislaw


Re: [PATCH] rt2x00: improve calling conventions for register accessors

2017-05-15 Thread Stanislaw Gruszka
On Mon, May 15, 2017 at 03:46:55PM +0200, Arnd Bergmann wrote:
> With CONFIG_KASAN enabled and gcc-7, we get a warning about rather high
> stack usage (with a private patch set I have to turn on this warning,
> which I intend to get into the next kernel release):
> 
> wireless/ralink/rt2x00/rt2800lib.c: In function 
> 'rt2800_bw_filter_calibration':
> wireless/ralink/rt2x00/rt2800lib.c:7990:1: error: the frame size of 2144 
> bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
> 
> The problem is that KASAN inserts a redzone around each local variable that
> gets passed by reference, and the newly added function has a lot of them.
> We can easily avoid that here by changing the calling convention to have
> the output as the return value of the function. This should also results in
> smaller object code, saving around 4KB in .text with KASAN, or 2KB without
> KASAN.
> 
> Fixes: 41977e86c984 ("rt2x00: add support for MT7620")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 319 
> +
>  1 file changed, 164 insertions(+), 155 deletions(-)

We have read(, ) calling convention since forever in rt2x00 and that
was never a problem. I dislike to change that now to make some tools
happy, I think problem should be fixed in the tools instead.

Stanislaw


Re: [PATCH] rt2x00: improve calling conventions for register accessors

2017-05-15 Thread Stanislaw Gruszka
On Mon, May 15, 2017 at 03:46:55PM +0200, Arnd Bergmann wrote:
> With CONFIG_KASAN enabled and gcc-7, we get a warning about rather high
> stack usage (with a private patch set I have to turn on this warning,
> which I intend to get into the next kernel release):
> 
> wireless/ralink/rt2x00/rt2800lib.c: In function 
> 'rt2800_bw_filter_calibration':
> wireless/ralink/rt2x00/rt2800lib.c:7990:1: error: the frame size of 2144 
> bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
> 
> The problem is that KASAN inserts a redzone around each local variable that
> gets passed by reference, and the newly added function has a lot of them.
> We can easily avoid that here by changing the calling convention to have
> the output as the return value of the function. This should also results in
> smaller object code, saving around 4KB in .text with KASAN, or 2KB without
> KASAN.
> 
> Fixes: 41977e86c984 ("rt2x00: add support for MT7620")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 319 
> +
>  1 file changed, 164 insertions(+), 155 deletions(-)

We have read(, ) calling convention since forever in rt2x00 and that
was never a problem. I dislike to change that now to make some tools
happy, I think problem should be fixed in the tools instead.

Stanislaw


Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever

2017-04-25 Thread Stanislaw Gruszka
On Mon, Apr 24, 2017 at 10:06:32PM +0900, Tetsuo Handa wrote:
> Stanislaw Gruszka wrote:
> > On Sun, Apr 23, 2017 at 07:24:21PM +0900, Tetsuo Handa wrote:
> > > On 2017/03/10 20:44, Tetsuo Handa wrote:
> > > > Michal Hocko wrote:
> > > >> I am definitely not against. There is no reason to rush the patch in.
> > > > 
> > > > I don't hurry if we can check using watchdog whether this problem is 
> > > > occurring
> > > > in the real world. I have to test corner cases because watchdog is 
> > > > missing.
> > > > 
> > > Ping?
> > > 
> > > This problem can occur even immediately after the first invocation of
> > > the OOM killer. I believe this problem can occur in the real world.
> > > When are we going to apply this patch or watchdog patch?
> > > 
> > > 
> > > [0.00] Linux version 4.11.0-rc7-next-20170421+ (root@ccsecurity) 
> > > (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #588 SMP Sun Apr 
> > > 23 17:38:02 JST 2017
> > > [0.00] Command line: 
> > > BOOT_IMAGE=/boot/vmlinuz-4.11.0-rc7-next-20170421+ 
> > > root=UUID=17c3c28f-a70a-4666-95fa-ecf6acd901e4 ro vconsole.keymap=jp106 
> > > crashkernel=256M vconsole.font=latarcyrheb-sun16 security=none 
> > > sysrq_always_enabled console=ttyS0,115200n8 console=tty0 LANG=en_US.UTF-8 
> > > debug_guardpage_minorder=1
> > 
> > Are you debugging memory corruption problem?
> 
> No. Just a random testing trying to find how we can avoid flooding of
> warn_alloc_stall() warning messages while also avoiding ratelimiting.

This is not right way to stress mm subsystem, debug_guardpage_minorder= 
option is for _debug_ purpose. Use mem= instead if you want to limit
available memory.

> > FWIW, if you use debug_guardpage_minorder= you can expect any
> > allocation memory problems. This option is intended to debug
> > memory corruption bugs and it shrinks available memory in 
> > artificial way. Taking that, I don't think justifying any
> > patch, by problem happened when debug_guardpage_minorder= is 
> > used, is reasonable.
> >  
> > Stanislaw
> 
> This problem occurs without debug_guardpage_minorder= parameter and

So please justify your patches by that.

Thanks
Stanislaw


Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever

2017-04-25 Thread Stanislaw Gruszka
On Mon, Apr 24, 2017 at 10:06:32PM +0900, Tetsuo Handa wrote:
> Stanislaw Gruszka wrote:
> > On Sun, Apr 23, 2017 at 07:24:21PM +0900, Tetsuo Handa wrote:
> > > On 2017/03/10 20:44, Tetsuo Handa wrote:
> > > > Michal Hocko wrote:
> > > >> I am definitely not against. There is no reason to rush the patch in.
> > > > 
> > > > I don't hurry if we can check using watchdog whether this problem is 
> > > > occurring
> > > > in the real world. I have to test corner cases because watchdog is 
> > > > missing.
> > > > 
> > > Ping?
> > > 
> > > This problem can occur even immediately after the first invocation of
> > > the OOM killer. I believe this problem can occur in the real world.
> > > When are we going to apply this patch or watchdog patch?
> > > 
> > > 
> > > [0.00] Linux version 4.11.0-rc7-next-20170421+ (root@ccsecurity) 
> > > (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #588 SMP Sun Apr 
> > > 23 17:38:02 JST 2017
> > > [0.00] Command line: 
> > > BOOT_IMAGE=/boot/vmlinuz-4.11.0-rc7-next-20170421+ 
> > > root=UUID=17c3c28f-a70a-4666-95fa-ecf6acd901e4 ro vconsole.keymap=jp106 
> > > crashkernel=256M vconsole.font=latarcyrheb-sun16 security=none 
> > > sysrq_always_enabled console=ttyS0,115200n8 console=tty0 LANG=en_US.UTF-8 
> > > debug_guardpage_minorder=1
> > 
> > Are you debugging memory corruption problem?
> 
> No. Just a random testing trying to find how we can avoid flooding of
> warn_alloc_stall() warning messages while also avoiding ratelimiting.

This is not right way to stress mm subsystem, debug_guardpage_minorder= 
option is for _debug_ purpose. Use mem= instead if you want to limit
available memory.

> > FWIW, if you use debug_guardpage_minorder= you can expect any
> > allocation memory problems. This option is intended to debug
> > memory corruption bugs and it shrinks available memory in 
> > artificial way. Taking that, I don't think justifying any
> > patch, by problem happened when debug_guardpage_minorder= is 
> > used, is reasonable.
> >  
> > Stanislaw
> 
> This problem occurs without debug_guardpage_minorder= parameter and

So please justify your patches by that.

Thanks
Stanislaw


Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever

2017-04-24 Thread Stanislaw Gruszka
On Sun, Apr 23, 2017 at 07:24:21PM +0900, Tetsuo Handa wrote:
> On 2017/03/10 20:44, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> >> On Thu 09-03-17 13:05:40, Johannes Weiner wrote:
> >>> On Tue, Mar 07, 2017 at 02:52:36PM -0500, Rik van Riel wrote:
>  It only does this to some extent.  If reclaim made
>  no progress, for example due to immediately bailing
>  out because the number of already isolated pages is
>  too high (due to many parallel reclaimers), the code
>  could hit the "no_progress_loops > MAX_RECLAIM_RETRIES"
>  test without ever looking at the number of reclaimable
>  pages.
> >>>
> >>> Hm, there is no early return there, actually. We bump the loop counter
> >>> every time it happens, but then *do* look at the reclaimable pages.
> >>>
>  Could that create problems if we have many concurrent
>  reclaimers?
> >>>
> >>> With increased concurrency, the likelihood of OOM will go up if we
> >>> remove the unlimited wait for isolated pages, that much is true.
> >>>
> >>> I'm not sure that's a bad thing, however, because we want the OOM
> >>> killer to be predictable and timely. So a reasonable wait time in
> >>> between 0 and forever before an allocating thread gives up under
> >>> extreme concurrency makes sense to me.
> >>>
>  It may be OK, I just do not understand all the implications.
> 
>  I like the general direction your patch takes the code in,
>  but I would like to understand it better...
> >>>
> >>> I feel the same way. The throttling logic doesn't seem to be very well
> >>> thought out at the moment, making it hard to reason about what happens
> >>> in certain scenarios.
> >>>
> >>> In that sense, this patch isn't really an overall improvement to the
> >>> way things work. It patches a hole that seems to be exploitable only
> >>> from an artificial OOM torture test, at the risk of regressing high
> >>> concurrency workloads that may or may not be artificial.
> >>>
> >>> Unless I'm mistaken, there doesn't seem to be a whole lot of urgency
> >>> behind this patch. Can we think about a general model to deal with
> >>> allocation concurrency? 
> >>
> >> I am definitely not against. There is no reason to rush the patch in.
> > 
> > I don't hurry if we can check using watchdog whether this problem is 
> > occurring
> > in the real world. I have to test corner cases because watchdog is missing.
> > 
> Ping?
> 
> This problem can occur even immediately after the first invocation of
> the OOM killer. I believe this problem can occur in the real world.
> When are we going to apply this patch or watchdog patch?
> 
> 
> [0.00] Linux version 4.11.0-rc7-next-20170421+ (root@ccsecurity) (gcc 
> version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #588 SMP Sun Apr 23 
> 17:38:02 JST 2017
> [0.00] Command line: 
> BOOT_IMAGE=/boot/vmlinuz-4.11.0-rc7-next-20170421+ 
> root=UUID=17c3c28f-a70a-4666-95fa-ecf6acd901e4 ro vconsole.keymap=jp106 
> crashkernel=256M vconsole.font=latarcyrheb-sun16 security=none 
> sysrq_always_enabled console=ttyS0,115200n8 console=tty0 LANG=en_US.UTF-8 
> debug_guardpage_minorder=1

Are you debugging memory corruption problem?

FWIW, if you use debug_guardpage_minorder= you can expect any
allocation memory problems. This option is intended to debug
memory corruption bugs and it shrinks available memory in 
artificial way. Taking that, I don't think justifying any
patch, by problem happened when debug_guardpage_minorder= is 
used, is reasonable.
 
Stanislaw


Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever

2017-04-24 Thread Stanislaw Gruszka
On Sun, Apr 23, 2017 at 07:24:21PM +0900, Tetsuo Handa wrote:
> On 2017/03/10 20:44, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> >> On Thu 09-03-17 13:05:40, Johannes Weiner wrote:
> >>> On Tue, Mar 07, 2017 at 02:52:36PM -0500, Rik van Riel wrote:
>  It only does this to some extent.  If reclaim made
>  no progress, for example due to immediately bailing
>  out because the number of already isolated pages is
>  too high (due to many parallel reclaimers), the code
>  could hit the "no_progress_loops > MAX_RECLAIM_RETRIES"
>  test without ever looking at the number of reclaimable
>  pages.
> >>>
> >>> Hm, there is no early return there, actually. We bump the loop counter
> >>> every time it happens, but then *do* look at the reclaimable pages.
> >>>
>  Could that create problems if we have many concurrent
>  reclaimers?
> >>>
> >>> With increased concurrency, the likelihood of OOM will go up if we
> >>> remove the unlimited wait for isolated pages, that much is true.
> >>>
> >>> I'm not sure that's a bad thing, however, because we want the OOM
> >>> killer to be predictable and timely. So a reasonable wait time in
> >>> between 0 and forever before an allocating thread gives up under
> >>> extreme concurrency makes sense to me.
> >>>
>  It may be OK, I just do not understand all the implications.
> 
>  I like the general direction your patch takes the code in,
>  but I would like to understand it better...
> >>>
> >>> I feel the same way. The throttling logic doesn't seem to be very well
> >>> thought out at the moment, making it hard to reason about what happens
> >>> in certain scenarios.
> >>>
> >>> In that sense, this patch isn't really an overall improvement to the
> >>> way things work. It patches a hole that seems to be exploitable only
> >>> from an artificial OOM torture test, at the risk of regressing high
> >>> concurrency workloads that may or may not be artificial.
> >>>
> >>> Unless I'm mistaken, there doesn't seem to be a whole lot of urgency
> >>> behind this patch. Can we think about a general model to deal with
> >>> allocation concurrency? 
> >>
> >> I am definitely not against. There is no reason to rush the patch in.
> > 
> > I don't hurry if we can check using watchdog whether this problem is 
> > occurring
> > in the real world. I have to test corner cases because watchdog is missing.
> > 
> Ping?
> 
> This problem can occur even immediately after the first invocation of
> the OOM killer. I believe this problem can occur in the real world.
> When are we going to apply this patch or watchdog patch?
> 
> 
> [0.00] Linux version 4.11.0-rc7-next-20170421+ (root@ccsecurity) (gcc 
> version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #588 SMP Sun Apr 23 
> 17:38:02 JST 2017
> [0.00] Command line: 
> BOOT_IMAGE=/boot/vmlinuz-4.11.0-rc7-next-20170421+ 
> root=UUID=17c3c28f-a70a-4666-95fa-ecf6acd901e4 ro vconsole.keymap=jp106 
> crashkernel=256M vconsole.font=latarcyrheb-sun16 security=none 
> sysrq_always_enabled console=ttyS0,115200n8 console=tty0 LANG=en_US.UTF-8 
> debug_guardpage_minorder=1

Are you debugging memory corruption problem?

FWIW, if you use debug_guardpage_minorder= you can expect any
allocation memory problems. This option is intended to debug
memory corruption bugs and it shrinks available memory in 
artificial way. Taking that, I don't think justifying any
patch, by problem happened when debug_guardpage_minorder= is 
used, is reasonable.
 
Stanislaw


Re: [PATCH 2/2 v2] xen/acpi: upload PM state from init-domain to Xen

2017-03-23 Thread Stanislaw Gruszka
On Wed, Mar 22, 2017 at 10:56:02AM -0700, Ankur Arora wrote:
> >It is ok to do upload_pm_data() with delay i.e. after some other
> >resume actions are done and possibly xen-acpi-processor is in
> >running state ?
> The state uploaded is ACPI P and C state from struct acpi_processor
> which AFAICS is stable once inited so a delay would not lead to
> invalid state.
> The only concern would be the ACPI pCPU hotplug logic in
> acpi_processor_add() which could add a new entry in
> per_cpu(processors) but that also looks okay because either we
> get a NULL or we get a pointer to an inited structure.
> 
> As for the hypervisor -- that falls back to more limited state after
> resume (because some of this state is thrown away at suspend) and so
> uses that until it gets the uploaded PM state from the initial-domain.

Patch looks good to me then.

Reviewed-by: Stanislaw Gruszka <sgrus...@redhat.com>


Re: [PATCH 2/2 v2] xen/acpi: upload PM state from init-domain to Xen

2017-03-23 Thread Stanislaw Gruszka
On Wed, Mar 22, 2017 at 10:56:02AM -0700, Ankur Arora wrote:
> >It is ok to do upload_pm_data() with delay i.e. after some other
> >resume actions are done and possibly xen-acpi-processor is in
> >running state ?
> The state uploaded is ACPI P and C state from struct acpi_processor
> which AFAICS is stable once inited so a delay would not lead to
> invalid state.
> The only concern would be the ACPI pCPU hotplug logic in
> acpi_processor_add() which could add a new entry in
> per_cpu(processors) but that also looks okay because either we
> get a NULL or we get a pointer to an inited structure.
> 
> As for the hypervisor -- that falls back to more limited state after
> resume (because some of this state is thrown away at suspend) and so
> uses that until it gets the uploaded PM state from the initial-domain.

Patch looks good to me then.

Reviewed-by: Stanislaw Gruszka 


Re: [PATCH 2/2 v2] xen/acpi: upload PM state from init-domain to Xen

2017-03-22 Thread Stanislaw Gruszka
On Tue, Mar 21, 2017 at 03:43:38PM -0700, Ankur Arora wrote:
> This was broken in commit cd979883b9ede90643e019f33cb317933eb867b4.
> do_suspend (from xen/manage.c) and thus xen_resume_notifier never get
> called on the initial-domain at resume (it is if running as guest.)
> 
> The rationale for the breaking change was that upload_pm_data()
> potentially does blocking work in syscore_resume(). This patch
> addresses the original issue by scheduling upload_pm_data() to
> execute in workqueue context.

It is ok to do upload_pm_data() with delay i.e. after some other
resume actions are done and possibly xen-acpi-processor is in
running state ?

Stanislaw


Re: [PATCH 2/2 v2] xen/acpi: upload PM state from init-domain to Xen

2017-03-22 Thread Stanislaw Gruszka
On Tue, Mar 21, 2017 at 03:43:38PM -0700, Ankur Arora wrote:
> This was broken in commit cd979883b9ede90643e019f33cb317933eb867b4.
> do_suspend (from xen/manage.c) and thus xen_resume_notifier never get
> called on the initial-domain at resume (it is if running as guest.)
> 
> The rationale for the breaking change was that upload_pm_data()
> potentially does blocking work in syscore_resume(). This patch
> addresses the original issue by scheduling upload_pm_data() to
> execute in workqueue context.

It is ok to do upload_pm_data() with delay i.e. after some other
resume actions are done and possibly xen-acpi-processor is in
running state ?

Stanislaw


Re: [GIT PULL] cputime: Convert core use of cputime_t to nsecs

2017-01-30 Thread Stanislaw Gruszka
On Mon, Jan 30, 2017 at 03:56:49PM +0100, Frederic Weisbecker wrote:
> On Mon, Jan 30, 2017 at 03:32:24PM +0100, Stanislaw Gruszka wrote:
> > On Mon, Jan 30, 2017 at 05:46:43AM +0100, Frederic Weisbecker wrote:
> > > Now lets admit one drawback: s390 and powerpc with
> > > CONFIG_VIRT_CPU_ACCOUNTING_NATIVE have new cputime_t to nsecs conversion
> > > on cputime accounting path. But this should be leveraged by the recent
> > > changes which delay the cputime accounting to tick and context switch. 
> > 
> > I think it would be worth to mention that there are other drawbacks on
> > 32bit architectures that use cputime-jiffies currently, like:
> > - cache utilization will be worse
> 
> Due to utime and stime becoming 64 bits? Yeah indeed.

:-) Well, it is possible that something that use to fit into one cache
line, now will be consuming 2 cache lines and results more cache misses.

In general, this patchset seems to be nice cleanup of code, but how it
affect runtime efficiency is not clear and might depend on .config
and architecture. However seems 32bit embedded users will be affected
negatively.

Stanislaw


Re: [GIT PULL] cputime: Convert core use of cputime_t to nsecs

2017-01-30 Thread Stanislaw Gruszka
On Mon, Jan 30, 2017 at 03:56:49PM +0100, Frederic Weisbecker wrote:
> On Mon, Jan 30, 2017 at 03:32:24PM +0100, Stanislaw Gruszka wrote:
> > On Mon, Jan 30, 2017 at 05:46:43AM +0100, Frederic Weisbecker wrote:
> > > Now lets admit one drawback: s390 and powerpc with
> > > CONFIG_VIRT_CPU_ACCOUNTING_NATIVE have new cputime_t to nsecs conversion
> > > on cputime accounting path. But this should be leveraged by the recent
> > > changes which delay the cputime accounting to tick and context switch. 
> > 
> > I think it would be worth to mention that there are other drawbacks on
> > 32bit architectures that use cputime-jiffies currently, like:
> > - cache utilization will be worse
> 
> Due to utime and stime becoming 64 bits? Yeah indeed.

:-) Well, it is possible that something that use to fit into one cache
line, now will be consuming 2 cache lines and results more cache misses.

In general, this patchset seems to be nice cleanup of code, but how it
affect runtime efficiency is not clear and might depend on .config
and architecture. However seems 32bit embedded users will be affected
negatively.

Stanislaw


Re: [GIT PULL] cputime: Convert core use of cputime_t to nsecs

2017-01-30 Thread Stanislaw Gruszka
On Mon, Jan 30, 2017 at 05:46:43AM +0100, Frederic Weisbecker wrote:
> Now lets admit one drawback: s390 and powerpc with
> CONFIG_VIRT_CPU_ACCOUNTING_NATIVE have new cputime_t to nsecs conversion
> on cputime accounting path. But this should be leveraged by the recent
> changes which delay the cputime accounting to tick and context switch. 

I think it would be worth to mention that there are other drawbacks on
32bit architectures that use cputime-jiffies currently, like:
- cache utilization will be worse
- conversion cputime_to_jiffies() and cputime_to_clock_t() (with HZ == USER_HS)
  will no longer be an no-op
- to keep values consistent will need to add protection of u64 store/load,
  which will create additional performance costs

Stanislaw


Re: [GIT PULL] cputime: Convert core use of cputime_t to nsecs

2017-01-30 Thread Stanislaw Gruszka
On Mon, Jan 30, 2017 at 05:46:43AM +0100, Frederic Weisbecker wrote:
> Now lets admit one drawback: s390 and powerpc with
> CONFIG_VIRT_CPU_ACCOUNTING_NATIVE have new cputime_t to nsecs conversion
> on cputime accounting path. But this should be leveraged by the recent
> changes which delay the cputime accounting to tick and context switch. 

I think it would be worth to mention that there are other drawbacks on
32bit architectures that use cputime-jiffies currently, like:
- cache utilization will be worse
- conversion cputime_to_jiffies() and cputime_to_clock_t() (with HZ == USER_HS)
  will no longer be an no-op
- to keep values consistent will need to add protection of u64 store/load,
  which will create additional performance costs

Stanislaw


Re: [PATCH 08/37] cputime: Convert task/group cputime to nsecs

2017-01-30 Thread Stanislaw Gruszka
On Sat, Jan 28, 2017 at 04:28:13PM +0100, Frederic Weisbecker wrote:
> On Sat, Jan 28, 2017 at 12:57:40PM +0100, Stanislaw Gruszka wrote:
> > On 32 bit architectures 64bit store/load is not atomic and if not
> > protected - 64bit variables can be mangled. I do not see any protection
> > (lock) between utime/stime store and load in the patch and seems that
> > {u/s}time store/load can be performed at the same time. Though problem
> > is very very improbable it still can happen at least theoretically when
> > lower and upper 32 bits are changed at the same time i.e. process
> > {u,s}time become near to multiple of 2**32 nsec (aprox: 4sec) and
> > 64bit {u,s}time is stored and loaded at the same time on different
> > cpus. As said this is very improbable situation, but eventually could
> > be possible on long lived processes.
> 
> "Improbable situation" doesn't appply to Linux. With millions (billion?)
> of machines using it, a rare issue in the core turns into likely to happen
> somewhere in the planet every second.
> 
> So it's definetly a race we want to consider. Note it goes beyond the scope
> of this patchset as the issue was already there before since cputime_t can 
> already
> map to u64 on 32 bits systems upstream. But this patchset definetly extends
> the issue on all 32 bits configs.
> 
> kcpustat has the same issue upstream. It's is made of u64 on all configs.

I would like to add what are possible consequences if value will be
mangled. For sum_exec_runtime, utime and stime we could get wrong values
on cpu-clock related syscalls like clock_gettime() or clock_nanosleep()
and cpu-clock timers like timer_create(CLOCK_PROCESS_CPUTIME_ID) can be
triggered before or long after expected. For kcpustat this seems to be
wrong values read by procfs and 3 drivers (cpufreq, appldata, macintosh).

> > I considering fixing problem of sum_exec_runtime possible mangling
> > by using prev_sum_exec_runtime:
> > 
> > u64 read_sum_exec_runtime(struct task_struct *t)
> > {
> >u64 ns, prev_ns;
> >  
> >do {
> >prev_ns = READ_ONCE(t->se.prev_sum_exec_runtime);
> >ns = READ_ONCE(t->se.sum_exec_runtime);
> >} while (ns < prev_ns || ns > (prev_ns + U32_MAX));
> >  
> >return ns;
> > }
> > 
> > This should work based on fact that prev_sum_exec_runtime and
> > sum_exec_runtime are not modified and stored at the same time, so only
> > one of those variabled can be mangled. Though I need to think about 
> > correctnes of that a bit more.
> 
> I'm not sure that would be enough. READ_ONCE prevents from reordering by the
> compiler but not by the CPU. You'd need memory barriers between reads and
> writes of prev_ns and ns.

It will not be enough, this _suppose_ to work based on that sum_exec_runtime
and prev_sum_exec_runtime are not written at the same time. i.e. only
one variable can be mangled as another one sits already in the memory.
However "not written at the same time" is weak part of reasoning. Even
if those variables are stored at different part of code (sum_exec_runtime
on update_curr() and prev_sum_exec_runtime on set_next_entity()) we can
not assume store of one variable is finished before another one starts.

>WRITE nsREAD prev_ns
>smp_wmb()   smp_rmb()
>WRITE prev_ns   READ ns
>smp_wmb()   smp_rmb()
>
> It seems to be the only way to make sure that at least one of the reads
> (prev_ns or ns) is correct.

I think you have right, but seems on much code paths we have scenario:

WRITE nsREAD prev_ns
smp_wmb()   smp_rmb()
WRITE prev_ns   READ ns

and we have already smp_wmb() after write of sum_exec_runtime on
update_min_vruntime().

Stanislaw


Re: [PATCH 08/37] cputime: Convert task/group cputime to nsecs

2017-01-30 Thread Stanislaw Gruszka
On Sat, Jan 28, 2017 at 04:28:13PM +0100, Frederic Weisbecker wrote:
> On Sat, Jan 28, 2017 at 12:57:40PM +0100, Stanislaw Gruszka wrote:
> > On 32 bit architectures 64bit store/load is not atomic and if not
> > protected - 64bit variables can be mangled. I do not see any protection
> > (lock) between utime/stime store and load in the patch and seems that
> > {u/s}time store/load can be performed at the same time. Though problem
> > is very very improbable it still can happen at least theoretically when
> > lower and upper 32 bits are changed at the same time i.e. process
> > {u,s}time become near to multiple of 2**32 nsec (aprox: 4sec) and
> > 64bit {u,s}time is stored and loaded at the same time on different
> > cpus. As said this is very improbable situation, but eventually could
> > be possible on long lived processes.
> 
> "Improbable situation" doesn't appply to Linux. With millions (billion?)
> of machines using it, a rare issue in the core turns into likely to happen
> somewhere in the planet every second.
> 
> So it's definetly a race we want to consider. Note it goes beyond the scope
> of this patchset as the issue was already there before since cputime_t can 
> already
> map to u64 on 32 bits systems upstream. But this patchset definetly extends
> the issue on all 32 bits configs.
> 
> kcpustat has the same issue upstream. It's is made of u64 on all configs.

I would like to add what are possible consequences if value will be
mangled. For sum_exec_runtime, utime and stime we could get wrong values
on cpu-clock related syscalls like clock_gettime() or clock_nanosleep()
and cpu-clock timers like timer_create(CLOCK_PROCESS_CPUTIME_ID) can be
triggered before or long after expected. For kcpustat this seems to be
wrong values read by procfs and 3 drivers (cpufreq, appldata, macintosh).

> > I considering fixing problem of sum_exec_runtime possible mangling
> > by using prev_sum_exec_runtime:
> > 
> > u64 read_sum_exec_runtime(struct task_struct *t)
> > {
> >u64 ns, prev_ns;
> >  
> >do {
> >prev_ns = READ_ONCE(t->se.prev_sum_exec_runtime);
> >ns = READ_ONCE(t->se.sum_exec_runtime);
> >} while (ns < prev_ns || ns > (prev_ns + U32_MAX));
> >  
> >return ns;
> > }
> > 
> > This should work based on fact that prev_sum_exec_runtime and
> > sum_exec_runtime are not modified and stored at the same time, so only
> > one of those variabled can be mangled. Though I need to think about 
> > correctnes of that a bit more.
> 
> I'm not sure that would be enough. READ_ONCE prevents from reordering by the
> compiler but not by the CPU. You'd need memory barriers between reads and
> writes of prev_ns and ns.

It will not be enough, this _suppose_ to work based on that sum_exec_runtime
and prev_sum_exec_runtime are not written at the same time. i.e. only
one variable can be mangled as another one sits already in the memory.
However "not written at the same time" is weak part of reasoning. Even
if those variables are stored at different part of code (sum_exec_runtime
on update_curr() and prev_sum_exec_runtime on set_next_entity()) we can
not assume store of one variable is finished before another one starts.

>WRITE nsREAD prev_ns
>smp_wmb()   smp_rmb()
>WRITE prev_ns   READ ns
>smp_wmb()   smp_rmb()
>
> It seems to be the only way to make sure that at least one of the reads
> (prev_ns or ns) is correct.

I think you have right, but seems on much code paths we have scenario:

WRITE nsREAD prev_ns
smp_wmb()   smp_rmb()
WRITE prev_ns   READ ns

and we have already smp_wmb() after write of sum_exec_runtime on
update_min_vruntime().

Stanislaw


Re: [PATCH 08/37] cputime: Convert task/group cputime to nsecs

2017-01-28 Thread Stanislaw Gruszka
Hi Frederic and sorry for late comment.

On Sun, Jan 22, 2017 at 07:19:44PM +0100, Frederic Weisbecker wrote:
> Now that most cputime readers use the transition API which return the
> task cputime in old style cputime_t, we can safely store the cputime in
> nsecs. This will eventually make cputime statistics less opaque and more
> granular. Back and forth convertions between cputime_t and nsecs in order
> to deal with cputime_t random granularity won't be needed anymore.

> - cputime_t utime;
> - cputime_t stime;
> + u64 utime;
> + u64 stime;
>   unsigned long long sum_exec_runtime;
>  };

> @@ -134,7 +134,7 @@ void account_user_time(struct task_struct *p, cputime_t 
> cputime)
>   int index;
>  
>   /* Add user time to process. */
> - p->utime += cputime;
> + p->utime += cputime_to_nsecs(cputime);
>   account_group_user_time(p, cputime);

> +void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
>  {
>   *ut = p->utime;
>   *st = p->stime;
> o }

On 32 bit architectures 64bit store/load is not atomic and if not
protected - 64bit variables can be mangled. I do not see any protection
(lock) between utime/stime store and load in the patch and seems that
{u/s}time store/load can be performed at the same time. Though problem
is very very improbable it still can happen at least theoretically when
lower and upper 32 bits are changed at the same time i.e. process
{u,s}time become near to multiple of 2**32 nsec (aprox: 4sec) and
64bit {u,s}time is stored and loaded at the same time on different
cpus. As said this is very improbable situation, but eventually could
be possible on long lived processes.

BTW we have already similar problem with sum_exec_runtime. I posted
some patches to solve the problem, but non of them was good:
- https://lkml.org/lkml/2016/9/1/172
  this one slow down scheduler hot path's and Peter hates it.

- https://lkml.org/lkml/2016/9/6/305
  this one was fine for Peter, but I dislike it for taking
  task_rq_lock() and do not push it forward.

I considering fixing problem of sum_exec_runtime possible mangling
by using prev_sum_exec_runtime:

u64 read_sum_exec_runtime(struct task_struct *t)
{
   u64 ns, prev_ns;
 
   do {
   prev_ns = READ_ONCE(t->se.prev_sum_exec_runtime);
   ns = READ_ONCE(t->se.sum_exec_runtime);
   } while (ns < prev_ns || ns > (prev_ns + U32_MAX));
 
   return ns;
}

This should work based on fact that prev_sum_exec_runtime and
sum_exec_runtime are not modified and stored at the same time, so only
one of those variabled can be mangled. Though I need to think about 
correctnes of that a bit more.

Stanislaw


Re: [PATCH 08/37] cputime: Convert task/group cputime to nsecs

2017-01-28 Thread Stanislaw Gruszka
Hi Frederic and sorry for late comment.

On Sun, Jan 22, 2017 at 07:19:44PM +0100, Frederic Weisbecker wrote:
> Now that most cputime readers use the transition API which return the
> task cputime in old style cputime_t, we can safely store the cputime in
> nsecs. This will eventually make cputime statistics less opaque and more
> granular. Back and forth convertions between cputime_t and nsecs in order
> to deal with cputime_t random granularity won't be needed anymore.

> - cputime_t utime;
> - cputime_t stime;
> + u64 utime;
> + u64 stime;
>   unsigned long long sum_exec_runtime;
>  };

> @@ -134,7 +134,7 @@ void account_user_time(struct task_struct *p, cputime_t 
> cputime)
>   int index;
>  
>   /* Add user time to process. */
> - p->utime += cputime;
> + p->utime += cputime_to_nsecs(cputime);
>   account_group_user_time(p, cputime);

> +void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
>  {
>   *ut = p->utime;
>   *st = p->stime;
> o }

On 32 bit architectures 64bit store/load is not atomic and if not
protected - 64bit variables can be mangled. I do not see any protection
(lock) between utime/stime store and load in the patch and seems that
{u/s}time store/load can be performed at the same time. Though problem
is very very improbable it still can happen at least theoretically when
lower and upper 32 bits are changed at the same time i.e. process
{u,s}time become near to multiple of 2**32 nsec (aprox: 4sec) and
64bit {u,s}time is stored and loaded at the same time on different
cpus. As said this is very improbable situation, but eventually could
be possible on long lived processes.

BTW we have already similar problem with sum_exec_runtime. I posted
some patches to solve the problem, but non of them was good:
- https://lkml.org/lkml/2016/9/1/172
  this one slow down scheduler hot path's and Peter hates it.

- https://lkml.org/lkml/2016/9/6/305
  this one was fine for Peter, but I dislike it for taking
  task_rq_lock() and do not push it forward.

I considering fixing problem of sum_exec_runtime possible mangling
by using prev_sum_exec_runtime:

u64 read_sum_exec_runtime(struct task_struct *t)
{
   u64 ns, prev_ns;
 
   do {
   prev_ns = READ_ONCE(t->se.prev_sum_exec_runtime);
   ns = READ_ONCE(t->se.sum_exec_runtime);
   } while (ns < prev_ns || ns > (prev_ns + U32_MAX));
 
   return ns;
}

This should work based on fact that prev_sum_exec_runtime and
sum_exec_runtime are not modified and stored at the same time, so only
one of those variabled can be mangled. Though I need to think about 
correctnes of that a bit more.

Stanislaw


[tip:sched/core] sched/cputime: Simplify task_cputime()

2016-11-15 Thread tip-bot for Stanislaw Gruszka
Commit-ID:  353c50ebe329daaf2c94dc41c1c481cbba2a31fd
Gitweb: http://git.kernel.org/tip/353c50ebe329daaf2c94dc41c1c481cbba2a31fd
Author: Stanislaw Gruszka <sgrus...@redhat.com>
AuthorDate: Tue, 15 Nov 2016 03:06:52 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Tue, 15 Nov 2016 09:51:05 +0100

sched/cputime: Simplify task_cputime()

Now since fetch_task_cputime() has no other users than task_cputime(),
its code could be used directly in task_cputime().

Moreover since only 2 task_cputime() calls of 17 use a NULL argument,
we can add dummy variables to those calls and remove NULL checks from
task_cputimes().

Also remove NULL checks from task_cputimes_scaled().

Signed-off-by: Stanislaw Gruszka <sgrus...@redhat.com>
Signed-off-by: Frederic Weisbecker <fweis...@gmail.com>
Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carst...@de.ibm.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Martin Schwidefsky <schwidef...@de.ibm.com>
Cc: Michael Neuling <mi...@neuling.org>
Cc: Paul Mackerras <pau...@ozlabs.org>
Cc: Paul Mackerras <pau...@samba.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Link: 
http://lkml.kernel.org/r/1479175612-14718-5-git-send-email-fweis...@gmail.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/kernel/apm_32.c   |  4 +--
 include/linux/sched.h  | 12 +++--
 kernel/sched/cputime.c | 57 +++---
 kernel/time/posix-cpu-timers.c |  4 +--
 4 files changed, 23 insertions(+), 54 deletions(-)

diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c
index c7364bd..d90749b 100644
--- a/arch/x86/kernel/apm_32.c
+++ b/arch/x86/kernel/apm_32.c
@@ -906,14 +906,14 @@ static int apm_cpu_idle(struct cpuidle_device *dev,
static int use_apm_idle; /* = 0 */
static unsigned int last_jiffies; /* = 0 */
static unsigned int last_stime; /* = 0 */
-   cputime_t stime;
+   cputime_t stime, utime;
 
int apm_idle_done = 0;
unsigned int jiffies_since_last_check = jiffies - last_jiffies;
unsigned int bucket;
 
 recalc:
-   task_cputime(current, NULL, );
+   task_cputime(current, , );
if (jiffies_since_last_check > IDLE_CALC_LIMIT) {
use_apm_idle = 0;
} else if (jiffies_since_last_check > idle_period) {
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f72e813..fe3ce46 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2248,10 +2248,8 @@ extern cputime_t task_gtime(struct task_struct *t);
 static inline void task_cputime(struct task_struct *t,
cputime_t *utime, cputime_t *stime)
 {
-   if (utime)
-   *utime = t->utime;
-   if (stime)
-   *stime = t->stime;
+   *utime = t->utime;
+   *stime = t->stime;
 }
 
 static inline cputime_t task_gtime(struct task_struct *t)
@@ -2265,10 +2263,8 @@ static inline void task_cputime_scaled(struct 
task_struct *t,
   cputime_t *utimescaled,
   cputime_t *stimescaled)
 {
-   if (utimescaled)
-   *utimescaled = t->utimescaled;
-   if (stimescaled)
-   *stimescaled = t->stimescaled;
+   *utimescaled = t->utimescaled;
+   *stimescaled = t->stimescaled;
 }
 #else
 static inline void task_cputime_scaled(struct task_struct *t,
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index ba55ebf..7700a9c 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -851,29 +851,25 @@ cputime_t task_gtime(struct task_struct *t)
  * add up the pending nohz execution time since the last
  * cputime snapshot.
  */
-static void
-fetch_task_cputime(struct task_struct *t,
-  cputime_t *u_dst, cputime_t *s_dst,
-  cputime_t *u_src, cputime_t *s_src,
-  cputime_t *udelta, cputime_t *sdelta)
+void task_cputime(struct task_struct *t, cputime_t *utime, cputime_t *stime)
 {
+   cputime_t delta;
unsigned int seq;
-   unsigned long long delta;
 
-   do {
-   *udelta = 0;
-   *sdelta = 0;
+   if (!vtime_accounting_enabled()) {
+   *utime = t->utime;
+   *stime = t->stime;
+   return;
+   }
 
+   do {
seq = read_seqcount_begin(>vtime_seqcount);
 
-   if (u_dst)
-   *u_dst = *u_src;
-   if (s_dst)
-   *s_dst = *s_src;
+   *utime = t->utime;
+   *stime = t->stime;
 
/* Task is sleeping, nothing to add */
-   if (t->vtime_snap_whence == VTIME_INACTIVE ||
-   is_idle_task(t))
+   if (

  1   2   3   4   5   >