from:"Jakub Kicinski"

Re: [PATCH v3] ipvs: Fix checksumming on GSO of SCTP packets

2024-04-25 Thread Jakub Kicinski

On Thu, 25 Apr 2024 18:28:40 +0200 Ismael Luceno wrote:
> Changes since v2:
> * Use only skb_is_gso, no need to check for GSO type

v2 is already in the tree, if the change is important you need to send
an incremental fix.

Re: [PATCH net-next v6] net/ipv4: add tracepoint for icmp_send

2024-04-25 Thread Jakub Kicinski

On Tue, 23 Apr 2024 17:23:39 +0800 (CST) xu.xi...@zte.com.cn wrote:
> +#include 
> \ No newline at end of file

Please fix.
-- 
pw-bot: cr

Re: [PATCH net-next v6 0/7] Implement reset reason mechanism to detect

2024-04-18 Thread Jakub Kicinski

On Thu, 18 Apr 2024 11:30:02 +0800 Jason Xing wrote:
> I'm not sure why the patch series has been changed to 'Changes
> Requested', until now I don't think I need to change something.
> 
> Should I repost this series (keeping the v6 tag) and then wait for
> more comments?

If Eric doesn't like it - it's not getting merged.

Re: [PATCH net-next v6] virtio_net: Support RX hash XDP hint

2024-04-12 Thread Jakub Kicinski

On Thu, 11 Apr 2024 16:52:16 +0800 Liang Chen wrote:
> + switch (__le16_to_cpu(hdr_hash->hash_report)) {
> + case VIRTIO_NET_HASH_REPORT_TCPv4:

Please indent things according to the kernel coding style.

Checkpatch finds 2 problems in this change.
-- 
pw-bot: cr

Re: [PATCH net-next 0/6] Implement reset reason mechanism to detect

2024-04-03 Thread Jakub Kicinski

On Wed,  3 Apr 2024 15:31:38 +0800 Jason Xing wrote:
> It's based on top of 
> https://patchwork.kernel.org/project/netdevbpf/list/?series=840182

Please post as RFC if there's a dependency.
We don't maintain patch queues for people.
-- 
pw-bot: cr

Re: [PATCH net v3] virtio_net: Do not send RSS key if it is not supported

2024-04-01 Thread Jakub Kicinski

On Sun, 31 Mar 2024 16:20:30 -0400 Michael S. Tsirkin wrote:
> > Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
> > Cc: sta...@vger.kernel.org  
> 
> net has its own stable process, don't CC stable on net patches.

Not any more, FWIW:

  1.5.7. Stable tree

  While it used to be the case that netdev submissions were not
  supposed to carry explicit CC: sta...@vger.kernel.org tags that is no
  longer the case today. Please follow the standard stable rules in
  Documentation/process/stable-kernel-rules.rst, and make sure you
  include appropriate Fixes tags!

https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#stable-tree

Re: [PATCH net-next v2 3/3] tcp: add location into reset trace process

2024-03-28 Thread Jakub Kicinski

On Tue, 26 Mar 2024 12:08:01 +0100 Paolo Abeni wrote:
> > -   TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
> > +   TP_PROTO(
> > +   const struct sock *sk,
> > +   const struct sk_buff *skb,
> > +   void *location),  
> 
> Very minor nit: the above lines should be aligned with the open
> bracket.

Yes, and a very odd way of breaking it up. Empty line after ( but
) not on a separate line.

> No need to repost just for this, but let's wait for Eric's feedback.

Erring on the side of caution I'd read this:
https://lore.kernel.org/all/CANn89iKK-qPhQ91Sq8rR_=KDWajnY2=et2bujdsgoqk4wxf...@mail.gmail.com/
as lukewarm towards tp changes. Please repost if you think otherwise
(with the formatting fixed)
-- 
pw-bot: cr

Re: [PATCH net-next v3 2/2] net: udp: add IP/port data to the tracepoint udp/udp_fail_queue_rcv_skb

2024-03-25 Thread Jakub Kicinski

On Mon, 25 Mar 2024 11:29:18 +0100 Balazs Scheidler wrote:
> +memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
> +memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));

Indent with tabs please, checkpatch says:

ERROR: code indent should use tabs where possible
#59: FILE: include/trace/events/udp.h:38:
+memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));$

WARNING: please, no spaces at the start of a line
#59: FILE: include/trace/events/udp.h:38:
+memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));$

ERROR: code indent should use tabs where possible
#60: FILE: include/trace/events/udp.h:39:
+memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));$

WARNING: please, no spaces at the start of a line
#60: FILE: include/trace/events/udp.h:39:
+memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));$
-- 
pw-bot: cr

Re: [PATCH net-next v2 0/3] tcp: make trace of reset logic complete

2024-03-25 Thread Jakub Kicinski

On Tue, 26 Mar 2024 10:13:55 +0800 Jason Xing wrote:
> Yesterday, I posted two series to do two kinds of things. They are not
> the same. Maybe you get me wrong :S

Ah, my bad, sorry about that. I see that they are different now.
One is v1 the other v2, both targeting tcp tracing... Easy to miss
in the post merge window rush :(

Re: [PATCH net-next v2 0/3] tcp: make trace of reset logic complete

2024-03-25 Thread Jakub Kicinski

On Mon, 25 Mar 2024 14:28:28 +0800 Jason Xing wrote:
> Before this, we miss some cases where the TCP layer could send rst but
> we cannot trace it. So I decided to complete it :)
> 
> v2
> 1. fix spelling mistakes

Not only do you post it before we "officially" open net-next but
also ignoring the 24h wait period.

https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#tl-dr

The main goal of the 24h rule is to stop people from bombarding us with
new versions for silly reasons.

You show know better than this, it's hardly your first contribution :(
-- 
pv-bot: 24h

Re: [PATCH v3 resend] net/ipv4: add tracepoint for icmp_send

2024-03-20 Thread Jakub Kicinski

On Thu, 21 Mar 2024 11:09:18 +0800 (CST) xu.xi...@zte.com.cn wrote:
> +/* This part must be outside protection */
> +#include 
> \ No newline at end of file

In addition to Jason's comments please make sure there is a new line at
the end of the file.

And please post v4 on Monday, net-next is currently closed.

While you wait have a read of:
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html

:)

Re: [RFC PATCH 0/2] net: provides dim profile fine-tuning channels

2024-03-14 Thread Jakub Kicinski

On Thu, 14 Mar 2024 21:09:31 +0800 Heng Qi wrote:
> The NetDIM library provides excellent acceleration for many modern
> network cards. However, the default profiles of DIM limits its maximum
> capabilities for different NICs, so providing a channel through which
> the NIC can be custom configured is necessary.

Given that DIM is currently enabled and disable via ethtool
why are you putting the API is sysfs and ops in ndo?

Re: [PATCH net-next 0/3] trace: use TP_STORE_ADDRS macro

2024-03-11 Thread Jakub Kicinski

On Sun, 10 Mar 2024 20:14:03 +0800 Jason Xing wrote:
> Using the macro for other tracepoints use to be more concise.
> No functional change.

The merge window for 6.9 has started and we try not to apply patches 
to net-next during the merge window. Please repost in 2 weeks, once
Linus has tagged v6.9-rc1.
-- 
pw-bot: defer

Re: [PATCH net-next v3] net: dqs: add NIC stall detector based on BQL

2024-02-13 Thread Jakub Kicinski

On Tue, 13 Feb 2024 14:57:49 +0100 Eric Dumazet wrote:
> Please note that adding other sysfs entries is expensive for workloads
> creating/deleting netdev and netns often.
> 
> I _think_ we should find a way for not creating
> /sys/class/net//queues/tx-{Q}/byte_queue_limits  directory
> and files
> for non BQL enabled devices (like loopback !)

We should try, see if anyone screams. We could use IFF_NO_QUEUE, and
NETIF_F_LLTX as a proxy for "device doesn't have a real queue so BQL 
would be pointless"? Obviously better to annotate the drivers which
do have BQL support, but there's >50 of them on a quick count..

Re: [PATCH RFC 0/4] Introduce uts_release

2024-02-02 Thread Jakub Kicinski

On Sat, 3 Feb 2024 00:01:26 +0900 Masahiro Yamada wrote:
> I do not see why it is useful.
> As you discussed in 3/4, if UTS_RELEASE is unneeded,
> it is better to get rid of it.

To be clear - the discussion on 3/4 was about the fact that netdev
already prints UTS_RELEASE into the version member of driver info
struct, as a default. So the drivers no longer have to. But there's 
no user-observable change there.

Re: [PATCH RFC 3/4] net: ethtool: Use uts_release

2024-02-01 Thread Jakub Kicinski

On Thu, 1 Feb 2024 14:20:23 +0100 Jiri Pirko wrote:
> >BTW, I assume that changes like this are also ok:
> >
> >8<-
> >
> >   net: team: Don't bother filling in ethtool driver version

Yup, just to be clear - you can send this independently from the series,
tag is as 

 [PATCH net-next]

we'll take it via the networking tree. I'm not sure which tree the other
patches will go thru..

Re: [PATCH RFC 3/4] net: ethtool: Use uts_release

2024-01-31 Thread Jakub Kicinski

On Wed, 31 Jan 2024 10:48:50 + John Garry wrote:
> Instead of using UTS_RELEASE, use uts_release, which means that we don't
> need to rebuild the code just for the git head commit changing.
> 
> Signed-off-by: John Garry 

Yes, please!

Acked-by: Jakub Kicinski

Re: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

2024-01-10 Thread Jakub Kicinski

On Wed, 10 Jan 2024 09:50:08 -0800 Shakeel Butt wrote:
> On Thu, Jan 4, 2024 at 1:44 PM Jakub Kicinski  wrote:
> > You seem to be trying hard to make struct netmem a thing.
> > Perhaps you have a reason I'm not getting?  
> 
> Mina already went with your suggestion and that is fine. To me, struct
> netmem is more aesthetically aligned with the existing struct
> encoded_page approach, but I don't have a strong opinion one way or
> the other. However it seems like you have a stronger preference for
> __bitwise approach. Is there a technical reason or just aesthetic?

Yes, right above the text you quoted:

  The __bitwise annotation will make catching people trying
  to cast to struct page * trivial.

https://lore.kernel.org/all/20240104134424.399fe...@kernel.org/

Re: [PATCH net-next v3 2/3] net: introduce abstraction for network memory

2024-01-04 Thread Jakub Kicinski

On Thu, 21 Dec 2023 15:44:22 -0800 Mina Almasry wrote:
> The warning is like so:
> 
> ./include/net/page_pool/helpers.h: In function ‘page_pool_alloc’:
> ./include/linux/stddef.h:8:14: warning: returning ‘void *’ from a
> function with return type ‘netmem_ref’ {aka ‘long unsigned int’} makes
> integer from pointer without a cast [-Wint-conversion]
> 8 | #define NULL ((void *)0)
>   |  ^
> ./include/net/page_pool/helpers.h:132:24: note: in expansion of macro
> ‘NULL’
>   132 | return NULL;
>   |^~~~
> 
> And happens in all the code where:
> 
> netmem_ref func()
> {
> return NULL;
> }
> 
> It's fixable by changing the return to `return (netmem_ref NULL);` or
> `return 0;`, but I feel like netmem_ref should be some type which
> allows a cast from NULL implicitly.

Why do you think we should be able to cast NULL implicitly?
netmem_ref is a handle, it could possibly be some form of 
an ID in the future, rather than a pointer. Or have more low
bits stolen for specific use cases.

unsigned long, and returning 0 as "no handle" makes perfect sense to me.

Note that 0 is a special case, bitwise types are allowed to convert
to 0/bool and 0 is implicitly allowed to become a bitwise type.
This will pass without a warning:

typedef unsigned long __bitwise netmem_ref;

netmem_ref some_code(netmem_ref ref)
{
// direct test is fine
if (!ref)
// 0 "upgrades" without casts
return 0;
// 1 does not, we need __force
return (__force netmem_ref)1 | ref;
}

The __bitwise annotation will make catching people trying
to cast to struct page * trivial.

You seem to be trying hard to make struct netmem a thing.
Perhaps you have a reason I'm not getting?

Re: [PATCH net-next 2/6] page_frag: unify gfp bit for order 3 page allocation

2023-12-06 Thread Jakub Kicinski

On Tue, 5 Dec 2023 19:34:40 +0800 Yunsheng Lin wrote:
> __GFP_DIRECT_RECLAIM is xor'd to avoid
> direct reclaim in skb_page_frag_refill(), but it is not
> xor'd in __page_frag_cache_refill().

xor is not the same thing as masking a bit off.
The patch itself LGTM.

Re: [net-next PATCH v3 3/3] net: phy: add support for PHY package MMD read/write

2023-12-05 Thread Jakub Kicinski

On Tue, 5 Dec 2023 15:10:50 + Russell King (Oracle) wrote:
> I've raised this before in other subsystems, and it's suggested that
> it's better to have it in the .c file. I guess the reason is that it's
> more obvious that the function is documented when modifying it, so
> there's a higher probability that the kdoc will get updated when the
> function is altered.

Plus I think people using IDEs (i.e. not me) may use the "jump to
definition" functionality, to find the doc? 

TBH I thought putting kdoc in the C source was documented in the coding
style, but I can't find any mention of it now.

Re: [PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR

2023-10-10 Thread Jakub Kicinski

On Tue, 10 Oct 2023 11:04:18 +0300 Jani Nikula wrote:
> > If you do invest in build testing automation, why can't your automation
> > count warnings rather than depend on WERROR? I don't understand.  
> 
> Because having both CI and the subsystem/driver developers enable a
> local WERROR actually works in keeping the subsystem/driver clean of
> warnings.
> 
> For i915, we also enable W=1 warnings and kernel-doc -Werror with it,
> keeping all of them warning clean. I don't much appreciate calling that
> anti-social.

Anti-social is not the right word, that's fair.

Werror makes your life easier while increasing the blast radius 
of your mistakes. So you're trading off your convenience for risk
of breakage to others. Note that you can fix issues locally very
quickly and move on. Others have to wait to get your patches thru
Linus.

> >> I disagree.  WERROR simply doesn't provide the same coverage.  E.g. it 
> >> can't be
> >> enabled for i386 without tuning FRAME_WARN, which (a) won't be at all 
> >> obvious to
> >> the average contributor and (b) increasing FRAME_WARN effectively reduces 
> >> the
> >> test coverage of KVM i386.
> >> 
> >> For KVM x86, I want the rules for contributing to be clearly documented, 
> >> and as
> >> simple as possible.  I don't see a sane way to achieve that with WERROR=y. 
> >>  
> 
> The DRM_I915_WERROR config depends on EXPERT and !COMPILE_TEST, and to
> my knowledge this has never caused issues outside of i915 developers and
> CI.

Ack, I think you do it right. I was trying to establish a precedent
so that we can delete these as soon as they cause an issue, not sooner.

Whatever tho, there's no accounting for taste.

Re: [PATCH 01/10] appletalk: remove localtalk and ppp support

2023-10-09 Thread Jakub Kicinski

On Mon,  9 Oct 2023 16:18:59 +0200 Arnd Bergmann wrote:
> The last localtalk driver is gone now, and ppp support was never fully
> merged, so clean up the appletalk code by removing the obvious dead
> code paths.
> 
> Notably, this removes one of the two callers of the old .ndo_do_ioctl()
> callback that was abused for getting device addresses and is now
> only used in the ieee802154 subsystem, which still uses the same trick.
> 
> The include/uapi/linux/if_ltalk.h header might still be required
> for building userspace programs, but I made sure that debian code
> search and the netatalk upstream have no references it it, so it
> should be fine to remove.

Looks like it depends on the ipddp driver removal.
Could you repost once that one is merged (~tomorrow)?
-- 
pw-bot: cr

Re: [PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR

2023-10-09 Thread Jakub Kicinski

On Mon, 9 Oct 2023 12:33:53 -0700 Sean Christopherson wrote:
> > We do have sympathy for these folks, we are mostly volunteers after
> > all. At the same time someone's under-investment should not be causing
> > pain to those of us who _do_ build test stuff carefully.  
> 
> This is a bit over the top.  Yeah, I need to add W=1 to my build scripts, but 
> that's
> not a lack of investment, just an oversight.  Though in this case it likely 
> wouldn't
> have made any difference since Paolo grabbed the patches directly and might 
> have
> even bypassed linux-next.  But again I would argue that's bad process, not a 
> lack
> of investment.

If you do invest in build testing automation, why can't your automation
count warnings rather than depend on WERROR? I don't understand.

> > Rather than tweak stuff I'd prefer if we could agree that local -Werror
> > is anti-social :(
> > 
> > The global WERROR seems to be a good compromise.  
> 
> I disagree.  WERROR simply doesn't provide the same coverage.  E.g. it can't 
> be
> enabled for i386 without tuning FRAME_WARN, which (a) won't be at all obvious 
> to
> the average contributor and (b) increasing FRAME_WARN effectively reduces the
> test coverage of KVM i386.
> 
> For KVM x86, I want the rules for contributing to be clearly documented, and 
> as
> simple as possible.  I don't see a sane way to achieve that with WERROR=y.

Linus, you created the global WERROR option. Do you have an opinion
on whether random subsystems should create their own WERROR flags?
W=1 warning got in thru KVM and since they have a KVM_WERROR which
defaults to enabled it broke build testing in networking.
Randomly sprinkled -Werrors are fragile. Can we ask people to stop
using them now that the global ERROR exists?

Re: [PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR

2023-10-09 Thread Jakub Kicinski

On Mon, 9 Oct 2023 10:43:43 -0700 Sean Christopherson wrote:
> On Fri, Oct 06, 2023, Jakub Kicinski wrote:
> > Setting WERROR for random subsystems make life really hard
> > for subsystems which want to build-test their stuff with W=1.
> > WERROR for the entire kernel now exists and can be used
> > instead. W=1 people probably know how to deal with the global
> > W=1 already, tracking all per-subsystem WERRORs is too much...  
> 
> I assume s/W=1/WERROR=y in this line?

Yes, sorry about that.

> > Link: 
> > https://lore.kernel.org/all/0da9874b6e9fcbaaa5edeb345d7e2a7c859fc818.1696271334.git.thomas.lenda...@amd.com/
> > Signed-off-by: Jakub Kicinski 
> > ---
> >  Documentation/process/maintainer-kvm-x86.rst |  2 +-
> >  arch/x86/kvm/Kconfig | 14 --
> >  arch/x86/kvm/Makefile|  1 -
> >  3 files changed, 1 insertion(+), 16 deletions(-)
> > 
> > diff --git a/Documentation/process/maintainer-kvm-x86.rst 
> > b/Documentation/process/maintainer-kvm-x86.rst
> > index 9183bd449762..cd70c0351108 100644
> > --- a/Documentation/process/maintainer-kvm-x86.rst
> > +++ b/Documentation/process/maintainer-kvm-x86.rst
> > @@ -243,7 +243,7 @@ context and disambiguate the reference.
> >  Testing
> >  ---
> >  At a bare minimum, *all* patches in a series must build cleanly for 
> > KVM_INTEL=m
> > -KVM_AMD=m, and KVM_WERROR=y.  Building every possible combination of 
> > Kconfigs
> > +KVM_AMD=m, and WERROR=y.  Building every possible combination of Kconfigs
> >  isn't feasible, but the more the merrier.  KVM_SMM, KVM_XEN, 
> > PROVE_LOCKING, and
> >  X86_64 are particularly interesting knobs to turn.
> >  
> > diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> > index ed90f148140d..12929324ac3e 100644
> > --- a/arch/x86/kvm/Kconfig
> > +++ b/arch/x86/kvm/Kconfig
> > @@ -63,20 +63,6 @@ config KVM
> >  
> >   If unsure, say N.
> >  
> > -config KVM_WERROR
> > -   bool "Compile KVM with -Werror"
> > -   # KASAN may cause the build to fail due to larger frames
> > -   default y if X86_64 && !KASAN  
> 
> Hrm, I am loath to give up KVM's targeted -Werror as it allows for more 
> aggresive
> enabling, e.g. enabling CONFIG_WERROR for i386 builds with other defaults 
> doesn't
> work because of CONFIG_FRAME_WARN=1024.  That in turns means making WERROR=y a
> requirement in maintainer-kvm-x86.rst is likely unreasonable.
> 
> And arguably KVM_WERROR is doing its job by flagging the linked W=1 error.  
> The
> problem there lies more in my build testing, which I'll go fix by adding a W=1
> configuration or three.  As the changelog notes, I highly doubt W=1 builds 
> work
> with WERROR, whereas keeping KVM x86 warning-free even with W=1 is feasible.
> 
> > -   # We use the dependency on !COMPILE_TEST to not be enabled
> > -   # blindly in allmodconfig or allyesconfig configurations
> > -   depends on KVM
> > -   depends on (X86_64 && !KASAN) || !COMPILE_TEST  
> 
> On a related topic, this is comically stale as WERROR is on by default for 
> both
> allmodconfig and allyesconfig, which work because they trigger 64-bit builds.
> And KASAN on x86 is 64-bit only.
> 
> Rather than yank out KVM_WERROR entirely, what if we make default=n and trim 
> the
> depends down to "KVM && EXPERT && !KASAN"?  E.g.

IMO setting WERROR is a bit perverse. The way I see it WERROR is a
crutch for people who don't have the time / infra to properly build
test changes they send to Linus. Or wait for build bots to do their job.
We do have sympathy for these folks, we are mostly volunteers after
all. At the same time someone's under-investment should not be causing
pain to those of us who _do_ build test stuff carefully.

Rather than tweak stuff I'd prefer if we could agree that local -Werror
is anti-social :(

The global WERROR seems to be a good compromise.

[PATCH] KVM: deprecate KVM_WERROR in favor of general WERROR

2023-10-06 Thread Jakub Kicinski

Setting WERROR for random subsystems make life really hard
for subsystems which want to build-test their stuff with W=1.
WERROR for the entire kernel now exists and can be used
instead. W=1 people probably know how to deal with the global
W=1 already, tracking all per-subsystem WERRORs is too much...

Link: 
https://lore.kernel.org/all/0da9874b6e9fcbaaa5edeb345d7e2a7c859fc818.1696271334.git.thomas.lenda...@amd.com/
Signed-off-by: Jakub Kicinski 
---
 Documentation/process/maintainer-kvm-x86.rst |  2 +-
 arch/x86/kvm/Kconfig | 14 --
 arch/x86/kvm/Makefile|  1 -
 3 files changed, 1 insertion(+), 16 deletions(-)

diff --git a/Documentation/process/maintainer-kvm-x86.rst 
b/Documentation/process/maintainer-kvm-x86.rst
index 9183bd449762..cd70c0351108 100644
--- a/Documentation/process/maintainer-kvm-x86.rst
+++ b/Documentation/process/maintainer-kvm-x86.rst
@@ -243,7 +243,7 @@ context and disambiguate the reference.
 Testing
 ---
 At a bare minimum, *all* patches in a series must build cleanly for KVM_INTEL=m
-KVM_AMD=m, and KVM_WERROR=y.  Building every possible combination of Kconfigs
+KVM_AMD=m, and WERROR=y.  Building every possible combination of Kconfigs
 isn't feasible, but the more the merrier.  KVM_SMM, KVM_XEN, PROVE_LOCKING, and
 X86_64 are particularly interesting knobs to turn.
 
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ed90f148140d..12929324ac3e 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -63,20 +63,6 @@ config KVM
 
  If unsure, say N.
 
-config KVM_WERROR
-   bool "Compile KVM with -Werror"
-   # KASAN may cause the build to fail due to larger frames
-   default y if X86_64 && !KASAN
-   # We use the dependency on !COMPILE_TEST to not be enabled
-   # blindly in allmodconfig or allyesconfig configurations
-   depends on KVM
-   depends on (X86_64 && !KASAN) || !COMPILE_TEST
-   depends on EXPERT
-   help
- Add -Werror to the build flags for KVM.
-
- If in doubt, say "N".
-
 config KVM_INTEL
tristate "KVM for Intel (and compatible) processors support"
depends on KVM && IA32_FEAT_CTL
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 80e3fe184d17..8e6afde7c680 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -1,7 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 
 ccflags-y += -I $(srctree)/arch/x86/kvm
-ccflags-$(CONFIG_KVM_WERROR) += -Werror
 
 ifeq ($(CONFIG_FRAME_POINTER),y)
 OBJECT_FILES_NON_STANDARD_vmenter.o := y
-- 
2.41.0

Re: [PATCH] net: appletalk: remove cops support

2023-10-04 Thread Jakub Kicinski

On Wed, 27 Sep 2023 11:00:30 +0200 Greg Kroah-Hartman wrote:
> The COPS Appletalk support is very old, never said to actually work
> properly, and the firmware code for the devices are under a very suspect
> license.  Remove it all to clear up the license issue, if it is still
> needed and actually used by anyone, we can add it back later once the
> license is cleared up.

Nice, Doug and Arnd also mentioned this in the past so let me add
them to the CC as I apply this...

Re: [PATCH 0/4] tracing: improve symbolic printing

2023-10-04 Thread Jakub Kicinski

On Wed, 4 Oct 2023 12:35:24 -0400 Steven Rostedt wrote:
> > Potentially naive question - the trace point holds enum skb_drop_reason.
> > The user space can get the names from BTF. Can we not teach user space
> > to generically look up names of enums in BTF?  
> 
> That puts a hard requirement to include BTF in builds where it was not
> needed before. I really do not want to build with BTF just to get access to
> these symbols. And since this is used by the embedded world, and BTF is
> extremely bloated, the short answer is "No".

Dunno. BTF is there most of the time. It could make the life of
majority of the users far more pleasant.

I hope we can at least agree that the current methods of generating 
the string arrays at C level are... aesthetically displeasing.

Re: [PATCH 0/4] tracing: improve symbolic printing

2023-10-04 Thread Jakub Kicinski

On Thu, 21 Sep 2023 10:51:30 +0200 Johannes Berg wrote:
> So I was frustrated with not seeing the names of SKB dropreasons
> for all but the core reasons, and then while looking into this
> all, realized, that the current __print_symbolic() is pretty bad
> anyway.
> 
> So I came up with a new approach, using a separate declaration
> of the symbols, and __print_sym() in there, but to userspace it
> all doesn't matter, it shows it the same way, just dyamically
> instead of munging with the strings all the time.
> 
> This is a huge .data savings as far as I can tell, with a modest
> amount (~4k) of .text addition, while making it all dynamic and
> in the SKB dropreason case even reusing the existing list that
> dropmonitor uses today. Surely patch 3 isn't needed here, but it
> felt right.
> 
> Anyway, I think it's a pretty reasonable approach overall, and
> it does works.
> 
> I've listed a number of open questions in the first patch since
> that's where the real changes for this are.

Potentially naive question - the trace point holds enum skb_drop_reason.
The user space can get the names from BTF. Can we not teach user space
to generically look up names of enums in BTF?

Re: [PATCH] net: called rtnl_unlock() before runpm resumes devices

2021-04-20 Thread Jakub Kicinski

On Tue, 20 Apr 2021 10:34:17 +0200 Eric Dumazet wrote:
> On Tue, Apr 20, 2021 at 9:54 AM AceLan Kao  wrote:
> >
> > From: "Chia-Lin Kao (AceLan)" 
> >
> > The rtnl_lock() has been called in rtnetlink_rcv_msg(), and then in
> > __dev_open() it calls pm_runtime_resume() to resume devices, and in
> > some devices' resume function(igb_resum,igc_resume) they calls rtnl_lock()
> > again. That leads to a recursive lock.
> >
> > It should leave the devices' resume function to decide if they need to
> > call rtnl_lock()/rtnl_unlock(), so call rtnl_unlock() before calling
> > pm_runtime_resume() and then call rtnl_lock() after it in __dev_open().
> >
> >  
> 
> Hi Acelan
> 
> When was the bugg added ?
> Please add a Fixes: tag

For immediate cause probably:

Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach")

> By doing so, you give more chances for reviewers to understand why the
> fix is not risky,
> and help stable teams work.

IMO the driver lacks internal locking. Taking rtnl from resume is just
one example, git history shows many more places that lacked locking and
got papered over with rtnl here.

Re: [PATCH net-next 4/6] r8152: support new chips

2021-04-20 Thread Jakub Kicinski

On Tue, 20 Apr 2021 07:00:39 + Hayes Wang wrote:
> > > @@ -6878,7 +8942,11 @@ static int rtl8152_probe(struct usb_interface 
> > > *intf,  
> > >   set_ethernet_addr(tp);
> > >
> > >   usb_set_intfdata(intf, tp);
> > > - netif_napi_add(netdev, >napi, r8152_poll, RTL8152_NAPI_WEIGHT);
> > > +
> > > + if (tp->support_2500full)
> > > + netif_napi_add(netdev, >napi, r8152_poll, 256);  
> > 
> > why 256? We have 100G+ drivers all using 64 what's special here?
> >   
> > > + else
> > > + netif_napi_add(netdev, >napi, r8152_poll, 64);  
> 
> We test 2.5G Ethernet on some embedded platform.
> And we find 64 is not large enough, and the performance
> couldn't reach 2.5 G bits/s.

Did you manage to identify what the cause is?

NAPI will keep calling your driver if the budget was exhausted, the
only difference between 64 and 256 should be the setup cost of the
driver's internal loop. And perhaps more frequent GRO flush - what's
the CONFIG_HZ set to?

[GIT PULL] Networking for 5.12-rc8

2021-04-16 Thread Jakub Kicinski

The following changes since commit 4e04e7513b0fa2fe8966a1c83fb473f1667e2810:

  Merge tag 'net-5.12-rc7' of 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (2021-04-09 15:26:51 
-0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git tags/net-5.12-rc8

for you to fetch changes up to f2764bd4f6a8dffaec3e220728385d9756b3c2cb:

  netlink: don't call ->netlink_bind with table lock held (2021-04-16 17:01:04 
-0700)


Networking fixes for 5.12-rc8, including fixes from netfilter,
and bpf. BPF verifier changes stand out, otherwise things have
slowed down.

Current release - regressions:

 - gro: ensure frag0 meets IP header alignment

 - Revert "net: stmmac: re-init rx buffers when mac resume back"

 - ethernet: macb: fix the restore of cmp registers

Previous releases - regressions:

 - ixgbe: Fix NULL pointer dereference in ethtool loopback test

 - ixgbe: fix unbalanced device enable/disable in suspend/resume

 - phy: marvell: fix detection of PHY on Topaz switches

 - make tcp_allowed_congestion_control readonly in non-init netns

 - xen-netback: Check for hotplug-status existence before watching

Previous releases - always broken:

 - bpf: mitigate a speculative oob read of up to map value size by
tightening the masking window

 - sctp: fix race condition in sctp_destroy_sock

 - sit, ip6_tunnel: Unregister catch-all devices

 - netfilter: nftables: clone set element expression template

 - netfilter: flowtable: fix NAT IPv6 offload mangling

 - net: geneve: check skb is large enough for IPv4/IPv6 header

 - netlink: don't call ->netlink_bind with table lock held

Signed-off-by: Jakub Kicinski 


Alexander Duyck (1):
  ixgbe: Fix NULL pointer dereference in ethtool loopback test

Aya Levin (2):
  net/mlx5: Fix setting of devlink traps in switchdev mode
  net/mlx5e: Fix setting of RS FEC mode

Christophe JAILLET (1):
  net: davicom: Fix regulator not turned off on failed probe

Ciara Loftus (1):
  libbpf: Fix potential NULL pointer dereference

Claudiu Beznea (1):
  net: macb: fix the restore of cmp registers

Colin Ian King (1):
  ice: Fix potential infinite loop when using u8 loop counter

Daniel Borkmann (9):
  bpf: Use correct permission flag for mixed signed bounds arithmetic
  bpf: Move off_reg into sanitize_ptr_alu
  bpf: Ensure off_reg has no mixed signed bounds for all types
  bpf: Rework ptr_limit into alu_limit and add common error path
  bpf: Improve verifier error messages for users
  bpf: Refactor and streamline bounds check into helper
  bpf: Move sanitize_val_alu out of op switch
  bpf: Tighten speculative pointer arithmetic mask
  bpf: Update selftests to reflect new error states

David S. Miller (7):
  Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
  Merge branch 'catch-all-devices'
  Merge branch 'ibmvnic-napi-fixes'
  Merge branch '10GbE' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
  Merge tag 'mlx5-fixes-2021-04-14' of 
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
  Merge branch 'ch_tlss-fixes'
  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Eric Dumazet (2):
  netfilter: nft_limit: avoid possible divide error in nft_limit_init
  gro: ensure frag0 meets IP header alignment

Florian Westphal (4):
  netfilter: bridge: add pre_exit hooks for ebtable unregistration
  netfilter: arp_tables: add pre_exit hook for table unregister
  netfilter: x_tables: fix compat match/target pad out-of-bound write
  netlink: don't call ->netlink_bind with table lock held

Heiner Kallweit (1):
  r8169: don't advertise pause in jumbo mode

Hristo Venev (2):
  net: sit: Unregister catch-all devices
  net: ip6_tunnel: Unregister catch-all devices

Jakub Kicinski (2):
  ethtool: fix kdoc attr name
  ethtool: pause: make sure we init driver stats

Jason Xing (1):
  i40e: fix the panic when running bpf in xdpdrv mode

Joakim Zhang (1):
  MAINTAINERS: update maintainer entry for freescale fec driver

Jonathon Reinhart (1):
  net: Make tcp_allowed_congestion_control readonly in non-init netns

Lijun Pan (5):
  ibmvnic: correctly use dev_consume/free_skb_irq
  ibmvnic: avoid calling napi_disable() twice
  ibmvnic: remove duplicate napi_schedule call in do_reset function
  ibmvnic: remove duplicate napi_schedule call in open function
  MAINTAINERS: update my email

Michael Brown (1):
  xen-netback: Check for hotplug-status existence before watching

Nicolas Dichtel (2):
  doc: move seg6_flowlabel to seg6-sysctl.rst
  vrf: fix a comment about loopback device

Or Cohen (1):
  net/sctp: fix race condition in sctp_destroy_sock

Pablo Neira Ayuso (3):
  netfil

Re: [PATCH net-next 4/6] r8152: support new chips

2021-04-16 Thread Jakub Kicinski

On Fri, 16 Apr 2021 16:04:35 +0800 Hayes Wang wrote:
> Support RTL8153C, RTL8153D, RTL8156A, and RTL8156B. The RTL8156A
> and RTL8156B are the 2.5G ethernet.
> 
> Signed-off-by: Hayes Wang 

> + switch (tp->version) {
> + case RTL_VER_10:
> + data = ocp_reg_read(tp, 0xad40);
> + data &= ~0x3ff;
> + data |= BIT(7) | BIT(2);
> + ocp_reg_write(tp, 0xad40, data);
> +
> + data = ocp_reg_read(tp, 0xad4e);
> + data |= BIT(4);
> + ocp_reg_write(tp, 0xad4e, data);
> + data = ocp_reg_read(tp, 0xad16);
> + data &= ~0x3ff;
> + data |= 0x6;
> + ocp_reg_write(tp, 0xad16, data);
> + data = ocp_reg_read(tp, 0xad32);
> + data &= ~0x3f;
> + data |= 6;
> + ocp_reg_write(tp, 0xad32, data);
> + data = ocp_reg_read(tp, 0xac08);
> + data &= ~(BIT(12) | BIT(8));
> + ocp_reg_write(tp, 0xac08, data);
> + data = ocp_reg_read(tp, 0xac8a);
> + data |= BIT(12) | BIT(13) | BIT(14);
> + data &= ~BIT(15);
> + ocp_reg_write(tp, 0xac8a, data);
> + data = ocp_reg_read(tp, 0xad18);
> + data |= BIT(10);
> + ocp_reg_write(tp, 0xad18, data);
> + data = ocp_reg_read(tp, 0xad1a);
> + data |= 0x3ff;
> + ocp_reg_write(tp, 0xad1a, data);
> + data = ocp_reg_read(tp, 0xad1c);
> + data |= 0x3ff;
> + ocp_reg_write(tp, 0xad1c, data);
> +
> + data = sram_read(tp, 0x80ea);
> + data &= ~0xff00;
> + data |= 0xc400;
> + sram_write(tp, 0x80ea, data);
> + data = sram_read(tp, 0x80eb);
> + data &= ~0x0700;
> + data |= 0x0300;
> + sram_write(tp, 0x80eb, data);
> + data = sram_read(tp, 0x80f8);
> + data &= ~0xff00;
> + data |= 0x1c00;
> + sram_write(tp, 0x80f8, data);
> + data = sram_read(tp, 0x80f1);
> + data &= ~0xff00;
> + data |= 0x3000;
> + sram_write(tp, 0x80f1, data);

> + switch (tp->version) {
> + case RTL_VER_12:
> + ocp_reg_write(tp, 0xbf86, 0x9000);
> + data = ocp_reg_read(tp, 0xc402);
> + data |= BIT(10);
> + ocp_reg_write(tp, 0xc402, data);
> + data &= ~BIT(10);
> + ocp_reg_write(tp, 0xc402, data);
> + ocp_reg_write(tp, 0xbd86, 0x1010);
> + ocp_reg_write(tp, 0xbd88, 0x1010);
> + data = ocp_reg_read(tp, 0xbd4e);
> + data &= ~(BIT(10) | BIT(11));
> + data |= BIT(11);
> + ocp_reg_write(tp, 0xbd4e, data);
> + data = ocp_reg_read(tp, 0xbf46);
> + data &= ~0xf00;
> + data |= 0x700;
> + ocp_reg_write(tp, 0xbf46, data);

> + data = r8153_phy_status(tp, 0);
> + switch (data) {
> + case PHY_STAT_EXT_INIT:
> + rtl8152_apply_firmware(tp, true);
> +
> + data = ocp_reg_read(tp, 0xa466);
> + data &= ~BIT(0);
> + ocp_reg_write(tp, 0xa466, data);

What are all these magic constants? :(

> @@ -6878,7 +8942,11 @@ static int rtl8152_probe(struct usb_interface *intf,
>   set_ethernet_addr(tp);
>  
>   usb_set_intfdata(intf, tp);
> - netif_napi_add(netdev, >napi, r8152_poll, RTL8152_NAPI_WEIGHT);
> +
> + if (tp->support_2500full)
> + netif_napi_add(netdev, >napi, r8152_poll, 256);

why 256? We have 100G+ drivers all using 64 what's special here?

> + else
> + netif_napi_add(netdev, >napi, r8152_poll, 64);

Re: [PATCH net-next 09/10] net: sparx5: add ethtool configuration and statistics support

2021-04-16 Thread Jakub Kicinski

On Fri, 16 Apr 2021 15:16:56 +0200 Steen Hegelund wrote:
> + "rx_in_bytes",
> + "rx_symbol_err",
> + "rx_pause",
> + "rx_unsup_opcode",
> + "rx_ok_bytes",
> + "rx_bad_bytes",
> + "rx_unicast",
> + "rx_multicast",
> + "rx_broadcast",
> + "rx_crc_err",
> + "rx_undersize",
> + "rx_fragments",
> + "rx_inrangelen_err",
> + "rx_outofrangelen_err",
> + "rx_oversize",
> + "rx_jabbers",
> + "rx_size64",
> + "rx_size65_127",
> + "rx_size128_255",
> + "rx_size256_511",
> + "rx_size512_1023",
> + "rx_size1024_1518",
> + "rx_size1519_max",
> + "pmac_rx_symbol_err",
> + "pmac_rx_pause",
> + "pmac_rx_unsup_opcode",
> + "pmac_rx_ok_bytes",
> + "pmac_rx_bad_bytes",
> + "pmac_rx_unicast",
> + "pmac_rx_multicast",
> + "pmac_rx_broadcast",
> + "pmac_rx_crc_err",
> + "pmac_rx_undersize",
> + "pmac_rx_fragments",
> + "pmac_rx_inrangelen_err",
> + "pmac_rx_outofrangelen_err",
> + "pmac_rx_oversize",
> + "pmac_rx_jabbers",
> + "pmac_rx_size64",
> + "pmac_rx_size65_127",
> + "pmac_rx_size128_255",
> + "pmac_rx_size256_511",
> + "pmac_rx_size512_1023",
> + "pmac_rx_size1024_1518",
> + "pmac_rx_size1519_max",

> + "rx_local_drop",
> + "rx_port_policer_drop",
> + "tx_out_bytes",
> + "tx_pause",
> + "tx_ok_bytes",
> + "tx_unicast",
> + "tx_multicast",
> + "tx_broadcast",
> + "tx_size64",
> + "tx_size65_127",
> + "tx_size128_255",
> + "tx_size256_511",
> + "tx_size512_1023",
> + "tx_size1024_1518",
> + "tx_size1519_max",
> + "tx_multi_coll",
> + "tx_late_coll",
> + "tx_xcoll",
> + "tx_defer",
> + "tx_xdefer",
> + "tx_backoff1",
> + "pmac_tx_pause",
> + "pmac_tx_ok_bytes",
> + "pmac_tx_unicast",
> + "pmac_tx_multicast",
> + "pmac_tx_broadcast",
> + "pmac_tx_size64",
> + "pmac_tx_size65_127",
> + "pmac_tx_size128_255",
> + "pmac_tx_size256_511",
> + "pmac_tx_size512_1023",
> + "pmac_tx_size1024_1518",
> + "pmac_tx_size1519_max",

Please see 

https://patchwork.kernel.org/project/netdevbpf/list/?series=468795

(hopefully to be merged by the end of the week) and earlier patches for
pause and FEC stats. Anything that has a standardized interface is off
limits for the random ethtool -S grab bag.

Re: [PATCH net-next 03/10] net: sparx5: add hostmode with phylink support

2021-04-16 Thread Jakub Kicinski

On Fri, 16 Apr 2021 15:16:50 +0200 Steen Hegelund wrote:
> +static int sparx5_set_mac_address(struct net_device *dev, void *p)
> +{
> + const struct sockaddr *addr = p;
> +
> + /* Record the address */
> + ether_addr_copy(dev->dev_addr, addr->sa_data);

I think you need to validate that add is a valid ethernet address.
is_valid_ether_addr()

> +struct net_device *sparx5_create_netdev(struct sparx5 *sparx5, u32 portno)
> +{
> + struct sparx5_port *spx5_port;
> + struct net_device *ndev;
> + u64 val;
> +
> + ndev = devm_alloc_etherdev(sparx5->dev, sizeof(struct sparx5_port));
> + if (!ndev)
> + return ERR_PTR(-ENOMEM);
> +
> + SET_NETDEV_DEV(ndev, sparx5->dev);
> + spx5_port = netdev_priv(ndev);
> + spx5_port->ndev = ndev;
> + spx5_port->sparx5 = sparx5;
> + spx5_port->portno = portno;
> + sparx5_set_port_ifh(spx5_port->ifh, portno);
> + snprintf(ndev->name, IFNAMSIZ, "eth%d", portno);

Please don't try to name interfaces in the kernel.

Re: [PATCH v6 net-next] net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)

2021-04-15 Thread Jakub Kicinski

On Wed, 14 Apr 2021 22:45:19 -0700 Dexuan Cui wrote:
> + buf = dma_alloc_coherent(gmi->dev, length, _handle,
> +  GFP_KERNEL | __GFP_ZERO);

No need for GFP_ZERO, dma_alloc_coherent() zeroes the memory these days.

> +static int mana_gd_register_irq(struct gdma_queue *queue,
> + const struct gdma_queue_spec *spec)
> ...
> + struct gdma_irq_context *gic;
> +
> + struct gdma_context *gc;

Why the empty line?

> + queue = kzalloc(sizeof(*queue), GFP_KERNEL);
> + if (!queue)
> + return -ENOMEM;
> +
> + gmi = >mem_info;
> + err = mana_gd_alloc_memory(gc, spec->queue_size, gmi);
> + if (err)
> + return err;

Leaks the memory from 'queue'?

Same code in mana_gd_create_mana_eq(), ...wq_cq(), etc.

> +int mana_do_attach(struct net_device *ndev, enum mana_attach_caller caller)
> +{
> + struct mana_port_context *apc = netdev_priv(ndev);
> + struct gdma_dev *gd = apc->ac->gdma_dev;
> + u32 max_txq, max_rxq, max_queues;
> + int port_idx = apc->port_idx;
> + u32 num_indirect_entries;
> + int err;
> +
> + if (caller == MANA_OPEN)
> + goto start_open;
> +
> + err = mana_init_port_context(apc);
> + if (err)
> + return err;
> +
> + err = mana_query_vport_cfg(apc, port_idx, _txq, _rxq,
> +_indirect_entries);
> + if (err) {
> + netdev_err(ndev, "Failed to query info for vPort 0\n");
> + goto reset_apc;
> + }
> +
> + max_queues = min_t(u32, max_txq, max_rxq);
> + if (apc->max_queues > max_queues)
> + apc->max_queues = max_queues;
> +
> + if (apc->num_queues > apc->max_queues)
> + apc->num_queues = apc->max_queues;
> +
> + memcpy(ndev->dev_addr, apc->mac_addr, ETH_ALEN);
> +
> + if (caller == MANA_PROBE)
> + return 0;
> +
> +start_open:

Why keep this as a single function, there is no overlap between what's
done for OPEN and PROBE, it seems.

Similarly detach should probably be split into clearly distinct parts.

> + err = mana_create_eq(apc);
> + if (err)
> + goto reset_apc;
> +
> + err = mana_create_vport(apc, ndev);
> + if (err)
> + goto destroy_eq;
> +
> + err = netif_set_real_num_tx_queues(ndev, apc->num_queues);
> + if (err)
> + goto destroy_vport;
> +
> + err = mana_add_rx_queues(apc, ndev);
> + if (err)
> + goto destroy_vport;
> +
> + apc->rss_state = apc->num_queues > 1 ? TRI_STATE_TRUE : TRI_STATE_FALSE;
> +
> + err = netif_set_real_num_rx_queues(ndev, apc->num_queues);
> + if (err)
> + goto destroy_vport;
> +
> + mana_rss_table_init(apc);
> +
> + err = mana_config_rss(apc, TRI_STATE_TRUE, true, true);
> + if (err)
> + goto destroy_vport;
> +
> + return 0;
> +
> +destroy_vport:
> + mana_destroy_vport(apc);
> +destroy_eq:
> + mana_destroy_eq(gd->gdma_context, apc);
> +reset_apc:
> + if (caller == MANA_OPEN)
> + return err;
> + kfree(apc->rxqs);
> + apc->rxqs = NULL;
> + return err;
> +}
> +
> +int mana_attach(struct net_device *ndev)
> +{
> + struct mana_port_context *apc = netdev_priv(ndev);
> + int err;
> +
> + ASSERT_RTNL();
> +
> + err = mana_do_attach(ndev, MANA_ATTACH);
> + if (err)
> + return err;
> +
> + netif_device_attach(ndev);
> +
> + apc->port_is_up = apc->port_st_save;
> +
> + /* Ensure port state updated before txq state */
> + smp_wmb();
> +
> + if (apc->port_is_up) {
> + netif_carrier_on(ndev);
> + netif_tx_wake_all_queues(ndev);
> + }
> +
> + return 0;
> +}

Re: bonding: 3ad: update slave arr after initialize

2021-04-15 Thread Jakub Kicinski

On Thu, 15 Apr 2021 14:59:49 +0800 jin yiting wrote:
>  From 71e63af579edd15ad7f7395760a19f67d9a1d7d3 Mon Sep 17 00:00:00 2001
> From: jin yiting 
> Date: Wed, 31 Mar 2021 20:38:40 +0800
> Subject: [PATCH] bonding: 3ad: update slave arr after initialize
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit

Please try git-send-email, the patch was malformed by Thunderbird.

Re: [PATCH v2 5/7] net: fec: make use of generic NET_SELFTESTS library

2021-04-15 Thread Jakub Kicinski

On Thu, 15 Apr 2021 15:07:36 +0200 Oleksij Rempel wrote:
> With this patch FEC on iMX will able to run generic net selftests
> 
> Signed-off-by: Oleksij Rempel 

allmodconfig build fails starting from this patch and still fails 
after patch 7:

net/core/selftests.o: In function `net_selftest':
selftests.c:(.text+0x75c): undefined reference to `phy_loopback'
selftests.c:(.text+0x7c2): undefined reference to `phy_loopback'

Re: [PATCH v5 net-next] net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)

2021-04-13 Thread Jakub Kicinski

On Mon, 12 Apr 2021 19:35:09 -0700 Dexuan Cui wrote:
> + apc->port_st_save = apc->port_is_up;
> + apc->port_is_up = false;
> + apc->start_remove = true;
> +
> + /* Ensure port state updated before txq state */
> + smp_wmb();
> +
> + netif_tx_disable(ndev);

In your napi poll method there is no barrier between port_is_up check
and netif_tx_queue_stopped().

> + netif_carrier_off(ndev);
> +
> + /* No packet can be transmitted now since apc->port_is_up is false.
> +  * There is still a tiny chance that mana_poll_tx_cq() can re-enable
> +  * a txq because it may not timely see apc->port_is_up being cleared
> +  * to false, but it doesn't matter since mana_start_xmit() drops any
> +  * new packets due to apc->port_is_up being false.
> +  *
> +  * Drain all the in-flight TX packets
> +  */
> + for (i = 0; i < apc->num_queues; i++) {
> + txq = >tx_qp[i].txq;
> +
> + while (atomic_read(>pending_sends) > 0)
> + usleep_range(1000, 2000);
> + }

> + /* All cleanup actions should stay after rtnl_lock(), otherwise
> +  * other functions may access partially cleaned up data.
> +  */
> + rtnl_lock();
> +
> + mana_detach(ndev);
> +
> + unregister_netdevice(ndev);
> +
> + rtnl_unlock();

I find the resource management somewhat strange. Why is mana_attach()
and mana_detach() called at probe/remove time, and not when the
interface is brought up? Presumably when the user ifdowns the interface
there is no point holding the resources? Your open/close methods are
rather empty.

> + if ((eq_addr & PAGE_MASK) != eq_addr)
> + return -EINVAL;
> +
> + if ((cq_addr & PAGE_MASK) != cq_addr)
> + return -EINVAL;
> +
> + if ((rq_addr & PAGE_MASK) != rq_addr)
> + return -EINVAL;
> +
> + if ((sq_addr & PAGE_MASK) != sq_addr)
> + return -EINVAL;

PAGE_ALIGNED()

Re: Regression v5.12-rc3: net: stmmac: re-init rx buffers when mac resume back

2021-04-13 Thread Jakub Kicinski

On Tue, 13 Apr 2021 18:06:39 +0200 Thierry Reding wrote:
> given where we are in the release cycle, I think it'd be best to revert
> commit 9c63faaa931e ("net: stmmac: re-init rx buffers when mac resume
> back") for now.
> 
> To summarize the discussion: the patch was meant as a workaround to fix
> an occasional suspend/resume failure on one board that was not fully
> root caused, and ends up causing fully reproducible suspend/resume
> failures on at least one other board.
> 
> Joakim is looking at an alternative solution and Jon and I can provide
> testing from the Tegra side for any fixes.
> 
> Do you want me to send a revert patch or can you revert directly on top
> of your tree?

Please send a revert with the justification in the commit log, and 
perhaps also:

Link: https://lore.kernel.org/r/708edb92-a5df-ecc4-3126-5ab36707e...@nvidia.com/

for those who want to dig deeper in the history. Make sure you 
CC relevant folks so they can review and express their opinions.

Thanks!

Re: [PATCH v4 net-next] net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)

2021-04-12 Thread Jakub Kicinski

On Sun, 11 Apr 2021 19:34:55 -0700 Dexuan Cui wrote:
> + for (i = 0; i < ANA_INDIRECT_TABLE_SIZE; i++)
> + apc->indir_table[i] = i % apc->num_queues;

ethtool_rxfh_indir_default()

> + err = mana_cfg_vport_steering(apc, rx, true, update_hash, update_tab);
> + return err;

return mana_...

please fix everywhere.

> + netif_set_real_num_tx_queues(ndev, apc->num_queues);
> +
> + err = mana_add_rx_queues(apc, ndev);
> + if (err)
> + goto destroy_vport;
> +
> + apc->rss_state = apc->num_queues > 1 ? TRI_STATE_TRUE : TRI_STATE_FALSE;
> +
> + netif_set_real_num_rx_queues(ndev, apc->num_queues);

netif_set_real_num_.. can fail.

> + rtnl_lock();
> +
> + netdev_lockdep_set_classes(ndev);
> +
> + ndev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
> + ndev->hw_features |= NETIF_F_RXCSUM;
> + ndev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6;
> + ndev->hw_features |= NETIF_F_RXHASH;
> + ndev->features = ndev->hw_features;
> + ndev->vlan_features = 0;
> +
> + err = register_netdevice(ndev);
> + if (err) {
> + netdev_err(ndev, "Unable to register netdev.\n");
> + goto destroy_vport;
> + }
> +
> + rtnl_unlock();
> +
> + return 0;
> +destroy_vport:
> + rtnl_unlock();

Why do you take rtnl_lock() explicitly around this code?

> +static int mana_set_channels(struct net_device *ndev,
> +  struct ethtool_channels *channels)
> +{
> + struct ana_port_context *apc = netdev_priv(ndev);
> + unsigned int new_count;
> + unsigned int old_count;
> + int err, err2;
> +
> + new_count = channels->combined_count;
> + old_count = apc->num_queues;
> +
> + if (new_count < 1 || new_count > apc->max_queues ||
> + channels->rx_count || channels->tx_count || channels->other_count)

All these checks should be done by the core already.

> + return -EINVAL;
> +
> + if (new_count == old_count)
> + return 0;

And so is this one.

> + err = mana_detach(ndev);
> + if (err) {
> + netdev_err(ndev, "mana_detach failed: %d\n", err);
> + return err;
> + }

Re: [PATCH net-next v2 1/1] net: stmmac: Add support for external trigger timestamping

2021-04-09 Thread Jakub Kicinski

Other than the minor nit below LGTM. Let's give Richard one more day.

On Thu,  8 Apr 2021 01:04:42 +0800 Wong Vee Khee wrote:
> +static void timestamp_interrupt(struct stmmac_priv *priv)
> +{
> + struct ptp_clock_event event;
> + unsigned long flags;
> + u32 num_snapshot;
> + u32 ts_status;
> + u32 tsync_int;
> + u64 ptp_time;
> + int i;
> +
> + tsync_int = readl(priv->ioaddr + GMAC_INT_STATUS) & GMAC_INT_TSIE;
> +
> + if (!tsync_int)
> + return;
> +
> + /* Read timestamp status to clear interrupt from either external
> +  * timestamp or start/end of PPS.
> +  */
> + ts_status = readl(priv->ioaddr + GMAC_TIMESTAMP_STATUS);
> +
> + if (priv->plat->ext_snapshot_en) {

Are you intending to add more code after this if? Otherwise you could
flip the condition and return early instead of having the extra level
of indentation.

> + num_snapshot = (ts_status & GMAC_TIMESTAMP_ATSNS_MASK) >>
> +GMAC_TIMESTAMP_ATSNS_SHIFT;
> +
> + for (i = 0; i < num_snapshot; i++) {
> + spin_lock_irqsave(>ptp_lock, flags);
> + get_ptptime(priv->ptpaddr, _time);
> + spin_unlock_irqrestore(>ptp_lock, flags);
> + event.type = PTP_CLOCK_EXTTS;
> + event.index = 0;
> + event.timestamp = ptp_time;
> + ptp_clock_event(priv->ptp_clock, );
> + }
> + }
> +}

Not really related to this patch but how does stmmac set IRQF_SHARED
and yet not track if it indeed generated the interrupt? Isn't that
against the rules?

Re: [PATCH] nfc: pn533: remove redundant assignment

2021-04-09 Thread Jakub Kicinski

On Fri,  9 Apr 2021 19:50:09 +0800 samirweng1979 wrote:
> From: wengjianfeng 
> 
> In many places,first assign a value to a variable and then return
> the variable. which is redundant, we should directly return the value.
> in pn533_rf_field funciton,return statement in the if statement is
> redundant, we just delete it.
> 
> Signed-off-by: wengjianfeng 

Thank you for the changes, please see comments below.

> diff --git a/drivers/nfc/pn533/pn533.c b/drivers/nfc/pn533/pn533.c
> index f1469ac..61ab4c0 100644
> --- a/drivers/nfc/pn533/pn533.c
> +++ b/drivers/nfc/pn533/pn533.c
> @@ -489,12 +489,8 @@ static int pn533_send_data_async(struct pn533 *dev, u8 
> cmd_code,
>pn533_send_async_complete_t complete_cb,
>void *complete_cb_context)
>  {
> - int rc;
> -
> - rc = __pn533_send_async(dev, cmd_code, req, complete_cb,
> + return __pn533_send_async(dev, cmd_code, req, complete_cb,
>   complete_cb_context);

Please realign the continuation line so it starts after the opening
bracket.

> -
> - return rc;
>  }
>  
>  static int pn533_send_cmd_async(struct pn533 *dev, u8 cmd_code,
> @@ -502,12 +498,8 @@ static int pn533_send_cmd_async(struct pn533 *dev, u8 
> cmd_code,
>   pn533_send_async_complete_t complete_cb,
>   void *complete_cb_context)
>  {
> - int rc;
> -
> - rc = __pn533_send_async(dev, cmd_code, req, complete_cb,
> + return __pn533_send_async(dev, cmd_code, req, complete_cb,
>   complete_cb_context);

Same here.

> - return rc;
>  }
>  
>  /*
> @@ -2614,7 +2606,6 @@ static int pn533_rf_field(struct nfc_dev *nfc_dev, u8 
> rf)
>(u8 *)_field, 1);
>   if (rc) {
>   nfc_err(dev->dev, "Error on setting RF field\n");
> - return rc;
>   }
>  
>   return rc;

In case some code is added between the check and the return statement
it'd be better to fix this by replacing the second return rc with
return 0:

if (rc) {
nfc_err(...);
return rc;
}

return 0;

> diff --git a/drivers/nfc/pn533/uart.c b/drivers/nfc/pn533/uart.c
> index a0665d8..6465348 100644
> --- a/drivers/nfc/pn533/uart.c
> +++ b/drivers/nfc/pn533/uart.c
> @@ -239,9 +239,8 @@ static int pn532_uart_probe(struct serdev_device *serdev)
>  {
>   struct pn532_uart_phy *pn532;
>   struct pn533 *priv;
> - int err;
> + int err = -ENOMEM;
>  
> - err = -ENOMEM;
>   pn532 = kzalloc(sizeof(*pn532), GFP_KERNEL);

IMO having the assignment before the call is more readable, please
leave as is.

>   if (!pn532)
>   goto err_exit;

[GIT PULL] Networking for 5.12-rc7

2021-04-09 Thread Jakub Kicinski

The following changes since commit 002322402dafd846c424ffa9240a937f49b48c42:

  Merge branch 'akpm' (patches from Andrew) (2021-03-25 11:43:43 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git tags/net-5.12-rc7

for you to fetch changes up to 27f0ad71699de41bae013c367b95a6b319cc46a9:

  net: fix hangup on napi_disable for threaded napi (2021-04-09 12:50:31 -0700)


Networking fixes for 5.12-rc7, including fixes from can, ipsec,
mac80211, wireless, and bpf trees. No scary regressions here
or in the works, but small fixes for 5.12 changes keep coming.

Current release - regressions:

 - virtio: do not pull payload in skb->head

 - virtio: ensure mac header is set in virtio_net_hdr_to_skb()

 - Revert "net: correct sk_acceptq_is_full()"

 - mptcp: revert "mptcp: provide subflow aware release function"

 - ethernet: lan743x: fix ethernet frame cutoff issue

 - dsa: fix type was not set for devlink port

 - ethtool: remove link_mode param and derive link params
from driver

 - sched: htb: fix null pointer dereference on a null new_q

 - wireless: iwlwifi: Fix softirq/hardirq disabling in
  iwl_pcie_enqueue_hcmd()

 - wireless: iwlwifi: fw: fix notification wait locking

 - wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding
the rtnl dependency

Current release - new code bugs:

 - napi: fix hangup on napi_disable for threaded napi

 - bpf: take module reference for trampoline in module

 - wireless: mt76: mt7921: fix airtime reporting and related
   tx hangs

 - wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending
config command

Previous releases - regressions:

 - rfkill: revert back to old userspace API by default

 - nfc: fix infinite loop, refcount & memory leaks in LLCP sockets

 - let skb_orphan_partial wake-up waiters

 - xfrm/compat: Cleanup WARN()s that can be user-triggered

 - vxlan, geneve: do not modify the shared tunnel info when PMTU
  triggers an ICMP reply

 - can: fix msg_namelen values depending on CAN_REQUIRED_SIZE

 - can: uapi: mark union inside struct can_frame packed

 - sched: cls: fix action overwrite reference counting

 - sched: cls: fix err handler in tcf_action_init()

 - ethernet: mlxsw: fix ECN marking in tunnel decapsulation

 - ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx

 - ethernet: i40e: fix receiving of single packets in xsk zero-copy
   mode

 - ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic

Previous releases - always broken:

 - bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET

 - bpf: Refcount task stack in bpf_get_task_stack

 - bpf, x86: Validate computation of branch displacements

 - ieee802154: fix many similar syzbot-found bugs
- fix NULL dereferences in netlink attribute handling
- reject unsupported operations on monitor interfaces
- fix error handling in llsec_key_alloc()

 - xfrm: make ipv4 pmtu check honor ip header df

 - xfrm: make hash generation lock per network namespace

 - xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp
  offload

 - ethtool: fix incorrect datatype in set_eee ops

 - xdp: fix xdp_return_frame() kernel BUG throw for page_pool
memory model

 - openvswitch: fix send of uninitialized stack memory in ct limit
reply

Misc:

 - udp: add get handling for UDP_GRO sockopt

Signed-off-by: Jakub Kicinski 


A. Cody Schuffelen (1):
  virt_wifi: Return micros for BSS TSF values

Aditya Pakki (1):
  net/rds: Avoid potential use after free in rds_send_remove_from_sock

Ahmed S. Darwish (2):
  net: xfrm: Localize sequence counter per network namespace
  net: xfrm: Use sequence counter with associated spinlock

Alex Shi (1):
  net/ieee802154: remove unused macros to tame gcc

Alexander Aring (19):
  net: ieee802154: fix nl802154 del llsec key
  net: ieee802154: fix nl802154 del llsec dev
  net: ieee802154: fix nl802154 add llsec key
  net: ieee802154: fix nl802154 del llsec devkey
  net: ieee802154: nl-mac: fix check on panid
  net: ieee802154: forbid monitor for set llsec params
  net: ieee802154: stop dump llsec keys for monitors
  net: ieee802154: forbid monitor for add llsec key
  net: ieee802154: forbid monitor for del llsec key
  net: ieee802154: stop dump llsec devs for monitors
  net: ieee802154: forbid monitor for add llsec dev
  net: ieee802154: forbid monitor for del llsec dev
  net: ieee802154: stop dump llsec devkeys for monitors
  net: ieee802154: forbid monitor for add llsec devkey
  net: ieee802154: forbid monitor for del llsec devkey
  net: ieee802154: stop dump llsec seclevels for

Re: [PATCH net-next 2/3] net: use skb_for_each_frag() helper where possible

2021-04-09 Thread Jakub Kicinski

On Fri, 9 Apr 2021 22:44:50 +0200 Matteo Croce wrote:
> > What pops to mind (although quite nit picky) is the question if the
> > assembly changes much between driver which used to cache nr_frags and
> > now always going skb_shinfo(skb)->nr_frags? It's a relatively common
> > pattern.  
> 
> Since skb_shinfo() is a macro and skb_end_pointer() a static inline,
> it should be the same, but I was curious to check so, this is a diff
> between the following snippet before and afer the macro:
> 
> int frags = skb_shinfo(skb)->nr_frags;
> int i;
> for (i = 0; i < frags; i++)
> kfree(skb->frags[i]);
> 
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> --- ins1.s 2021-04-09 22:35:59.384523865 +0200
> +++ ins2.s 2021-04-09 22:36:08.132594737 +0200
> @@ -1,26 +1,27 @@
>  iter:
>  movsx   rax, DWORD PTR [rdi+16]
>  mov rdx, QWORD PTR [rdi+8]
>  mov eax, DWORD PTR [rdx+rax]
>  testeax, eax
>  jle .L6
>  pushrbp
> -sub eax, 1
> +mov rbp, rdi
>  pushrbx
> -lea rbp, [rdi+32+rax*8]
> -lea rbx, [rdi+24]
> +xor ebx, ebx
>  sub rsp, 8
>  .L3:
> -mov rdi, QWORD PTR [rbx]
> -add rbx, 8
> +mov rdi, QWORD PTR [rbp+24+rbx*8]
> +add rbx, 1
>  callkfree
> -cmp rbx, rbp
> -jne .L3
> +movsx   rax, DWORD PTR [rbp+16]
> +mov rdx, QWORD PTR [rbp+8]
> +cmp DWORD PTR [rdx+rax], ebx
> +jg  .L3
>  add rsp, 8
>  xor eax, eax
>  pop rbx
>  pop rbp
>  ret
>  .L6:
>  xor eax, eax
>  for (i = 0; i < frags; i++)ret
> 

So looks like before compiler generated:

end = [nfrags]
for (ptr = [0]; ptr < end; ptr++)

and now it has to use the actual value of i, read nfrags in the loop
each time and compare it to i.

That makes sense, since it can't prove kfree() doesn't change nr_frags.

IDK if we care, but at least commit message should mention this.

Re: [PATCH] phy: nxp-c45: add driver for tja1103

2021-04-09 Thread Jakub Kicinski

On Fri,  9 Apr 2021 21:41:06 +0300 Radu Pirea (NXP OSS) wrote:
> Add driver for tja1103 driver and for future NXP C45 PHYs.
> 
> Signed-off-by: Radu Pirea (NXP OSS) 

drivers/net/phy/nxp-c45: struct mdio_device_id is 8 bytes.  The last of 1 is:
0x10 0xb0 0x1b 0x00 0xf0 0xff 0xff 0xff 
FATAL: modpost: drivers/net/phy/nxp-c45: struct mdio_device_id is not 
terminated with a NULL entry!
make[2]: *** [Module.symvers] Error 1
make[1]: *** [modules] Error 2
make: *** [__sub-make] Error 2

Re: [PATCH net-next v2 3/5] page_pool: Allow drivers to hint on SKB recycling

2021-04-09 Thread Jakub Kicinski

On Fri, 9 Apr 2021 22:01:51 +0300 Ilias Apalodimas wrote:
> On Fri, Apr 09, 2021 at 11:56:48AM -0700, Jakub Kicinski wrote:
> > On Fri,  2 Apr 2021 20:17:31 +0200 Matteo Croce wrote:  
> > > Co-developed-by: Jesper Dangaard Brouer 
> > > Co-developed-by: Matteo Croce 
> > > Signed-off-by: Ilias Apalodimas   
> > 
> > Checkpatch says we need sign-offs from all authors.
> > Especially you since you're posting.  
> 
> Yes it does, we forgot that.  Let me take a chance on this one. 
> The patch is changing the default skb return path and while we've done enough
> testing, I would really prefer this going in on a future -rc1 (assuming we 
> even
> consider merging it), allowing enough time to have wider tests.

Up to you guys. FWIW if you decide to try for 5.13 the missing signoffs
can be posted in replies, no need to repost.

Re: [PATCH net-next v2 3/5] page_pool: Allow drivers to hint on SKB recycling

2021-04-09 Thread Jakub Kicinski

On Fri,  2 Apr 2021 20:17:31 +0200 Matteo Croce wrote:
> Co-developed-by: Jesper Dangaard Brouer 
> Co-developed-by: Matteo Croce 
> Signed-off-by: Ilias Apalodimas 

Checkpatch says we need sign-offs from all authors.
Especially you since you're posting.

Re: [PATCH net-next 2/3] net: use skb_for_each_frag() helper where possible

2021-04-09 Thread Jakub Kicinski

On Fri,  9 Apr 2021 20:06:04 +0200 Matteo Croce wrote:
> From: Matteo Croce 
> 
> use the new helper macro skb_for_each_frag() which allows to iterate
> through all the SKB fragments.
> 
> The patch was created with Coccinelle, this was the semantic patch:

Bunch of set but not used warnings here. Please make sure the code
builds cleanly allmodconfig, W=1 C=1 before posting.

What pops to mind (although quite nit picky) is the question if the
assembly changes much between driver which used to cache nr_frags and
now always going skb_shinfo(skb)->nr_frags? It's a relatively common
pattern.

Re: [PATCH net-next 0/3] net: add new properties for of_get_mac_address from nvmem

2021-04-09 Thread Jakub Kicinski

On Fri,  9 Apr 2021 17:07:08 +0800 Joakim Zhang wrote:
> This patch set adds new properties for of_get_mac_address from nvmem.

Apart from addressing Rob's (and potentially other comments to come)
please also make sure to rebase before posting. This series doesn't
seem to apply to net-next.

Re: [RFC] net: core: devlink: add port_params_ops for devlink port parameters altering

2021-04-09 Thread Jakub Kicinski

On Fri, 9 Apr 2021 20:01:14 +0300 Vadym Kochan wrote:
> On Fri, Apr 09, 2021 at 09:51:13AM -0700, Samudrala, Sridhar wrote:
> > On 4/9/2021 9:22 AM, Oleksandr Mazur wrote:  
> > > I'd like to discuss a possibility of handling devlink port parameters
> > > with devlink port pointer supplied.
> > > 
> > > Current design makes it impossible to distinguish which port's parameter
> > > should get altered (set) or retrieved (get) whenever there's a single
> > > parameter registered within a few ports.  
> > 
> > I also noticed this issue recently when trying to add port parameters and
> > I have a patch that handles this in a different way. The ops in 
> > devlink_param
> > struct can be updated to include port_index as an argument
> 
> We were thinking on this direction but rather decided to have more strict
> cb signature which reflects that we are working with devlink_port only.

+1 for passing the actual pointer

Re: [RFC net-next 1/1] seg6: add counters support for SRv6 Behaviors

2021-04-07 Thread Jakub Kicinski

On Wed,  7 Apr 2021 20:03:32 +0200 Andrea Mayer wrote:
> This patch provides counters for SRv6 Behaviors as defined in [1], section
> 6. For each SRv6 Behavior instance, the counters defined in [1] are:
> 
>  - the total number of packets that have been correctly processed;
>  - the total amount of traffic in bytes of all packets that have been
>correctly processed;
> 
> In addition, we introduces a new counter that counts the number of packets
> that have NOT been properly processed (i.e. errors) by an SRv6 Behavior
> instance.
> 
> Each SRv6 Behavior instance can be configured, at the time of its creation,
> to make use of counters.
> This is done through iproute2 which allows the user to create an SRv6
> Behavior instance specifying the optional "count" attribute as shown in the
> following example:
> 
>  $ ip -6 route add 2001:db8::1 encap seg6local action End count dev eth0
> 
> per-behavior counters can be shown by adding "-s" to the iproute2 command
> line, i.e.:
> 
>  $ ip -s -6 route show 2001:db8::1
>  2001:db8::1 encap seg6local action End packets 0 bytes 0 errors 0 dev eth0
> 
> [1] https://www.rfc-editor.org/rfc/rfc8986.html#name-counters
> 
> Signed-off-by: Andrea Mayer 

> +static int put_nla_counters(struct sk_buff *skb, struct seg6_local_lwt *slwt)
> +{
> + struct seg6_local_counters counters = { 0, 0, 0 };
> + struct nlattr *nla;
> + int i;
> +
> + nla = nla_reserve(skb, SEG6_LOCAL_COUNTERS, sizeof(counters));
> + if (!nla)
> + return -EMSGSIZE;

nla_reserve_64bit(), IIUC netlink guarantees alignment of 64 bit values.

Re: [igb] netconsole triggers warning in netpoll_poll_dev

2021-04-07 Thread Jakub Kicinski

On Wed, 7 Apr 2021 09:25:28 -0700 Alexander Duyck wrote:
> On Wed, Apr 7, 2021 at 8:37 AM Jakub Kicinski  wrote:
> >
> > On Wed, 7 Apr 2021 08:00:53 +0200 Oleksandr Natalenko wrote:  
> > > Thanks for the effort, but reportedly [1] it made no difference,
> > > unfortunately.
> > >
> > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=212573#c8  
> >
> > The only other option I see is that somehow the NAPI has no rings.
> >
> > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
> > b/drivers/net/ethernet/intel/igb/igb_main.c
> > index a45cd2b416c8..24568adc2fb1 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_main.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> > @@ -7980,7 +7980,7 @@ static int igb_poll(struct napi_struct *napi, int 
> > budget)
> > struct igb_q_vector *q_vector = container_of(napi,
> >  struct igb_q_vector,
> >  napi);
> > -   bool clean_complete = true;
> > +   bool clean_complete = q_vector->tx.ring || q_vector->rx.ring;
> > int work_done = 0;
> >
> >  #ifdef CONFIG_IGB_DCA  
> 
> It might make sense to just cast the work_done as a unsigned int, and
> then on the end of igb_poll use:
>   return min_t(unsigned int, work_done, budget - 1);

Sure, that's simplest. I wasn't sure something is supposed to prevent
this condition or if it's okay to cover it up.

Re: [igb] netconsole triggers warning in netpoll_poll_dev

2021-04-07 Thread Jakub Kicinski

On Wed, 7 Apr 2021 08:00:53 +0200 Oleksandr Natalenko wrote:
> Thanks for the effort, but reportedly [1] it made no difference,
> unfortunately.
> 
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=212573#c8

The only other option I see is that somehow the NAPI has no rings.

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index a45cd2b416c8..24568adc2fb1 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -7980,7 +7980,7 @@ static int igb_poll(struct napi_struct *napi, int budget)
struct igb_q_vector *q_vector = container_of(napi,
 struct igb_q_vector,
 napi);
-   bool clean_complete = true;
+   bool clean_complete = q_vector->tx.ring || q_vector->rx.ring;
int work_done = 0;
 
 #ifdef CONFIG_IGB_DCA

Re: [PATCH net v1 1/1] ethtool: fix incorrect datatype in set_eee ops

2021-04-06 Thread Jakub Kicinski

On Tue,  6 Apr 2021 21:17:30 +0800 Wong Vee Khee wrote:
> The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool
> header file. Hence, we should use ethnl_update_u32() in set_eee ops.
> 
> Fixes: fd77be7bd43c ("ethtool: set EEE settings with EEE_SET request")
> Cc:  # 5.10.x
> Cc: Michal Kubecek 
> Signed-off-by: Wong Vee Khee 

Reviewed-by: Jakub Kicinski

Re: [igb] netconsole triggers warning in netpoll_poll_dev

2021-04-06 Thread Jakub Kicinski

On Tue, 6 Apr 2021 14:36:19 +0200 Oleksandr Natalenko wrote:
> Hello.
> 
> I've raised this here [1] first, but was suggested to engage igb devs,
> so here we are.
> 
> I'm experiencing the following woes while using netconsole regularly:
> 
> ```
> [22038.710800] [ cut here ]
> [22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll
> [22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 
> netpoll_poll_dev+0x18a/0x1a0
> [22038.710802] Modules linked in: ...
> [22038.710835] CPU: 12 PID: 40362 Comm: systemd-sleep Not tainted 5.11.0-pf7 
> #1
> [22038.710836] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 
> 3302 03/05/2021
> [22038.710836] RIP: 0010:netpoll_poll_dev+0x18a/0x1a0
> [22038.710837] Code: 6e ff 80 3d d2 9d f8 00 00 0f 85 5c ff ff ff 48 8b 73 28 
> 48 c7 c7 0c b8 21 84 89 44 24 04 c6 05 b6 9d f8 00 01 e8 84 21 1c 00 <0f> 0b 
> 8b 54 24 04 e9 36 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00
> [22038.710838] RSP: 0018:b24106e37ba0 EFLAGS: 00010086
> [22038.710838] RAX:  RBX: 9599d2929c50 RCX: 
> 959f8ed1ac30
> [22038.710839] RDX:  RSI: 0023 RDI: 
> 959f8ed1ac28
> [22038.710839] RBP: 9598981d4058 R08: 0019 R09: 
> b24206e3796d
> [22038.710839] R10:  R11: b24106e37968 R12: 
> 959887e51ec8
> [22038.710840] R13: 000c R14:  R15: 
> 9599d2929c60
> [22038.710840] FS:  7f3ade370a40() GS:959f8ed0() 
> knlGS:
> [22038.710841] CS:  0010 DS:  ES:  CR0: 80050033
> [22038.710841] CR2:  CR3: 0003017b CR4: 
> 00350ee0
> [22038.710841] Call Trace:
> [22038.710842]  netpoll_send_skb+0x185/0x240
> [22038.710842]  write_msg+0xe5/0x100 [netconsole]
> [22038.710842]  console_unlock+0x37d/0x640
> [22038.710842]  ? __schedule+0x2e5/0xc90
> [22038.710843]  suspend_devices_and_enter+0x2ac/0x7f0
> [22038.710843]  pm_suspend.cold+0x321/0x36c
> [22038.710843]  state_store+0xa6/0x140
> [22038.710844]  kernfs_fop_write_iter+0x124/0x1b0
> [22038.710844]  new_sync_write+0x16a/0x200
> [22038.710844]  vfs_write+0x21c/0x2e0
> [22038.710844]  __x64_sys_write+0x6d/0xf0
> [22038.710845]  do_syscall_64+0x33/0x40
> [22038.710845]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [22038.710845] RIP: 0033:0x7f3adece10f7
> [22038.710846] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 
> f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 
> 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> [22038.710847] RSP: 002b:7ffc51c555b8 EFLAGS: 0246 ORIG_RAX: 
> 0001
> [22038.710847] RAX: ffda RBX: 0004 RCX: 
> 7f3adece10f7
> [22038.710848] RDX: 0004 RSI: 7ffc51c556a0 RDI: 
> 0004
> [22038.710848] RBP: 7ffc51c556a0 R08: 55ea374302a0 R09: 
> 7f3aded770c0
> [22038.710849] R10: 7f3aded76fc0 R11: 0246 R12: 
> 0004
> [22038.710849] R13: 55ea3742c430 R14: 0004 R15: 
> 7f3adedb3700
> [22038.710849] ---[ end trace 6eae54fbf23807f8 ]---
> ```
> 
> This one happened during suspend/resume cycle (on resume), followed by:
> 
> ```
> [22038.868669] igb :05:00.0 enp5s0: Reset adapter
> [22040.998673] igb :05:00.0 enp5s0: Reset adapter
> [22043.819198] igb :05:00.0 enp5s0: igb: enp5s0 NIC Link is Up 1000 Mbps 
> Full Duplex, Flow Control: RX
> ```
> 
> I've bumped into a similar issue in BZ 211911 [2] (see c#16),
> and in c#17 it was suggested it was a separate unrelated issue,
> hence I'm raising a new concern.
> 
> Please help in finding out why this woe happens and in fixing it.
> 
> Thanks.
> 
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=212573
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=211911

Looks like igb_clean_tx_irq() should not return true if budget is 0,
ever, otherwise we risk hitting the min(work, budget - 1) which may
go negative.

So something like this? 

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index c9e8c65a3cfe..7a237b5311ca 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -8028,7 +8028,7 @@ static bool igb_clean_tx_irq(struct igb_q_vector 
*q_vector, int napi_budget)
unsigned int i = tx_ring->next_to_clean;
 
if (test_bit(__IGB_DOWN, >state))
-   return true;
+   goto out;
 
tx_buffer = _ring->tx_buffer_info[i];
tx_desc = IGB_TX_DESC(tx_ring, i);
@@ -8157,7 +8157,7 @@ static bool igb_clean_tx_irq(struct igb_q_vector 
*q_vector, int napi_budget)
tx_ring->queue_index);
 
/* we are about to reset, no point in enabling stuff */
-   return true;
+   goto out;
}
}
 
@@

Re: [PATCH net-next] netdevsim: remove unneeded semicolon

2021-04-06 Thread Jakub Kicinski

On Tue, 6 Apr 2021 11:18:13 +0800 Qiheng Lin wrote:
> Eliminate the following coccicheck warning:
>  drivers/net/netdevsim/fib.c:569:2-3: Unneeded semicolon
> 
> Signed-off-by: Qiheng Lin 

Acked-by: Jakub Kicinski

Re: [PATCH net-next v3 5/6] net: stmmac: Add support for XDP_TX action

2021-03-31 Thread Jakub Kicinski

On Wed, 31 Mar 2021 23:41:34 +0800 Ong Boon Leong wrote:
> This patch adds support for XDP_TX action which enables XDP program to
> transmit back received frames.
> 
> This patch has been tested with the "xdp2" app located in samples/bpf
> dir. The DUT receives burst traffic packet generated using pktgen script
> 'pktgen_sample03_burst_single_flow.sh'.
> 
> v3: Added 'nq->trans_start = jiffies' to avoid TX time-out as we are
> sharing TX queue between slow path and XDP. Thanks to Jakub Kicinski
> for pointing out.
> 
> Signed-off-by: Ong Boon Leong 

> +static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
> + struct xdp_buff *xdp)
> +{
> + struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);
> + int cpu = smp_processor_id();
> + struct netdev_queue *nq;
> + int queue;
> + int res;
> +
> + if (unlikely(!xdpf))
> + return -EFAULT;

Can you return -EFAULT here? looks like the function is otherwise
returning positive STMMAC_XDP_* return codes/masks.

> + queue = stmmac_xdp_get_tx_queue(priv, cpu);
> + nq = netdev_get_tx_queue(priv->dev, queue);
> +
> + __netif_tx_lock(nq, cpu);
> + /* Avoids TX time-out as we are sharing with slow path */
> + nq->trans_start = jiffies;
> + res = stmmac_xdp_xmit_xdpf(priv, queue, xdpf);
> + if (res == STMMAC_XDP_TX) {
> + stmmac_flush_tx_descriptors(priv, queue);
> + stmmac_tx_timer_arm(priv, queue);

Would it make sense to arm the timer and flush descriptors at the end
of the NAPI poll cycle? Instead of after every TX frame?

> + }
> + __netif_tx_unlock(nq);
> +
> + return res;
> +}

> @@ -4365,16 +4538,26 @@ static int stmmac_rx(struct stmmac_priv *priv, int 
> limit, u32 queue)
>   xdp.data_hard_start = page_address(buf->page);
>   xdp_set_data_meta_invalid();
>   xdp.frame_sz = buf_sz;
> + xdp.rxq = _q->xdp_rxq;
>  
> + pre_len = xdp.data_end - xdp.data_hard_start -
> +   buf->page_offset;
>   skb = stmmac_xdp_run_prog(priv, );
> + /* Due xdp_adjust_tail: DMA sync for_device
> +  * cover max len CPU touch
> +  */
> + sync_len = xdp.data_end - xdp.data_hard_start -
> +buf->page_offset;
> + sync_len = max(sync_len, pre_len);
>  
>   /* For Not XDP_PASS verdict */
>   if (IS_ERR(skb)) {
>   unsigned int xdp_res = -PTR_ERR(skb);
>  
>   if (xdp_res & STMMAC_XDP_CONSUMED) {
> - 
> page_pool_recycle_direct(rx_q->page_pool,
> -  buf->page);
> + page_pool_put_page(rx_q->page_pool,
> +
> virt_to_head_page(xdp.data),
> +sync_len, true);

IMHO the dma_sync_size logic is a little question, but it's not really
related to your patch, others are already doing the same thing, so it's
fine, I guess.

>   buf->page = NULL;
>   priv->dev->stats.rx_dropped++;

Re: [PATCH net-next v2 5/6] net: stmmac: Add support for XDP_TX action

2021-03-30 Thread Jakub Kicinski

On Tue, 30 Mar 2021 10:49:48 +0800 Ong Boon Leong wrote:
> + __netif_tx_lock(nq, cpu);
> + res = stmmac_xdp_xmit_xdpf(priv, queue, xdpf);
> + if (res == STMMAC_XDP_TX) {
> + stmmac_flush_tx_descriptors(priv, queue);
> + stmmac_tx_timer_arm(priv, queue);
> + }
> + __netif_tx_unlock(nq);

You may want to update nq->trans_start, see commit ec107e775d843

Re: [PATCH net-next v2 4/6] net: stmmac: Add initial XDP support

2021-03-30 Thread Jakub Kicinski

On Tue, 30 Mar 2021 10:49:47 +0800 Ong Boon Leong wrote:
> + if (!skb) {
> + dma_sync_single_for_cpu(priv->device, buf->addr,
> + buf1_len, dma_dir);
> +
> + xdp.data = page_address(buf->page) + buf->page_offset;
> + xdp.data_end = xdp.data + len;
> + xdp.data_hard_start = page_address(buf->page);
> + xdp_set_data_meta_invalid();
> + xdp.frame_sz = buf_sz;
> +
> + skb = stmmac_xdp_run_prog(priv, );
> +
> + /* For Not XDP_PASS verdict */
> + if (IS_ERR(skb)) {
> + unsigned int xdp_res = -PTR_ERR(skb);
> +
> + if (xdp_res & STMMAC_XDP_CONSUMED) {
> + 
> page_pool_recycle_direct(rx_q->page_pool,
> +  buf->page);
> + buf->page = NULL;
> + priv->dev->stats.rx_dropped++;
> +
> + /* Clear skb as it was set as
> +  * status by XDP program.
> +  */
> + skb = NULL;
> +
> + if (unlikely((status & rx_not_ls)))
> + goto read_again;
> +
> + count++;
> + continue;
> + }
> + }
> + }
> +
>   if (!skb) {
>   skb = napi_alloc_skb(>rx_napi, buf1_len);
>   if (!skb) {
> @@ -4322,9 +4400,8 @@ static int stmmac_rx(struct stmmac_priv *priv, int 
> limit, u32 queue)
>   goto drain_data;
>   }
>  
> - dma_sync_single_for_cpu(priv->device, buf->addr,
> - buf1_len, DMA_FROM_DEVICE);
> - skb_copy_to_linear_data(skb, page_address(buf->page),
> + skb_copy_to_linear_data(skb, page_address(buf->page) +
> + buf->page_offset,
>   buf1_len);

XDP can prepend or remove headers (using the bpf_xdp_adjust_head()
helper), so the start of data may no longer be page + HEADROOM, 
and the length of the frame may have changed. 

Are you accounting for this?

Re: [PATCH] ethernet/netronome/nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx

2021-03-29 Thread Jakub Kicinski

On Mon, 29 Mar 2021 04:50:02 -0700 Lv Yunlong wrote:
> In nfp_bpf_ctrl_msg_rx, if
> nfp_ccm_get_type(skb) == NFP_CCM_TYPE_BPF_BPF_EVENT is true, the skb
> will be freed. But the skb is still used by nfp_ccm_rx(>ccm, skb).
> 
> My patch adds a return when the skb was freed.
> 
> Fixes: bcf0cafab44fd ("nfp: split out common control message handling code")
> Signed-off-by: Lv Yunlong 

Reviewed-by: Jakub Kicinski

Re: [PATCH 2/2] gpiolib: Allow drivers to return EOPNOTSUPP from config

2021-03-29 Thread Jakub Kicinski

On Mon, 29 Mar 2021 18:25:46 +0300 Andy Shevchenko wrote:
> On Mon, Mar 29, 2021 at 08:08:52AM -0700, Joe Perches wrote:
> > On Mon, 2021-03-29 at 14:59 +0300, Andy Shevchenko wrote:  
> > > On Mon, Mar 29, 2021 at 2:43 PM Matti Vaittinen
> > >  wrote:  
> > > > 
> > > > The checkpacth instructs to switch from ENOSUPP to EOPNOTSUPP.  
> > > > > WARNING: ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP  
> > > > 
> > > > Make the gpiolib allow drivers to return both so driver developers
> > > > can avoid one of the checkpatch complaints.  
> > > 
> > > Internally we are fine to use the ENOTSUPP.
> > > Checkpatch false positives there.
> > > 
> > > I doubt we need this change. Rather checkpatch should rephrase this to
> > > point out that this is only applicable to _user-visible_ error path.
> > > Cc'ed Joe.  
> > 
> > Adding CC for Jakub Kicinski who added that particular rule/test.
> > 
> > And the output message report of the rule is merely a suggestion indicating
> > a preference.  It's always up to an individual to accept/reject.
> > 
> > At best, perhaps wordsmithing the checkpatch message might be an OK option. 
> >  
> 
> Thanks, Joe!
> 
> Jakub, what do you think?

Agreed, weaving into the message that ENOTSUPP is okay internally
sounds good. Perhaps we should append "if error may be returned to 
user space"?

Re: [PATCH net-next v2 1/1] net: stmmac: add per-queue TX & RX coalesce ethtool support

2021-03-17 Thread Jakub Kicinski

On Wed, 17 Mar 2021 09:01:23 +0800 Ong Boon Leong wrote:
> Extending the driver to support per-queue RX and TX coalesce settings in
> order to support below commands:
> 
> To show per-queue coalesce setting:-
>  $ ethtool --per-queue  queue_mask  --show-coalesce
> 
> To set per-queue coalesce setting:-
>  $ ethtool --per-queue  queue_mask  --coalesce \
>  [rx-usecs N] [rx-frames M] [tx-usecs P] [tx-frames Q]
> 
> Signed-off-by: Ong Boon Leong 

Acked-by: Jakub Kicinski 

> + if (queue < tx_cnt) {
> + ec->tx_coalesce_usecs = priv->tx_coal_timer[queue];
> + ec->tx_max_coalesced_frames = priv->tx_coal_frames[queue];
> + } else {
> + ec->tx_coalesce_usecs = 0;
> + ec->tx_max_coalesced_frames = 0;

nit: I think the struct is initialized to 0 so there is no need to set
it explicitly.

Re: [net PATCH 4/9] octeontx2-af: Remove TOS field from MKEX TX

2021-03-17 Thread Jakub Kicinski

On Wed, 17 Mar 2021 12:07:12 +0530 sundeep subbaraya wrote:
> On Tue, Mar 16, 2021 at 10:53 PM Jakub Kicinski  wrote:
> >
> > On Tue, 16 Mar 2021 14:57:08 +0530 Hariprasad Kelam wrote:  
> > > From: Subbaraya Sundeep 
> > >
> > > TOS overlaps with DMAC field in mcam search key and hence installing
> > > rules for TX side are failing. Hence remove TOS field from TX profile.  
> >
> > Could you clarify what "installing rules is failing" means?
> > Return error or does not behave correctly?  
> 
> Returns error. The MKEX profile can be in a way where higher layer packet 
> fields
> can overwrite lower layer packet fields in output MCAM Key. The commit
> 42006910 ("octeontx2-af: cleanup KPU config data") introduced TX TOS field and
> it overwrites DMAC. AF driver return error when TX rule is installed
> with DMAC as
> match criteria since DMAC gets overwritten and cannot be supported. Layers 
> from
> lower to higher in our case:
> LA - Ethernet
> LB - VLAN
> LC - IP
> LD - TCP/UDP
> and so on.
> 
> We make sure there are no overlaps between layers but TOS got added by 
> mistake.
> We will elaborate the commit description and send the next version.

Thank you! The longer explanation makes the error clear.

Re: [PATCH] Add MHI bus support and driver for T99W175 5G modem

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 05:42:37 -0700 Jarvis Jiang wrote:
>  drivers/bus/mhi/devices/mhi_netdev.c  | 1830 +

There's already a drivers/net/mhi/

Please make sure the drivers live in their respective subsystem.

Virtio drivers don't sit under drivers/virtio, and most certainly not
under drivers/bus/virtio...

>  drivers/bus/mhi/devices/mhi_satellite.c   | 1155 +
>  drivers/bus/mhi/devices/mhi_uci.c |  802 ++

Ugh, can you clarify what's the source of this code dump and if you're
coordinating with others working on Qualcomm drivers?

Re: [RESEND v1 net-next 4/5] stmmac: intel: add support for multi-vector msi and msi-x

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 20:18:22 +0800 Voon Weifeng wrote:
> From: Ong Boon Leong 
> 
> Intel mgbe controller supports multi-vector interrupts:
> msi_rx_vec0,2,4,6,8,10,12,14
> msi_tx_vec1,3,5,7,9,11,13,15
> msi_sfty_ue_vec   26
> msi_sfty_ce_vec   27
> msi_lpi_vec   28
> msi_mac_vec   29
> 
> During probe(), the driver will starts with request allocation for
> multi-vector interrupts. If it fails, then it will automatically fallback
> to request allocation for single interrupts.
> 
> Signed-off-by: Ong Boon Leong 
> Co-developed-by: Voon Weifeng 
> Signed-off-by: Voon Weifeng 

> +
> +static int stmmac_config_multi_msi(struct pci_dev *pdev,
> +struct plat_stmmacenet_data *plat,
> +struct stmmac_resources *res)
> +{
> + int ret;
> + int i;
> +
> + ret = pci_alloc_irq_vectors(pdev, 2, STMMAC_MSI_VEC_MAX,
> + PCI_IRQ_MSI | PCI_IRQ_MSIX);
> + if (ret < 0) {
> + dev_info(>dev, "%s: multi MSI enablement failed\n",
> +  __func__);
> + return ret;
> + }
> +
> + if (plat->msi_rx_base_vec >= STMMAC_MSI_VEC_MAX ||
> + plat->msi_tx_base_vec >= STMMAC_MSI_VEC_MAX) {
> + dev_info(>dev, "%s: Invalid RX & TX vector defined\n",
> +  __func__);
> + return -1;

free_irq_vectors?  Or move the check before alloc if possible.

> + }


> @@ -699,6 +786,19 @@ static int intel_eth_pci_probe(struct pci_dev *pdev,
>   writel(tx_lpi_usec, res.addr + GMAC_1US_TIC_COUNTER);
>   }
>  
> + ret = stmmac_config_multi_msi(pdev, plat, );
> + if (!ret)
> + goto msi_done;

Please don't use gotos where an if statement would do perfectly well.

> + ret = stmmac_config_single_msi(pdev, plat, );
> + if (ret) {
> + dev_err(>dev, "%s: ERROR: failed to enable IRQ\n",
> + __func__);
> + return ret;

return? isn't there some cleanup that needs to happen before exiting?

> + }
> +
> +msi_done:
> +
>   ret = stmmac_dvr_probe(>dev, plat, );
>   if (ret) {
>   pci_free_irq_vectors(pdev);

Re: [RESEND v1 net-next 3/5] net: stmmac: introduce MSI Interrupt routines for mac, safety, RX & TX

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 20:18:21 +0800 Voon Weifeng wrote:
> From: Ong Boon Leong 
> 
> Now we introduce MSI interrupt service routines and hook these routines
> up if stmmac_open() sees valid irq line being requested:-
> 
> stmmac_mac_interrupt():- MAC (dev->irq), WOL (wol_irq), LPI (lpi_irq)
> stmmac_safety_interrupt() :- Safety Feat Correctible Error (sfty_ce_irq)
>  & Uncorrectible Error (sfty_ue_irq)
> stmmac_msi_intr_rx()  :- For all RX MSI irq (rx_irq)
> stmmac_msi_intr_tx()  :- For all TX MSI irq (tx_irq)

Do you split RX and TX irqs out on purpose? Most commonly one queue
pair maps to one CPU, so using single IRQ for Rx and Tx results in
fewer IRQs being triggered and better system performance.

> Each of IRQs will have its unique name so that we can differentiate
> them easily under /proc/interrupts.
> 
> Signed-off-by: Ong Boon Leong 
> Signed-off-by: Voon Weifeng 

> +static int stmmac_request_irq(struct net_device *dev)

This function is a one huge if statement, please factor out both sides
into separate subfunctions.

> + netdev_info(priv->dev, "PASS: requesting IRQs\n");

Does the user really need to know interrupts were requested on every
probe?

> + return ret;

return 0; ?

> +irq_error:
> + stmmac_free_irq(dev, irq_err, irq_idx);
> + return ret;
> +}

Re: [RESEND v1 net-next 2/5] net: stmmac: make stmmac_interrupt() function more friendly to MSI

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 20:18:20 +0800 Voon Weifeng wrote:
> + if (unlikely(!dev)) {
> + netdev_err(priv->dev, "%s: invalid dev pointer\n", __func__);
> + return IRQ_NONE;
> + }

Where did this check come from? Please avoid defensive programming 
in the kernel unless you can point out how the condition can arise
legitimately.

Re: [mm, net-next v2] mm: net: memcg accounting for TCP rx zerocopy

2021-03-16 Thread Jakub Kicinski

On Mon, 15 Mar 2021 23:28:08 -0700 Arjun Roy wrote:
> On Mon, Mar 15, 2021 at 11:22 PM Arjun Roy  wrote:
> >
> > On Mon, Mar 15, 2021 at 9:29 PM Shakeel Butt  wrote:  
> > >
> > > On Mon, Mar 15, 2021 at 9:20 PM Arjun Roy  wrote:  
>  [...]  
> > > [...]  
>  [...]  
>  [...]  
> > >
> > > It is due to "mm: page-writeback: simplify memcg handling in
> > > test_clear_page_writeback()" patch in the mm tree. You would need to
> > > reintroduce the lock_page_memcg() which returns the memcg and make
> > > __unlock_page_memcg() non-static.  
> >
> > To clarify, Shakeel - the tag "tag: v5.12-rc2-mmots-2021-03-11-21-49"
> > fails to build on a clean checkout, without this patch, due to a
> > compilation failure in mm/shmem.c, for reference:
> > https://pastebin.com/raw/12eSGdGD
> > (and that's why I'm basing this patch off of net-next in this email).
> >
> > -Arjun  
> 
> Another seeming anomaly - the patch sent out passes
> scripts/checkpatch.pl but netdev/checkpatch finds plenty of actionable
> fixes here: 
> https://patchwork.kernel.org/project/netdevbpf/patch/20210316041645.144249-1-arjunroy.k...@gmail.com/
> 
> Is netdev using some other automated checker instead of scripts/checkpatch.pl?

--strict --max-line-length=80

Re: [PATCH] net: ipv4: route.c: simplify procfs code

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 10:57:36 +0800 Yejune Deng wrote:
> proc_creat_seq() that directly take a struct seq_operations,
> and deal with network namespaces in ->open.
> 
> Signed-off-by: Yejune Deng 

Looks equivalent to me:

Reviewed-by: Jakub Kicinski

Re: [net PATCH 3/9] octeontx2-af: Do not allocate memory for devlink private

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 23:33:40 +0530 sundeep subbaraya wrote:
> On Tue, Mar 16, 2021 at 10:53 PM Jakub Kicinski  wrote:
> >
> > On Tue, 16 Mar 2021 14:57:07 +0530 Hariprasad Kelam wrote:  
> > > From: Subbaraya Sundeep 
> > >
> > > Memory for driver private structure rvu_devlink is
> > > also allocated during devlink_alloc. Hence use
> > > the allocated memory by devlink_alloc and access it
> > > by devlink_priv call.
> > >
> > > Fixes: fae06da4("octeontx2-af: Add devlink suppoort to af driver")
> > > Signed-off-by: Subbaraya Sundeep 
> > > Signed-off-by: Hariprasad Kelam 
> > > Signed-off-by: Sunil Kovvuri Goutham   
> >
> > Does it fix any bug? Looks like a coding improvement.  
> 
> Without this we cannot fetch our private struct 'rvu_devlink'  from any
> of the functions in devlink_ops which may get added in future.

"which may get added in future" does not sound like it's fixing 
an existing problem to me :(

If you have particular case where the existing setup is problematic
please describe it in more detail, or mention what other fix depends 
on this patch. Otherwise sending this one patch for net-next would 
be better IMHO.

Re: [net PATCH 4/9] octeontx2-af: Remove TOS field from MKEX TX

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 14:57:08 +0530 Hariprasad Kelam wrote:
> From: Subbaraya Sundeep 
> 
> TOS overlaps with DMAC field in mcam search key and hence installing
> rules for TX side are failing. Hence remove TOS field from TX profile.

Could you clarify what "installing rules is failing" means?
Return error or does not behave correctly?

Re: [net PATCH 3/9] octeontx2-af: Do not allocate memory for devlink private

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 14:57:07 +0530 Hariprasad Kelam wrote:
> From: Subbaraya Sundeep 
> 
> Memory for driver private structure rvu_devlink is
> also allocated during devlink_alloc. Hence use
> the allocated memory by devlink_alloc and access it
> by devlink_priv call.
> 
> Fixes: fae06da4("octeontx2-af: Add devlink suppoort to af driver")
> Signed-off-by: Subbaraya Sundeep 
> Signed-off-by: Hariprasad Kelam 
> Signed-off-by: Sunil Kovvuri Goutham 

Does it fix any bug? Looks like a coding improvement.

Re: [net PATCH 1/9] octeontx2-pf: Do not modify number of rules

2021-03-16 Thread Jakub Kicinski

On Tue, 16 Mar 2021 14:57:05 +0530 Hariprasad Kelam wrote:
> From: Subbaraya Sundeep 
> 
> In the ETHTOOL_GRXCLSRLALL ioctl ethtool uses
> below structure to read number of rules from the driver.
> 
> struct ethtool_rxnfc {
> __u32   cmd;
> __u32   flow_type;
> __u64   data;
> struct ethtool_rx_flow_spec fs;
> union {
> __u32   rule_cnt;
> __u32   rss_context;
> };
> __u32   rule_locs[0];
> };
> 
> Driver must not modify rule_cnt member. But currently driver
> modifies it by modifying rss_context. Hence fix it by using a
> local variable.
> 
> Fixes: 81a43620("octeontx2-pf: Add RSS multi group support")

Fixes tag: Fixes: 81a43620("octeontx2-pf: Add RSS multi group support")
Has these problem(s):
- missing space between the SHA1 and the subject
- SHA1 should be at least 12 digits long
  Can be fixed by setting core.abbrev to 12 (or more) or (for git v2.11
  or later) just making sure it is not set (or set to "auto").

Please fix the entire submission.

Re: [PATCH net-next 1/2] net: stmmac: EST interrupts handling and error reporting

2021-03-15 Thread Jakub Kicinski

On Tue, 16 Mar 2021 06:14:08 +0800 mohammad.athari.ism...@intel.com
wrote:
> From: Voon Weifeng 
> 
> Enabled EST related interrupts as below:
> 1) Constant Gate Control Error (CGCE)
> 2) Head-of-Line Blocking due to Scheduling (HLBS)
> 3) Head-of-Line Blocking due to Frame Size (HLBF).
> 4) Base Time Register error (BTRE)
> 5) Switch to S/W owned list Complete (SWLC)
> 
> For HLBS, the user will get the info of all the queues that shows this
> error. For HLBF, the user will get the info of all the queue with the
> latest frame size which causes the error. Frame size 0 indicates no
> error.
> 
> The ISR handling takes place when EST feature is enabled by user.
> 
> Signed-off-by: Voon Weifeng 
> Signed-off-by: Ong Boon Leong 
> Co-developed-by: Mohammad Athari Bin Ismail 
> Signed-off-by: Mohammad Athari Bin Ismail 

> + if (status & HLBS) {
> + value = readl(ioaddr + MTL_EST_SCH_ERR);
> + value &= txqcnt_mask;
> +
> + /* Clear Interrupt */
> + writel(value, ioaddr + MTL_EST_SCH_ERR);
> +
> + /* Collecting info to shows all the queues that has HLBS
> +  * issue. The only way to clear this is to clear the
> +  * statistic
> +  */
> + if (net_ratelimit())
> + netdev_err(dev, "EST: HLB(sched) Queue %u\n", value);

This is a mask so probably better display it as hex?

> + }
> +
> + if (status & HLBF) {
> + value = readl(ioaddr + MTL_EST_FRM_SZ_ERR);
> + feqn = value & txqcnt_mask;
> +
> + value = readl(ioaddr + MTL_EST_FRM_SZ_CAP);
> + hbfq = (value & SZ_CAP_HBFQ_MASK(txqcnt)) >> SZ_CAP_HBFQ_SHIFT;
> + hbfs = value & SZ_CAP_HBFS_MASK;
> +
> + /* Clear Interrupt */
> + writel(feqn, ioaddr + MTL_EST_FRM_SZ_ERR);
> +
> + if (net_ratelimit())
> + netdev_err(dev, "EST: HLB(size) Queue %u Size %u\n",
> +hbfq, hbfs);
> + }
> +
> + if (status & BTRE) {
> + btrl = (status & BTRL) >> BTRL_SHIFT;
> +
> + if (net_ratelimit())
> + netdev_info(dev, "EST: BTR Error Loop Count %u\n",
> + btrl);
> +
> + writel(BTRE, ioaddr + MTL_EST_STATUS);
> + }
> +
> + if (status & SWLC) {
> + writel(SWLC, ioaddr + MTL_EST_STATUS);
> + netdev_info(dev, "EST: SWOL has been switched\n");
> + }
> +
> + return status;

Caller never checks the return value, it probably should if this driver
supports shared irqs? Otherwise you can make this function void.

> +}

Re: [PATCH net-next 1/1] net: stmmac: add per-queue TX & RX coalesce ethtool support

2021-03-15 Thread Jakub Kicinski

On Mon, 15 Mar 2021 14:44:48 +0800 Ong Boon Leong wrote:
> Extending the driver to support per-queue RX and TX coalesce settings in
> order to support below commands:
> 
> To show per-queue coalesce setting:-
>  $ ethtool --per-queue  queue_mask  --show-coalesce
> 
> To set per-queue coalesce setting:-
>  $ ethtool --per-queue  queue_mask  --coalesce \
>  [rx-usecs N] [rx-frames M] [tx-usecs P] [tx-frames Q]
> 
> Signed-off-by: Ong Boon Leong 

> -static int stmmac_get_coalesce(struct net_device *dev,
> -struct ethtool_coalesce *ec)
> +static int __stmmac_get_coalesce(struct net_device *dev,
> +  struct ethtool_coalesce *ec,
> +  int queue)
>  {
>   struct stmmac_priv *priv = netdev_priv(dev);
> + u32 max_cnt;
> + u32 rx_cnt;
> + u32 tx_cnt;
>  
> - ec->tx_coalesce_usecs = priv->tx_coal_timer;
> - ec->tx_max_coalesced_frames = priv->tx_coal_frames;
> + rx_cnt = priv->plat->rx_queues_to_use;
> + tx_cnt = priv->plat->tx_queues_to_use;
> + max_cnt = max(rx_cnt, tx_cnt);
>  
> - if (priv->use_riwt) {
> - ec->rx_max_coalesced_frames = priv->rx_coal_frames;
> - ec->rx_coalesce_usecs = stmmac_riwt2usec(priv->rx_riwt, priv);
> + if (queue < 0)
> + queue = 0;
> + else if (queue >= max_cnt)
> + return -EINVAL;
> +
> + if (queue < tx_cnt) {
> + ec->tx_coalesce_usecs = priv->tx_coal_timer[queue];
> + ec->tx_max_coalesced_frames = priv->tx_coal_frames[queue];
> + } else {
> + ec->tx_coalesce_usecs = -1;
> + ec->tx_max_coalesced_frames = -1;
> + }
> +
> + if (priv->use_riwt && queue < rx_cnt) {
> + ec->rx_max_coalesced_frames = priv->rx_coal_frames[queue];
> + ec->rx_coalesce_usecs = stmmac_riwt2usec(priv->rx_riwt[queue],
> +  priv);
> + } else {
> + ec->rx_max_coalesced_frames = -1;
> + ec->rx_coalesce_usecs = -1;

Why the use of negative values? why not leave them as 0?

>   }
>  
>   return 0;
>  }
>  
> -static int stmmac_set_coalesce(struct net_device *dev,
> +static int stmmac_get_coalesce(struct net_device *dev,
>  struct ethtool_coalesce *ec)
> +{
> + return __stmmac_get_coalesce(dev, ec, -1);
> +}
> +
> +static int stmmac_get_per_queue_coalesce(struct net_device *dev, u32 queue,
> +  struct ethtool_coalesce *ec)
> +{
> + return __stmmac_get_coalesce(dev, ec, queue);
> +}
> +
> +static int __stmmac_set_coalesce(struct net_device *dev,
> +  struct ethtool_coalesce *ec,
> +  int queue)
>  {
>   struct stmmac_priv *priv = netdev_priv(dev);
> - u32 rx_cnt = priv->plat->rx_queues_to_use;
> + bool all_queues = false;
>   unsigned int rx_riwt;
> + u32 max_cnt;
> + u32 rx_cnt;
> + u32 tx_cnt;
> +
> + rx_cnt = priv->plat->rx_queues_to_use;
> + tx_cnt = priv->plat->tx_queues_to_use;
> + max_cnt = max(rx_cnt, tx_cnt);
> +
> + if (queue < 0)
> + all_queues = true;
> + else if (queue >= max_cnt)
> + return -EINVAL;
> +
> + /* Check not supported parameters  */
> + if (ec->rx_coalesce_usecs_irq ||
> + ec->rx_max_coalesced_frames_irq || ec->tx_coalesce_usecs_irq ||
> + ec->use_adaptive_rx_coalesce || ec->use_adaptive_tx_coalesce ||
> + ec->pkt_rate_low || ec->rx_coalesce_usecs_low ||
> + ec->rx_max_coalesced_frames_low || ec->tx_coalesce_usecs_high ||
> + ec->tx_max_coalesced_frames_low || ec->pkt_rate_high ||
> + ec->tx_coalesce_usecs_low || ec->rx_coalesce_usecs_high ||
> + ec->rx_max_coalesced_frames_high ||
> + ec->tx_max_coalesced_frames_irq ||
> + ec->stats_block_coalesce_usecs ||
> + ec->tx_max_coalesced_frames_high || ec->rate_sample_interval)
> + return -EOPNOTSUPP;

This shouldn't be needed now that supporter types are expressed in 
dev->ethtool_ops->supported_coalesce_params, no?

>   if (priv->use_riwt && (ec->rx_coalesce_usecs > 0)) {
>   rx_riwt = stmmac_usec2riwt(ec->rx_coalesce_usecs, priv);
> @@ -785,8 +846,23 @@ static int stmmac_set_coalesce(struct net_device *dev,
>   if ((rx_riwt > MAX_DMA_RIWT) || (rx_riwt < MIN_DMA_RIWT))
>   return -EINVAL;
>  
> - priv->rx_riwt = rx_riwt;
> - stmmac_rx_watchdog(priv, priv->ioaddr, priv->rx_riwt, rx_cnt);
> + if (all_queues) {
> + int i;
> +
> + for (i = 0; i < rx_cnt; i++) {
> + priv->rx_riwt[i] = rx_riwt;
> + stmmac_rx_watchdog(priv, priv->ioaddr,
> +rx_riwt, i);
> +

Re: [PATCH v2 net-next] net: dsa: b53: mmap: Add device tree support

2021-03-15 Thread Jakub Kicinski

On Mon, 15 Mar 2021 16:11:40 +0100 Álvaro Fernández Rojas wrote:
> Add device tree support to b53_mmap.c while keeping platform devices support.
> 
> Signed-off-by: Álvaro Fernández Rojas 
> ---
>  v2: add change suggested by Florian Fainelli (less "OF-centric") and replace
>   brcm,ports property with a ports child scan.
> 
>  drivers/net/dsa/b53/b53_mmap.c | 54 ++
>  1 file changed, 54 insertions(+)
> 
> diff --git a/drivers/net/dsa/b53/b53_mmap.c b/drivers/net/dsa/b53/b53_mmap.c
> index c628d0980c0b..94a4e3929ebf 100644
> --- a/drivers/net/dsa/b53/b53_mmap.c
> +++ b/drivers/net/dsa/b53/b53_mmap.c
> @@ -16,6 +16,7 @@
>   * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
>   */
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -228,11 +229,64 @@ static const struct b53_io_ops b53_mmap_ops = {
>   .write64 = b53_mmap_write64,
>  };
>  
> +static int b53_mmap_probe_of(struct platform_device *pdev,
> +  struct b53_platform_data **ppdata)
> +{
> + struct device *dev = >dev;
> + struct device_node *np = dev->of_node;
> + struct device_node *of_ports, *of_port;
> + struct b53_platform_data *pdata;
> + void __iomem *mem;

reverse xmas tree? Initialize out of line if needed.

> + mem = devm_platform_ioremap_resource(pdev, 0);
> + if (IS_ERR(mem))
> + return PTR_ERR(mem);
> +
> + pdata = devm_kzalloc(dev, sizeof(struct b53_platform_data),
> +  GFP_KERNEL);
> + if (!pdata)
> + return -ENOMEM;
> +
> + pdata->regs = mem;
> + pdata->chip_id = BCM63XX_DEVICE_ID;
> + pdata->big_endian = of_property_read_bool(np, "big-endian");
> +
> + of_ports = of_get_child_by_name(np, "ports");

Isn't this leaking a ref on of_ports? I don't see any put.

> + if (!of_ports) {
> + dev_err(dev, "no ports child node found\n");
> + return -EINVAL;
> + }
> +
> + for_each_available_child_of_node(of_ports, of_port) {
> + u32 reg;
> +
> + if (of_property_read_u32(of_port, "reg", ))
> + continue;
> +
> + if (reg < B53_CPU_PORT)
> + pdata->enabled_ports |= BIT(reg);
> + }
> +
> + *ppdata = pdata;
> +
> + return 0;
> +}
> +
>  static int b53_mmap_probe(struct platform_device *pdev)
>  {
> + struct device_node *np = pdev->dev.of_node;
>   struct b53_platform_data *pdata = pdev->dev.platform_data;
>   struct b53_mmap_priv *priv;
>   struct b53_device *dev;
> + int ret;
> +
> + if (!pdata && np) {
> + ret = b53_mmap_probe_of(pdev, );
> + if (ret) {
> + dev_err(>dev, "OF probe error\n");
> + return ret;
> + }
> + }
>  
>   if (!pdata)
>   return -EINVAL;

Re: [PATCH v3 net-next 4/6] linux/etherdevice.h: misc trailing whitespace cleanup

2021-03-15 Thread Jakub Kicinski

On Mon, 15 Mar 2021 09:38:57 + Alexander Lobakin wrote:
> From: Vladimir Oltean 
> Date: Sun, 14 Mar 2021 23:04:53 +0200
> 
> > On Sun, Mar 14, 2021 at 11:11:32AM +, Alexander Lobakin wrote:  
> > > Caught by the text editor. Fix it separately from the actual changes.
> > >
> > > Signed-off-by: Alexander Lobakin 
> > > ---
> > >  include/linux/etherdevice.h | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> > > index 2e5debc0373c..bcb2f81baafb 100644
> > > --- a/include/linux/etherdevice.h
> > > +++ b/include/linux/etherdevice.h
> > > @@ -11,7 +11,7 @@
> > >   * Authors:  Ross Biro
> > >   *   Fred N. van Kempen, 
> > >   *
> > > - *   Relocated to include/linux where it belongs by Alan Cox
> > > + *   Relocated to include/linux where it belongs by Alan Cox
> > >   *   
> > > 
> > >   */
> > >  #ifndef _LINUX_ETHERDEVICE_H
> > > --
> > > 2.30.2
> > >
> > >  
> >
> > Your mailer did something weird here, it trimmed the trailing whitespace
> > from the "-" line. The patch doesn't apply.  
> 
> It's git-send-email + ProtonMail Bridge... Seems like that's the
> reason why only this series of mine was failing Patchwork
> everytime.

Yup. Sorry for the lack of logs in NIPA, you can run

  git pw series apply $id

and look at the output there. That's what I do, anyway, to double check
that the patch doesn't apply so the logs are of limited use.

Re: [RFC v2] net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc

2021-03-15 Thread Jakub Kicinski

On Mon, 15 Mar 2021 11:10:18 +0800 Yunsheng Lin wrote:
> @@ -606,6 +623,11 @@ static const u8 prio2band[TC_PRIO_MAX + 1] = {
>   */
>  struct pfifo_fast_priv {
>   struct skb_array q[PFIFO_FAST_BANDS];
> +
> + /* protect against data race between enqueue/dequeue and
> +  * qdisc->empty setting
> +  */
> + spinlock_t lock;
>  };
>  
>  static inline struct skb_array *band2list(struct pfifo_fast_priv *priv,
> @@ -623,7 +645,10 @@ static int pfifo_fast_enqueue(struct sk_buff *skb, 
> struct Qdisc *qdisc,
>   unsigned int pkt_len = qdisc_pkt_len(skb);
>   int err;
>  
> - err = skb_array_produce(q, skb);
> + spin_lock(>lock);
> + err = __ptr_ring_produce(>ring, skb);
> + WRITE_ONCE(qdisc->empty, false);
> + spin_unlock(>lock);
>  
>   if (unlikely(err)) {
>   if (qdisc_is_percpu_stats(qdisc))
> @@ -642,6 +667,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc 
> *qdisc)
>   struct sk_buff *skb = NULL;
>   int band;
>  
> + spin_lock(>lock);
>   for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
>   struct skb_array *q = band2list(priv, band);
>  
> @@ -655,6 +681,7 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc 
> *qdisc)
>   } else {
>   WRITE_ONCE(qdisc->empty, true);
>   }
> + spin_unlock(>lock);
>  
>   return skb;
>  }

I thought pfifo was supposed to be "lockless" and this change
re-introduces a lock between producer and consumer, no?

Re: [PATCH] Net: gtp: Fixed missing blank line after declaration

2021-03-15 Thread Jakub Kicinski

On Sat, 13 Mar 2021 22:21:28 +0530 Chinmayi Shetty wrote:
> Signed-off-by: Chinmayi Shetty 
> ---
>  drivers/net/gtp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
> index efe5247d8c42..79998f4394e5 100644
> --- a/drivers/net/gtp.c
> +++ b/drivers/net/gtp.c
> @@ -437,7 +437,7 @@ static inline void gtp1_push_header(struct sk_buff *skb, 
> struct pdp_ctx *pctx)
>   gtp1->length= htons(payload_len);
>   gtp1->tid   = htonl(pctx->u.v1.o_tei);
>  
> - /* TODO: Suppport for extension header, sequence number and N-PDU.
> + /* TODO: Support for extension header, sequence number and N-PDU.
>*   Update the length field if any of them is available.
>*/
>  }

Subject does not match the patch.

Re: [PATCH v2] eeprom/optoe: driver to read/write SFP/QSFP/CMIS EEPROMS

2021-03-15 Thread Jakub Kicinski

On Sat, 13 Mar 2021 13:35:56 -0800 Don Bollinger wrote:
> > away parts of the bottom end and replace it with a different KAPI, and
> > nobody will notice? In fact, this is probably how it was designed. Anybody  
> 
> Actually everyone who touches this code would notice, each implementation
> would have to be modified, with literally no benefit to this community.

You keep saying that kernel API is "of no benefit to this community"
yet you don't want to accept the argument that your code is of no
benefit to the upstream community.

> optoe does not undermine the netlink KAPI that Moshe is working on.  

It does, although it may be hard to grasp for a vendor who can just EoL
a product at will once nobody is paying for it. We aim to provide
uniform support for all networking devices and an infinite backward
compatibility guarantee.

People will try to use optoe-based tools on the upstream drivers and
they won't work. Realistically we will need to support both APIs.

> If your community is interested, it could adopt optoe, WITH your
> KAPI, to consolidate and improve module EEPROM access for mainstream
> netdev consumers.  I am eager to collaborate on the fairly simple
> integration.

Nacked-by: Jakub Kicinski 

Please move on.

Re: [PATCH v15 0/4] Adding the Sparx5 Serdes driver

2021-03-15 Thread Jakub Kicinski

On Mon, 15 Mar 2021 16:04:24 +0100 Steen Hegelund wrote:
> Hi Kishon, Vinod, Andrew, Jacub, and David, 
> 
> I just wanted to know if you think that the Generic PHY subsystem might
> not be the right place for this Ethernet SerDes PHY driver after all.
> 
> Originally I chose this subsystem for historic reasons: The
> Microchip/Microsemi Ocelot SerDes driver was added here when it was
> upstreamed.
> On the other hand the Ocelot Serdes can do both PCIe and Ethernet, so
> it might fit the signature of a generic PHY better.

Are you saying this PHY is Ethernet only?

> At the moment the acceptance of the Sparx5 Serdes driver is blocking us
> from adding the Sparx5 SwitchDev driver (to net), so it is really
> important for us to resolve which subsystem the Serdes driver belongs
> to.
> 
> I am very much looking forward to your response.

FWIW even if this is merged via gen phy subsystem we can pull it into
net-next as well to unblock your other work in this dev cycle. You just
need to send the patches as a pull request, based on merge-base between
the gen phy tree and net-next.

Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Jakub Kicinski

On Thu, 11 Mar 2021 16:28:47 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 4:10 PM Jakub Kicinski  wrote:
> >
> > And the "noqueue" queue is there because it's on top of hdlc_fr.c
> > somehow or some out of tree driver? Or do you install it manually?  
> 
> No, this driver is not related to "hdlc_fr.c" or any out-of-tree
> driver. The default qdisc is "noqueue" for this driver because this
> driver doesn't set "tx_queue_len". This means the value of
> "tx_queue_len" would be 0. In this case, "alloc_netdev_mqs" will
> automatically add the "IFF_NO_QUEUE" flag to the device, then
> "attach_one_default_qdisc" in "net/sched/sch_generic.c" will attach
> the "noqueue" qdisc for devices with the "IFF_NO_QUEUE" flag.

I see.

Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Jakub Kicinski

On Thu, 11 Mar 2021 15:13:01 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 2:52 PM Jakub Kicinski  wrote:
> >
> > Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
> > locks, so unless your driver is lockless (LLTX) there should be no xmit
> > calls after that point.  
> 
> Do you mean I should call "netif_tx_disable" inside my "ndo_stop"
> function as a fix for the racing between "ndo_stop" and
> "ndo_start_xmit"?
> 
> I can't call "netif_tx_disable" inside my "ndo_stop" function because
> "netif_tx_disable" will call "netif_tx_stop_queue", which causes
> another racing problem. Please see my recent commit f7d9d4854519
> ("net: lapbether: Remove netif_start_queue / netif_stop_queue")

And the "noqueue" queue is there because it's on top of hdlc_fr.c
somehow or some out of tree driver? Or do you install it manually?

Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Jakub Kicinski

On Thu, 11 Mar 2021 13:12:25 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 12:43 PM Jakub Kicinski  wrote:
> >
> > Is this a theoretical issues or do you see a path where it triggers?
> >
> > Who are the callers sending frames to a device which went down?  
> 
> This is a theoretical issue. I didn't see this issue in practice.
> 
> When "__dev_queue_xmit" and "sch_direct_xmit" call
> "dev_hard_start_xmit", there appears to be no locking mechanism
> preventing the netif from going down while "dev_hard_start_xmit" is
> doing its work.

Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
locks, so unless your driver is lockless (LLTX) there should be no xmit
calls after that point.

Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

2021-03-11 Thread Jakub Kicinski

On Wed, 10 Mar 2021 23:23:09 -0800 Xie He wrote:
> There are two "netif_running" checks in this driver. One is in
> "lapbeth_xmit" and the other is in "lapbeth_rcv". They serve to make
> sure that the LAPB APIs called in these functions are called before
> "lapb_unregister" is called by the "ndo_stop" function.
> 
> However, these "netif_running" checks are unreliable, because it's
> possible that immediately after "netif_running" returns true, "ndo_stop"
> is called (which causes "lapb_unregister" to be called).
> 
> This patch adds locking to make sure "lapbeth_xmit" and "lapbeth_rcv" can
> reliably check and ensure the netif is running while doing their work.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Xie He 

Is this a theoretical issues or do you see a path where it triggers?

Who are the callers sending frames to a device which went down?

Re: [PATCH v2] netdevsim: fib: Remove redundant code

2021-03-11 Thread Jakub Kicinski

On Thu, 11 Mar 2021 15:11:01 +0800 Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./drivers/net/netdevsim/fib.c:874:5-8: Unneeded variable: "err". Return
> "0" on line 889.
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 

Reviewed-by: Jakub Kicinski

Re: [V2 net-next] net: mvpp2: Add reserved port private flag configuration

2021-03-11 Thread Jakub Kicinski

On Thu, 11 Mar 2021 18:43:27 +0200 stef...@marvell.com wrote:
> According to Armada SoC architecture and design, all the PPv2 ports
> which are populated on the same communication processor silicon die
> (CP11x) share the same Classifier and Parser engines.
> 
> Armada is an embedded platform and therefore there is a need to reserve
> some of the PPv2 ports for different use cases.
> 
> For example, a port can be reserved for a CM3 CPU running FreeRTOS for
> management purposes or by user-space data plane application.
> 
> During port reservation all common configurations are preserved and
> only RXQ, TXQ, and interrupt vectors are disabled.
> Since TXQ's are disabled, the Kernel won't transmit any packet
> from this port, and to due the closed RXQ interrupts, the Kernel won't
> receive any packet.
> The port MAC address and administrative UP/DOWN state can still
> be changed.
> The only permitted configuration in this mode is MTU change.
> The driver's .ndo_change_mtu callback has logic that switches between
> percpu_pools and shared pools buffer mode, since the buffer management
> not done by Kernel this should be permitted.

Andrew asks good questions. This looks like a strange construct.

IMO Linux should either not see the port (like it doesn't see NC-SI),
or we need representors for physical and logical ports and explicit
forwarding rules.

Re: [PATCH net-next] net: phy: Expose phydev::dev_flags through sysfs

2021-03-10 Thread Jakub Kicinski

On Wed, 10 Mar 2021 14:12:43 -0800 Florian Fainelli wrote:
> phydev::dev_flags contains a bitmask of configuration bits requested by
> the consumer of a PHY device (Ethernet MAC or switch) towards the PHY
> driver. Since these flags are often used for requesting LED or other
> type of configuration being able to quickly audit them without
> instrumenting the kernel is useful.
> 
> Signed-off-by: Florian Fainelli 
> ---
>  Documentation/ABI/testing/sysfs-class-net-phydev | 12 
>  drivers/net/phy/phy_device.c | 11 +++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-class-net-phydev 
> b/Documentation/ABI/testing/sysfs-class-net-phydev
> index 40ced0ea4316..ac722dd5e694 100644
> --- a/Documentation/ABI/testing/sysfs-class-net-phydev
> +++ b/Documentation/ABI/testing/sysfs-class-net-phydev
> @@ -51,3 +51,15 @@ Description:
>   Boolean value indicating whether the PHY device is used in
>   standalone mode, without a net_device associated, by PHYLINK.
>   Attribute created only when this is the case.
> +
> +What:/sys/class/mdio_bus///phy_dev_flags
> +Date:March 2021
> +KernelVersion:   5.13
> +Contact: net...@vger.kernel.org
> +Description:
> + 32-bit hexadecimal number representing a bit mask of the
> + configuration bits passed from the consumer of the PHY
> + (Ethernet MAC, switch, etc.) to the PHY driver. The flags are
> + only used internally by the kernel and their placement are
> + not meant to be stable across kernel versions. This is intended
> + for facilitating the debugging of PHY drivers.

Why not debugfs, then?

Re: [PATCH v2] bus: mhi: Add Qcom WWAN control driver

2021-03-08 Thread Jakub Kicinski

On Mon,  8 Mar 2021 21:59:27 +0100 Loic Poulain wrote:
> ---
>  v2: update copyright (2021)

Please look again at my reply to your v1.

Re: [PATCH 2/2] net: mdio: Add BCM6368 MDIO mux bus controller

2021-03-08 Thread Jakub Kicinski

On Mon,  8 Mar 2021 19:41:02 +0100 Álvaro Fernández Rojas wrote:
> This controller is present on BCM6318, BCM6328, BCM6362, BCM6368 and BCM63268
> SoCs.
> 
> Signed-off-by: Álvaro Fernández Rojas 

make[2]: *** Deleting file 'Module.symvers'
ERROR: modpost: missing MODULE_LICENSE() in drivers/net/mdio/mdio-mux-bcm6368.o
make[2]: *** [Module.symvers] Error 1
make[1]: *** [modules] Error 2
make: *** [__sub-make] Error 2
make[2]: *** Deleting file 'Module.symvers'

Re: [PATCH] bus: mhi: Add Qcom WWAN control driver

2021-03-08 Thread Jakub Kicinski

On Mon,  8 Mar 2021 19:40:51 +0100 Loic Poulain wrote:
> The MHI WWWAN control driver allows MHI Qcom based modems to expose
> different modem control protocols to userspace, so that userspace
> modem tools or daemon (e.g. ModemManager) can control WWAN config
> and state (APN config, SMS, provider selection...). A Qcom based
> modem can expose one or several of the following protocols:
> - AT: Well known AT commands interactive protocol (microcom, minicom...)
> - MBIM: Mobile Broadband Interface Model (libmbim, mbimcli)
> - QMI: Qcom MSM/Modem Interface (libqmi, qmicli)
> - QCDM: Qcom Modem diagnostic interface (libqcdm)
> - FIREHOSE: XML-based protocol for Modem firmware management
>   (qmi-firmware-update)
> 
> The different interfaces are exposed as character devices, in the same
> way as for USB modem variants.
> 
> Note that this patch is mostly a rework of the earlier MHI UCI
> tentative that was a generic interface for accessing MHI bus from
> userspace. As suggested, this new version is WWAN specific and is
> dedicated to only expose channels used for controlling a modem, and
> for which related opensource user support exist. Other MHI channels
> not fitting the requirements will request either to be plugged to
> the right Linux subsystem (when available) or to be discussed as a
> new MHI driver (e.g AI accelerator, WiFi debug channels, etc...).
> 
> Co-developed-by: Hemant Kumar 
> Signed-off-by: Hemant Kumar 
> Signed-off-by: Loic Poulain 

You need to CC netdev.

>  drivers/bus/mhi/Kconfig |  12 +
>  drivers/bus/mhi/Makefile|   3 +
>  drivers/bus/mhi/wwan_ctrl.c | 559 
> 

Linux kernel tree is not organized by bus. This belongs somewhere under
drivers/net.

Re: [PATCH v2] eeprom/optoe: driver to read/write SFP/QSFP/CMIS EEPROMS

2021-03-05 Thread Jakub Kicinski

On Fri, 5 Mar 2021 11:07:08 -0800 Don Bollinger wrote:
> Acknowledging your objections, I nonetheless request that optoe be accepted
> into upstream as an eeprom driver in drivers/misc/eeprom.  It is a
> legitimate driver, with a legitimate user community, which deserves the
> benefits of being managed as a legitimate part of the linux kernel.

It's in the best interest of the community to standardize on how 
we expect things to operate. You're free to do whatever you want
in your proprietary systems but please don't expect us to accept
a parallel, in now way superior method of accessing SFPs.

Re: [PATCH net-next RFC v4] net: hdlc_x25: Queue outgoing LAPB frames

2021-03-03 Thread Jakub Kicinski

On Tue, 02 Mar 2021 08:04:20 +0100 Martin Schiller wrote:
> On 2021-03-01 09:56, Xie He wrote:
> > On Sun, Feb 28, 2021 at 10:56 PM Martin Schiller  wrote:  
> >> I mean the change from only one hdlc interface to both hdlc and
> >> hdlc_x25.
> >> 
> >> I can't estimate how many users are out there and how their setup 
> >> looks
> >> like.  
> > 
> > I'm also thinking about solving this issue by adding new APIs to the
> > HDLC subsystem (hdlc_stop_queue / hdlc_wake_queue) for hardware
> > drivers to call instead of netif_stop_queue / netif_wake_queue. This
> > way we can preserve backward compatibility.
> > 
> > However I'm reluctant to change the code of all the hardware drivers
> > because I'm afraid of introducing bugs, etc. When I look at the code
> > of "wan/lmc/lmc_main.c", I feel I'm not able to make sure there are no
> > bugs (related to stop_queue / wake_queue) after my change (and even
> > before my change, actually). There are even serious style problems:
> > the majority of its lines are indented by spaces.
> > 
> > So I don't want to mess with all the hardware drivers. Hardware driver
> > developers (if they wish to properly support hdlc_x25) should do the
> > change themselves. This is not a problem for me, because I use my own
> > out-of-tree hardware driver. However if I add APIs with no user code
> > in the kernel, other developers may think these APIs are not
> > necessary.  
> 
> I don't think a change that affects the entire HDLC subsystem is
> justified, since the actual problem only affects the hdlc_x25 area.
> 
> The approach with the additional hdlc_x25 is clean and purposeful and
> I personally could live with it.
> 
> I just don't see myself in the position to decide such a change at the
> moment.
> 
> @Jakub: What is your opinion on this.

Hard question to answer, existing users seem happy and Xie's driver
isn't upstream, so the justification for potentially breaking backward
compatibility isn't exactly "strong".

Can we cop out and add a knob somewhere to control spawning the extra
netdev? Let people who just want a newer kernel carry on without
distractions and those who want the extra layer can flip the switch?

Re: [PATCH] netdevsim: init u64 stats for 32bit hardware

2021-03-03 Thread Jakub Kicinski

On Tue, 2 Mar 2021 12:55:47 +0100 Dmitry Vyukov wrote:
> On Tue, Mar 2, 2021 at 10:06 AM Hillf Danton  wrote:
> > On Mar 2, 2021 at 16:40 Dmitry Vyukov wrote:
> > >I hoped this would get at least into 5.12. syzbot can't start testing  
> > >arm32 because of this.  

FWIW the submission never got into patchwork or lore so we had 
no public source to apply from.

> > Or what is more feasible is you send a fix to Jakub today.  
> 
> So far I can't figure out how to make git work with my Gmail account
> with 1.5-factor auth enabled, neither password nor asp work...

LMK if you get overly frustrated, I can get the patch from my inbox and
resend it for you :)

Re: [PATCH 01/11] pragma once: delete include/linux/atm_suni.h

2021-02-28 Thread Jakub Kicinski

On Sun, 28 Feb 2021 19:58:17 +0300 Alexey Dobriyan wrote:
> From c17ac63e1334c742686cd411736699c1d34d45a7 Mon Sep 17 00:00:00 2001
> From: Alexey Dobriyan 
> Date: Wed, 10 Feb 2021 21:07:45 +0300
> Subject: [PATCH 01/11] pragma once: delete include/linux/atm_suni.h
> 
> This file has been empty since 2.3.99-pre3!
> Delete it instead of converting to #pragma once.
> 
> Signed-off-by: Alexey Dobriyan 

I'm guessing you want this to be merged via the networking tree?
(Guessing since you didn't CC us on the cover letter).

In that case please wait a couple of days and re-post it as a
standalone patch to netdev. Our build & validation bots can't deal 
with series where we only get patches 1 and 10 on the list.

If someone else is willing to merge the entire series - consider 
this patch acked.

Re: [PATCH v3] net/core/skbuff: fix passing wrong size to __alloc_skb

2021-02-28 Thread Jakub Kicinski

On Sun, 28 Feb 2021 18:14:46 + Alexander Lobakin wrote:
> > [1] WARNING in __alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:5014
> > Call Trace:
> >  __alloc_pages include/linux/gfp.h:511 [inline]
> >  __alloc_pages_node include/linux/gfp.h:524 [inline]
> >  alloc_pages_node include/linux/gfp.h:538 [inline]
> >  kmalloc_large_node+0x60/0x110 mm/slub.c:3999
> >  __kmalloc_node_track_caller+0x319/0x3f0 mm/slub.c:4496
> >  __kmalloc_reserve net/core/skbuff.c:150 [inline]
> >  __alloc_skb+0x4e4/0x5a0 net/core/skbuff.c:210
> >  __netdev_alloc_skb+0x70/0x400 net/core/skbuff.c:446
> >  netdev_alloc_skb include/linux/skbuff.h:2832 [inline]
> >  qrtr_endpoint_post+0x84/0x11b0 net/qrtr/qrtr.c:442
> >  qrtr_tun_write_iter+0x11f/0x1a0 net/qrtr/tun.c:98
> >  call_write_iter include/linux/fs.h:1901 [inline]
> >  new_sync_write+0x426/0x650 fs/read_write.c:518
> >  vfs_write+0x791/0xa30 fs/read_write.c:605
> >  ksys_write+0x12d/0x250 fs/read_write.c:658
> >  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
> >  entry_SYSCALL_64_after_hwframe+0x44/0xa9  
> 
> Ah, by the way. Have you tried to seek for the root cause, why
> a request for such insanely large (at least 4 Mib) skb happens
> in QRTR? I don't believe it's intended to be like this.
> Now I feel that silencing this error with early return isn't
> really correct approach for this.

Right, IIUC Eric suggested we limit the length of the allocation 
to 64KB because that's the max reasonable skb length, and QRTR tun write
results in generating a single skb. That seems like a good approach.

Re: [PATCH net] net: phy: fix save wrong speed and duplex problem if autoneg is on

2021-02-26 Thread Jakub Kicinski

On Fri, 26 Feb 2021 15:44:42 +0800 Huazhong Tan wrote:
> From: Guangbin Huang 
> 
> If phy uses generic driver and autoneg is on, enter command
> "ethtool -s eth0 speed 50" will not change phy speed actually, but
> command "ethtool eth0" shows speed is 50Mb/s because phydev->speed
> has been set to 50 and no update later.
> 
> And duplex setting has same problem too.
> 
> However, if autoneg is on, phy only changes speed and duplex according to
> phydev->advertising, but not phydev->speed and phydev->duplex. So in this
> case, phydev->speed and phydev->duplex don't need to be set in function
> phy_ethtool_ksettings_set() if autoneg is on.

Can we get a Fixes tag for this one? How far back does this behavior
date?

Re: [PATCH v2 1/1] net: fec: ptp: avoid register access when ipg clock is disabled

2021-02-26 Thread Jakub Kicinski

On Fri, 26 Feb 2021 07:23:31 -0800 Richard Cochran wrote:
> On Thu, Feb 25, 2021 at 10:15:16PM +0100, Heiko Thiery wrote:
> > When accessing the timecounter register on an i.MX8MQ the kernel hangs.
> > This is only the case when the interface is down. This can be reproduced
> > by reading with 'phc_ctrl eth0 get'.
> > 
> > Like described in the change in 91c0d987a9788dcc5fe26baafd73bf9242b68900
> > the igp clock is disabled when the interface is down and leads to a
> > system hang.
> > 
> > So we check if the ptp clock status before reading the timecounter
> > register.
> > 
> > Signed-off-by: Heiko Thiery 
> > ---
> > v2:
> >  - add mutex (thanks to Richard)
> > 
> > v3:
> > I did a mistake and did not test properly
> >  - add parenteses
> >  - fix the used variable  

On Fri, 26 Feb 2021 08:22:50 +0100 Heiko Thiery wrote:
> Sorry for the noise. But just realized that I sent a v3 version of the
> patch but forgot to update the subject line (still v2). Should I
> resend it with the correct subject?

No need, looks like patchwork caught the right version.

> Acked-by: Richard Cochran 

Applied, thanks!

Re: [PATCH] net: phy: make mdio_bus_phy_suspend/resume as __maybe_unused

2021-02-26 Thread Jakub Kicinski

On Thu, 25 Feb 2021 23:53:20 +0100 Andrew Lunn wrote:
> On Thu, Feb 25, 2021 at 03:57:27PM +0100, Arnd Bergmann wrote:
> > From: Arnd Bergmann 
> > 
> > When CONFIG_PM_SLEEP is disabled, the compiler warns about unused
> > functions:
> > 
> > drivers/net/phy/phy_device.c:273:12: error: unused function 
> > 'mdio_bus_phy_suspend' [-Werror,-Wunused-function]
> > static int mdio_bus_phy_suspend(struct device *dev)
> > drivers/net/phy/phy_device.c:293:12: error: unused function 
> > 'mdio_bus_phy_resume' [-Werror,-Wunused-function]
> > static int mdio_bus_phy_resume(struct device *dev)
> > 
> > The logic is intentional, so just mark these two as __maybe_unused
> > and remove the incorrect #ifdef.
> > 
> > Fixes: 4c0d2e96ba05 ("net: phy: consider that suspend2ram may cut off PHY 
> > power")
> > Signed-off-by: Arnd Bergmann   
> 
> Reviewed-by: Andrew Lunn 

Applied, thanks!

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 2078 matches

Mail list logo