Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Wed, May 30, 2018 at 9:29 PM, Jiri Pirko wrote: > Wed, May 30, 2018 at 11:26:23AM CEST, john.hur...@netronome.com wrote: >>On Tue, May 29, 2018 at 11:09 PM, Jiri Pirko wrote: >>> Tue, May 29, 2018 at 04:08:48PM CEST, john.hur...@netronome.com wrote: On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski wrote: > On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote: >> Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote: >> >Hi! >> > >> >This series from John adds bond offload to the nfp driver. Patch 5 >> >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp >> >hashing matches that of the software LAG. This may be unnecessarily >> >conservative, let's see what LAG maintainers think :) >> >> So you need to restrict offload to only certain hash algo? In mlxsw, we >> just ignore the lag setting and do some hw default hashing. Would not be >> enough? Note that there's a good reason for it, as you see, in team, the >> hashing is done in a BPF function and could be totally arbitrary. >> Your patchset effectively disables team offload for nfp. > > My understanding is that the project requirements only called for L3/L4 > hash algorithm offload, hence the temptation to err on the side of > caution and not offload all the bond configurations. John can provide > more details. Not being able to offload team is unfortunate indeed. Hi Jiri, Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm as this is currently what is supported in fw. >>> >>> In mlxsw, a default l3/l4 is used always, no matter what the >>> bonding/team sets. It is not correct, but it works with team as well. >>> Perhaps we can have NETDEV_LAG_HASH_UNKNOWN to indicate to the driver to >>> do some default? That would make the "team" offload functional. >>> >> >>yes, I would agree with that. >>Thanks > > Okay, would you please adjust your driver? > Will do. Thanks, Jiri > I will teka care of mlxsw bits. > > Thanks! > >> Hopefully this will change as fw features are expanded. I understand the issue this presents with offloading team. Perhaps resorting to a default hw hash for team is acceptable. John
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
Wed, May 30, 2018 at 11:26:23AM CEST, john.hur...@netronome.com wrote: >On Tue, May 29, 2018 at 11:09 PM, Jiri Pirko wrote: >> Tue, May 29, 2018 at 04:08:48PM CEST, john.hur...@netronome.com wrote: >>>On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski >>> wrote: On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote: > Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote: > >Hi! > > > >This series from John adds bond offload to the nfp driver. Patch 5 > >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp > >hashing matches that of the software LAG. This may be unnecessarily > >conservative, let's see what LAG maintainers think :) > > So you need to restrict offload to only certain hash algo? In mlxsw, we > just ignore the lag setting and do some hw default hashing. Would not be > enough? Note that there's a good reason for it, as you see, in team, the > hashing is done in a BPF function and could be totally arbitrary. > Your patchset effectively disables team offload for nfp. My understanding is that the project requirements only called for L3/L4 hash algorithm offload, hence the temptation to err on the side of caution and not offload all the bond configurations. John can provide more details. Not being able to offload team is unfortunate indeed. >>> >>>Hi Jiri, >>>Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm >>>as this is currently what is supported in fw. >> >> In mlxsw, a default l3/l4 is used always, no matter what the >> bonding/team sets. It is not correct, but it works with team as well. >> Perhaps we can have NETDEV_LAG_HASH_UNKNOWN to indicate to the driver to >> do some default? That would make the "team" offload functional. >> > >yes, I would agree with that. >Thanks Okay, would you please adjust your driver? I will teka care of mlxsw bits. Thanks! > >>>Hopefully this will change as fw features are expanded. >>>I understand the issue this presents with offloading team. >>>Perhaps resorting to a default hw hash for team is acceptable. >>>John
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Tue, May 29, 2018 at 11:09 PM, Jiri Pirko wrote: > Tue, May 29, 2018 at 04:08:48PM CEST, john.hur...@netronome.com wrote: >>On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski >> wrote: >>> On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote: Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote: >Hi! > >This series from John adds bond offload to the nfp driver. Patch 5 >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp >hashing matches that of the software LAG. This may be unnecessarily >conservative, let's see what LAG maintainers think :) So you need to restrict offload to only certain hash algo? In mlxsw, we just ignore the lag setting and do some hw default hashing. Would not be enough? Note that there's a good reason for it, as you see, in team, the hashing is done in a BPF function and could be totally arbitrary. Your patchset effectively disables team offload for nfp. >>> >>> My understanding is that the project requirements only called for L3/L4 >>> hash algorithm offload, hence the temptation to err on the side of >>> caution and not offload all the bond configurations. John can provide >>> more details. Not being able to offload team is unfortunate indeed. >> >>Hi Jiri, >>Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm >>as this is currently what is supported in fw. > > In mlxsw, a default l3/l4 is used always, no matter what the > bonding/team sets. It is not correct, but it works with team as well. > Perhaps we can have NETDEV_LAG_HASH_UNKNOWN to indicate to the driver to > do some default? That would make the "team" offload functional. > yes, I would agree with that. Thanks >>Hopefully this will change as fw features are expanded. >>I understand the issue this presents with offloading team. >>Perhaps resorting to a default hw hash for team is acceptable. >>John
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
Tue, May 29, 2018 at 04:08:48PM CEST, john.hur...@netronome.com wrote: >On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski > wrote: >> On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote: >>> Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote: >>> >Hi! >>> > >>> >This series from John adds bond offload to the nfp driver. Patch 5 >>> >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp >>> >hashing matches that of the software LAG. This may be unnecessarily >>> >conservative, let's see what LAG maintainers think :) >>> >>> So you need to restrict offload to only certain hash algo? In mlxsw, we >>> just ignore the lag setting and do some hw default hashing. Would not be >>> enough? Note that there's a good reason for it, as you see, in team, the >>> hashing is done in a BPF function and could be totally arbitrary. >>> Your patchset effectively disables team offload for nfp. >> >> My understanding is that the project requirements only called for L3/L4 >> hash algorithm offload, hence the temptation to err on the side of >> caution and not offload all the bond configurations. John can provide >> more details. Not being able to offload team is unfortunate indeed. > >Hi Jiri, >Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm >as this is currently what is supported in fw. In mlxsw, a default l3/l4 is used always, no matter what the bonding/team sets. It is not correct, but it works with team as well. Perhaps we can have NETDEV_LAG_HASH_UNKNOWN to indicate to the driver to do some default? That would make the "team" offload functional. >Hopefully this will change as fw features are expanded. >I understand the issue this presents with offloading team. >Perhaps resorting to a default hw hash for team is acceptable. >John
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski wrote: > On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote: >> Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote: >> >Hi! >> > >> >This series from John adds bond offload to the nfp driver. Patch 5 >> >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp >> >hashing matches that of the software LAG. This may be unnecessarily >> >conservative, let's see what LAG maintainers think :) >> >> So you need to restrict offload to only certain hash algo? In mlxsw, we >> just ignore the lag setting and do some hw default hashing. Would not be >> enough? Note that there's a good reason for it, as you see, in team, the >> hashing is done in a BPF function and could be totally arbitrary. >> Your patchset effectively disables team offload for nfp. > > My understanding is that the project requirements only called for L3/L4 > hash algorithm offload, hence the temptation to err on the side of > caution and not offload all the bond configurations. John can provide > more details. Not being able to offload team is unfortunate indeed. Hi Jiri, Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm as this is currently what is supported in fw. Hopefully this will change as fw features are expanded. I understand the issue this presents with offloading team. Perhaps resorting to a default hw hash for team is acceptable. John
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote: > Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote: > >Hi! > > > >This series from John adds bond offload to the nfp driver. Patch 5 > >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp > >hashing matches that of the software LAG. This may be unnecessarily > >conservative, let's see what LAG maintainers think :) > > So you need to restrict offload to only certain hash algo? In mlxsw, we > just ignore the lag setting and do some hw default hashing. Would not be > enough? Note that there's a good reason for it, as you see, in team, the > hashing is done in a BPF function and could be totally arbitrary. > Your patchset effectively disables team offload for nfp. My understanding is that the project requirements only called for L3/L4 hash algorithm offload, hence the temptation to err on the side of caution and not offload all the bond configurations. John can provide more details. Not being able to offload team is unfortunate indeed.
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote: >Hi! > >This series from John adds bond offload to the nfp driver. Patch 5 >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp >hashing matches that of the software LAG. This may be unnecessarily >conservative, let's see what LAG maintainers think :) So you need to restrict offload to only certain hash algo? In mlxsw, we just ignore the lag setting and do some hw default hashing. Would not be enough? Note that there's a good reason for it, as you see, in team, the hashing is done in a BPF function and could be totally arbitrary. Your patchset effectively disables team offload for nfp.
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
From: Jakub Kicinski Date: Wed, 23 May 2018 19:22:47 -0700 > This series from John adds bond offload to the nfp driver. Patch 5 > exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp > hashing matches that of the software LAG. This may be unnecessarily > conservative, let's see what LAG maintainers think :) Series applied, thanks.
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Thu, 24 May 2018 22:26:03 +0300, Or Gerlitz wrote: > On Thu, May 24, 2018 at 9:49 PM, Jakub Kicinski > wrote: > > On Thu, 24 May 2018 20:04:56 +0300, Or Gerlitz wrote: > > >> Does this apply also to non-uplink representors? if yes, what is the use > >> case? > >> > >> We are looking on supporting uplink lag in sriov switchdev scheme - we > >> refer to > >> it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are > >> actually > >> subject to HA and/or LAG - I wasn't sure if/how you limit this series > >> to uplink reprs > > > > I don't think we have a limitation on the output port within the LAG. > > But keep in mind in our devices all ports belong to the same eswitch/PF > > so bonding uplink ports is generally sufficient, I'm not sure VF > > bonding adds much HA. IOW AFAIK we support VF bonding because HW can do > > it easily, not because we have a strong use case for it. > > To make it clear, vf lag is code name for uplink lag, I think we want > to say that we provide the VM a lagged VF, anyway, again, the lag is > done on the uplink reps not on the vf reps. Ah, ack, same use case here! > Unlike the uplink port which is physical one, the vf vport is virtual > one, what could be the benefit to bond two vports? I'm not sure what it could be :) We can also bond an uplink and a VF! All outputs on the nfp are working same, so why limit ourselves if we can do it? :)
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Thu, May 24, 2018 at 9:49 PM, Jakub Kicinski wrote: > On Thu, 24 May 2018 20:04:56 +0300, Or Gerlitz wrote: >> Does this apply also to non-uplink representors? if yes, what is the use >> case? >> >> We are looking on supporting uplink lag in sriov switchdev scheme - we refer >> to >> it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are actually >> subject to HA and/or LAG - I wasn't sure if/how you limit this series >> to uplink reprs > > I don't think we have a limitation on the output port within the LAG. > But keep in mind in our devices all ports belong to the same eswitch/PF > so bonding uplink ports is generally sufficient, I'm not sure VF > bonding adds much HA. IOW AFAIK we support VF bonding because HW can do > it easily, not because we have a strong use case for it. To make it clear, vf lag is code name for uplink lag, I think we want to say that we provide the VM a lagged VF, anyway, again, the lag is done on the uplink reps not on the vf reps. Unlike the uplink port which is physical one, the vf vport is virtual one, what could be the benefit to bond two vports?
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Thu, 24 May 2018 11:23:00 -0700, Samudrala, Sridhar wrote: > On 5/24/2018 10:04 AM, Or Gerlitz wrote: > > On Thu, May 24, 2018 at 5:22 AM, Jakub Kicinski > > wrote: > >> Hi! > >> > >> This series from John adds bond offload to the nfp driver. Patch 5 > >> exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp > >> hashing matches that of the software LAG. This may be unnecessarily > >> conservative, let's see what LAG maintainers think :) > >> > >> John says: > >> > >> This patchset sets up the infrastructure and offloads output actions for > >> when a TC flower rule attempts to egress a packet to a LAG port. > >> > >> Firstly it adds some of the infrastructure required to the flower app and > >> to the nfp core. This includes the ability to change the MAC address of a > >> repr, a function for combining lookup and write to a FW symbol, and the > >> addition of private data to a repr on a per app basis. > >> > >> Patch 6 continues by implementing notifiers that track Linux bonds and > >> communicates to the FW those which enslave reprs, along with the current > >> state of reprs within the bond. > >> > >> Patch 7 ensures bonds are synchronised with FW by receiving and acting > >> upon cmsgs sent to the kernel. These may request that a bond message is > >> retransmitted when FW can process it, or may request a full sync of the > >> bonds defined in the kernel. > >> > >> Patch 8 offloads a flower action when that action requires egressing to a > >> pre-defined Linux bond. > > Does this apply also to non-uplink representors? if yes, what is the use > > case? > > > > We are looking on supporting uplink lag in sriov switchdev scheme - we > > refer to > > it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are > > actually > > subject to HA and/or LAG - I wasn't sure if/how you limit this series > > to uplink reprs > > Also, does this patchset support offloading LAG when using vxlan based > tunnels? > > When using OVS offloading with vxlan, the encap rule that gets offloaded via > tc-flower > has egress port as vxlan device and the decap rule has the in-port as vxlan > device, not > the actual egress port. How are you addressing this issue? It is very much on our radar, I think we will send out a related RFC later today :) But to be honest I think you can just install an egress callback on the bond and that will pretty much work today. You don't have to "own" the egress device to install a egdev callback on it.
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Thu, 24 May 2018 20:04:56 +0300, Or Gerlitz wrote: > On Thu, May 24, 2018 at 5:22 AM, Jakub Kicinski wrote: > > Hi! > > > > This series from John adds bond offload to the nfp driver. Patch 5 > > exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp > > hashing matches that of the software LAG. This may be unnecessarily > > conservative, let's see what LAG maintainers think :) > > > > John says: > > > > This patchset sets up the infrastructure and offloads output actions for > > when a TC flower rule attempts to egress a packet to a LAG port. > > > > Firstly it adds some of the infrastructure required to the flower app and > > to the nfp core. This includes the ability to change the MAC address of a > > repr, a function for combining lookup and write to a FW symbol, and the > > addition of private data to a repr on a per app basis. > > > > Patch 6 continues by implementing notifiers that track Linux bonds and > > communicates to the FW those which enslave reprs, along with the current > > state of reprs within the bond. > > > > Patch 7 ensures bonds are synchronised with FW by receiving and acting > > upon cmsgs sent to the kernel. These may request that a bond message is > > retransmitted when FW can process it, or may request a full sync of the > > bonds defined in the kernel. > > > > Patch 8 offloads a flower action when that action requires egressing to a > > pre-defined Linux bond. > > Does this apply also to non-uplink representors? if yes, what is the use case? > > We are looking on supporting uplink lag in sriov switchdev scheme - we refer > to > it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are actually > subject to HA and/or LAG - I wasn't sure if/how you limit this series > to uplink reprs I don't think we have a limitation on the output port within the LAG. But keep in mind in our devices all ports belong to the same eswitch/PF so bonding uplink ports is generally sufficient, I'm not sure VF bonding adds much HA. IOW AFAIK we support VF bonding because HW can do it easily, not because we have a strong use case for it.
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On 5/24/2018 10:04 AM, Or Gerlitz wrote: On Thu, May 24, 2018 at 5:22 AM, Jakub Kicinski wrote: Hi! This series from John adds bond offload to the nfp driver. Patch 5 exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp hashing matches that of the software LAG. This may be unnecessarily conservative, let's see what LAG maintainers think :) John says: This patchset sets up the infrastructure and offloads output actions for when a TC flower rule attempts to egress a packet to a LAG port. Firstly it adds some of the infrastructure required to the flower app and to the nfp core. This includes the ability to change the MAC address of a repr, a function for combining lookup and write to a FW symbol, and the addition of private data to a repr on a per app basis. Patch 6 continues by implementing notifiers that track Linux bonds and communicates to the FW those which enslave reprs, along with the current state of reprs within the bond. Patch 7 ensures bonds are synchronised with FW by receiving and acting upon cmsgs sent to the kernel. These may request that a bond message is retransmitted when FW can process it, or may request a full sync of the bonds defined in the kernel. Patch 8 offloads a flower action when that action requires egressing to a pre-defined Linux bond. Does this apply also to non-uplink representors? if yes, what is the use case? We are looking on supporting uplink lag in sriov switchdev scheme - we refer to it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are actually subject to HA and/or LAG - I wasn't sure if/how you limit this series to uplink reprs Also, does this patchset support offloading LAG when using vxlan based tunnels? When using OVS offloading with vxlan, the encap rule that gets offloaded via tc-flower has egress port as vxlan device and the decap rule has the in-port as vxlan device, not the actual egress port. How are you addressing this issue?
Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
On Thu, May 24, 2018 at 5:22 AM, Jakub Kicinski wrote: > Hi! > > This series from John adds bond offload to the nfp driver. Patch 5 > exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp > hashing matches that of the software LAG. This may be unnecessarily > conservative, let's see what LAG maintainers think :) > > John says: > > This patchset sets up the infrastructure and offloads output actions for > when a TC flower rule attempts to egress a packet to a LAG port. > > Firstly it adds some of the infrastructure required to the flower app and > to the nfp core. This includes the ability to change the MAC address of a > repr, a function for combining lookup and write to a FW symbol, and the > addition of private data to a repr on a per app basis. > > Patch 6 continues by implementing notifiers that track Linux bonds and > communicates to the FW those which enslave reprs, along with the current > state of reprs within the bond. > > Patch 7 ensures bonds are synchronised with FW by receiving and acting > upon cmsgs sent to the kernel. These may request that a bond message is > retransmitted when FW can process it, or may request a full sync of the > bonds defined in the kernel. > > Patch 8 offloads a flower action when that action requires egressing to a > pre-defined Linux bond. Does this apply also to non-uplink representors? if yes, what is the use case? We are looking on supporting uplink lag in sriov switchdev scheme - we refer to it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are actually subject to HA and/or LAG - I wasn't sure if/how you limit this series to uplink reprs