Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-31 Thread John Hurley
On Wed, May 30, 2018 at 9:29 PM, Jiri Pirko  wrote:
> Wed, May 30, 2018 at 11:26:23AM CEST, john.hur...@netronome.com wrote:
>>On Tue, May 29, 2018 at 11:09 PM, Jiri Pirko  wrote:
>>> Tue, May 29, 2018 at 04:08:48PM CEST, john.hur...@netronome.com wrote:
On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski
 wrote:
> On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote:
>> Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote:
>> >Hi!
>> >
>> >This series from John adds bond offload to the nfp driver.  Patch 5
>> >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
>> >hashing matches that of the software LAG.  This may be unnecessarily
>> >conservative, let's see what LAG maintainers think :)
>>
>> So you need to restrict offload to only certain hash algo? In mlxsw, we
>> just ignore the lag setting and do some hw default hashing. Would not be
>> enough? Note that there's a good reason for it, as you see, in team, the
>> hashing is done in a BPF function and could be totally arbitrary.
>> Your patchset effectively disables team offload for nfp.
>
> My understanding is that the project requirements only called for L3/L4
> hash algorithm offload, hence the temptation to err on the side of
> caution and not offload all the bond configurations.  John can provide
> more details.  Not being able to offload team is unfortunate indeed.

Hi Jiri,
Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm
as this is currently what is supported in fw.
>>>
>>> In mlxsw, a default l3/l4 is used always, no matter what the
>>> bonding/team sets. It is not correct, but it works with team as well.
>>> Perhaps we can have NETDEV_LAG_HASH_UNKNOWN to indicate to the driver to
>>> do some default? That would make the "team" offload functional.
>>>
>>
>>yes, I would agree with that.
>>Thanks
>
> Okay, would you please adjust your driver?
>

Will do.
Thanks, Jiri

> I will teka care of mlxsw bits.
>
> Thanks!
>
>>
Hopefully this will change as fw features are expanded.
I understand the issue this presents with offloading team.
Perhaps resorting to a default hw hash for team is acceptable.
John


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-30 Thread Jiri Pirko
Wed, May 30, 2018 at 11:26:23AM CEST, john.hur...@netronome.com wrote:
>On Tue, May 29, 2018 at 11:09 PM, Jiri Pirko  wrote:
>> Tue, May 29, 2018 at 04:08:48PM CEST, john.hur...@netronome.com wrote:
>>>On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski
>>> wrote:
 On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote:
> Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote:
> >Hi!
> >
> >This series from John adds bond offload to the nfp driver.  Patch 5
> >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
> >hashing matches that of the software LAG.  This may be unnecessarily
> >conservative, let's see what LAG maintainers think :)
>
> So you need to restrict offload to only certain hash algo? In mlxsw, we
> just ignore the lag setting and do some hw default hashing. Would not be
> enough? Note that there's a good reason for it, as you see, in team, the
> hashing is done in a BPF function and could be totally arbitrary.
> Your patchset effectively disables team offload for nfp.

 My understanding is that the project requirements only called for L3/L4
 hash algorithm offload, hence the temptation to err on the side of
 caution and not offload all the bond configurations.  John can provide
 more details.  Not being able to offload team is unfortunate indeed.
>>>
>>>Hi Jiri,
>>>Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm
>>>as this is currently what is supported in fw.
>>
>> In mlxsw, a default l3/l4 is used always, no matter what the
>> bonding/team sets. It is not correct, but it works with team as well.
>> Perhaps we can have NETDEV_LAG_HASH_UNKNOWN to indicate to the driver to
>> do some default? That would make the "team" offload functional.
>>
>
>yes, I would agree with that.
>Thanks

Okay, would you please adjust your driver?

I will teka care of mlxsw bits.

Thanks!

>
>>>Hopefully this will change as fw features are expanded.
>>>I understand the issue this presents with offloading team.
>>>Perhaps resorting to a default hw hash for team is acceptable.
>>>John


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-30 Thread John Hurley
On Tue, May 29, 2018 at 11:09 PM, Jiri Pirko  wrote:
> Tue, May 29, 2018 at 04:08:48PM CEST, john.hur...@netronome.com wrote:
>>On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski
>> wrote:
>>> On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote:
 Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote:
 >Hi!
 >
 >This series from John adds bond offload to the nfp driver.  Patch 5
 >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
 >hashing matches that of the software LAG.  This may be unnecessarily
 >conservative, let's see what LAG maintainers think :)

 So you need to restrict offload to only certain hash algo? In mlxsw, we
 just ignore the lag setting and do some hw default hashing. Would not be
 enough? Note that there's a good reason for it, as you see, in team, the
 hashing is done in a BPF function and could be totally arbitrary.
 Your patchset effectively disables team offload for nfp.
>>>
>>> My understanding is that the project requirements only called for L3/L4
>>> hash algorithm offload, hence the temptation to err on the side of
>>> caution and not offload all the bond configurations.  John can provide
>>> more details.  Not being able to offload team is unfortunate indeed.
>>
>>Hi Jiri,
>>Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm
>>as this is currently what is supported in fw.
>
> In mlxsw, a default l3/l4 is used always, no matter what the
> bonding/team sets. It is not correct, but it works with team as well.
> Perhaps we can have NETDEV_LAG_HASH_UNKNOWN to indicate to the driver to
> do some default? That would make the "team" offload functional.
>

yes, I would agree with that.
Thanks

>>Hopefully this will change as fw features are expanded.
>>I understand the issue this presents with offloading team.
>>Perhaps resorting to a default hw hash for team is acceptable.
>>John


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-29 Thread Jiri Pirko
Tue, May 29, 2018 at 04:08:48PM CEST, john.hur...@netronome.com wrote:
>On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski
> wrote:
>> On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote:
>>> Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote:
>>> >Hi!
>>> >
>>> >This series from John adds bond offload to the nfp driver.  Patch 5
>>> >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
>>> >hashing matches that of the software LAG.  This may be unnecessarily
>>> >conservative, let's see what LAG maintainers think :)
>>>
>>> So you need to restrict offload to only certain hash algo? In mlxsw, we
>>> just ignore the lag setting and do some hw default hashing. Would not be
>>> enough? Note that there's a good reason for it, as you see, in team, the
>>> hashing is done in a BPF function and could be totally arbitrary.
>>> Your patchset effectively disables team offload for nfp.
>>
>> My understanding is that the project requirements only called for L3/L4
>> hash algorithm offload, hence the temptation to err on the side of
>> caution and not offload all the bond configurations.  John can provide
>> more details.  Not being able to offload team is unfortunate indeed.
>
>Hi Jiri,
>Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm
>as this is currently what is supported in fw.

In mlxsw, a default l3/l4 is used always, no matter what the
bonding/team sets. It is not correct, but it works with team as well.
Perhaps we can have NETDEV_LAG_HASH_UNKNOWN to indicate to the driver to
do some default? That would make the "team" offload functional.

>Hopefully this will change as fw features are expanded.
>I understand the issue this presents with offloading team.
>Perhaps resorting to a default hw hash for team is acceptable.
>John


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-29 Thread John Hurley
On Sat, May 26, 2018 at 3:47 AM, Jakub Kicinski
 wrote:
> On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote:
>> Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote:
>> >Hi!
>> >
>> >This series from John adds bond offload to the nfp driver.  Patch 5
>> >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
>> >hashing matches that of the software LAG.  This may be unnecessarily
>> >conservative, let's see what LAG maintainers think :)
>>
>> So you need to restrict offload to only certain hash algo? In mlxsw, we
>> just ignore the lag setting and do some hw default hashing. Would not be
>> enough? Note that there's a good reason for it, as you see, in team, the
>> hashing is done in a BPF function and could be totally arbitrary.
>> Your patchset effectively disables team offload for nfp.
>
> My understanding is that the project requirements only called for L3/L4
> hash algorithm offload, hence the temptation to err on the side of
> caution and not offload all the bond configurations.  John can provide
> more details.  Not being able to offload team is unfortunate indeed.

Hi Jiri,
Yes, as Jakub mentions, we restrict ourselves to L3/L4 hash algorithm
as this is currently what is supported in fw.
Hopefully this will change as fw features are expanded.
I understand the issue this presents with offloading team.
Perhaps resorting to a default hw hash for team is acceptable.
John


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-25 Thread Jakub Kicinski
On Fri, 25 May 2018 08:48:09 +0200, Jiri Pirko wrote:
> Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote:
> >Hi!
> >
> >This series from John adds bond offload to the nfp driver.  Patch 5
> >exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
> >hashing matches that of the software LAG.  This may be unnecessarily
> >conservative, let's see what LAG maintainers think :)  
> 
> So you need to restrict offload to only certain hash algo? In mlxsw, we
> just ignore the lag setting and do some hw default hashing. Would not be
> enough? Note that there's a good reason for it, as you see, in team, the
> hashing is done in a BPF function and could be totally arbitrary.
> Your patchset effectively disables team offload for nfp.

My understanding is that the project requirements only called for L3/L4
hash algorithm offload, hence the temptation to err on the side of
caution and not offload all the bond configurations.  John can provide
more details.  Not being able to offload team is unfortunate indeed.


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-25 Thread Jiri Pirko
Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicin...@netronome.com wrote:
>Hi!
>
>This series from John adds bond offload to the nfp driver.  Patch 5
>exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
>hashing matches that of the software LAG.  This may be unnecessarily
>conservative, let's see what LAG maintainers think :)

So you need to restrict offload to only certain hash algo? In mlxsw, we
just ignore the lag setting and do some hw default hashing. Would not be
enough? Note that there's a good reason for it, as you see, in team, the
hashing is done in a BPF function and could be totally arbitrary.
Your patchset effectively disables team offload for nfp.


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-24 Thread David Miller
From: Jakub Kicinski 
Date: Wed, 23 May 2018 19:22:47 -0700

> This series from John adds bond offload to the nfp driver.  Patch 5
> exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
> hashing matches that of the software LAG.  This may be unnecessarily
> conservative, let's see what LAG maintainers think :)

Series applied, thanks.


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-24 Thread Jakub Kicinski
On Thu, 24 May 2018 22:26:03 +0300, Or Gerlitz wrote:
> On Thu, May 24, 2018 at 9:49 PM, Jakub Kicinski
>  wrote:
> > On Thu, 24 May 2018 20:04:56 +0300, Or Gerlitz wrote:  
> 
> >> Does this apply also to non-uplink representors? if yes, what is the use 
> >> case?
> >>
> >> We are looking on supporting uplink lag in sriov switchdev scheme - we 
> >> refer to
> >> it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are 
> >> actually
> >> subject to HA and/or LAG - I wasn't sure if/how you limit this series
> >> to uplink reprs  
> >
> > I don't think we have a limitation on the output port within the LAG.
> > But keep in mind in our devices all ports belong to the same eswitch/PF
> > so bonding uplink ports is generally sufficient, I'm not sure VF
> > bonding adds much HA.  IOW AFAIK we support VF bonding because HW can do
> > it easily, not because we have a strong use case for it.  
> 
> To make it clear, vf lag is code name for uplink lag, I think we want
> to say that we provide the VM a lagged VF, anyway, again, the lag is
> done on the uplink reps not on the vf reps.

Ah, ack, same use case here!

> Unlike the uplink port which is physical one, the vf vport is virtual
> one, what could be the benefit to bond two vports?

I'm not sure what it could be :)  We can also bond an uplink and a VF!
All outputs on the nfp are working same, so why limit ourselves if we
can do it? :)


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-24 Thread Or Gerlitz
On Thu, May 24, 2018 at 9:49 PM, Jakub Kicinski
 wrote:
> On Thu, 24 May 2018 20:04:56 +0300, Or Gerlitz wrote:

>> Does this apply also to non-uplink representors? if yes, what is the use 
>> case?
>>
>> We are looking on supporting uplink lag in sriov switchdev scheme - we refer 
>> to
>> it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are actually
>> subject to HA and/or LAG - I wasn't sure if/how you limit this series
>> to uplink reprs
>
> I don't think we have a limitation on the output port within the LAG.
> But keep in mind in our devices all ports belong to the same eswitch/PF
> so bonding uplink ports is generally sufficient, I'm not sure VF
> bonding adds much HA.  IOW AFAIK we support VF bonding because HW can do
> it easily, not because we have a strong use case for it.


To make it clear, vf lag is code name for uplink lag, I think we want
to say that
we provide the VM a lagged VF, anyway, again, the lag is done on the uplink reps
not on the vf reps. Unlike the uplink port which is physical one, the
vf vport is virtual
one, what could be the benefit to bond two vports?


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-24 Thread Jakub Kicinski
On Thu, 24 May 2018 11:23:00 -0700, Samudrala, Sridhar wrote:
> On 5/24/2018 10:04 AM, Or Gerlitz wrote:
> > On Thu, May 24, 2018 at 5:22 AM, Jakub Kicinski
> >  wrote:  
> >> Hi!
> >>
> >> This series from John adds bond offload to the nfp driver.  Patch 5
> >> exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
> >> hashing matches that of the software LAG.  This may be unnecessarily
> >> conservative, let's see what LAG maintainers think :)
> >>
> >> John says:
> >>
> >> This patchset sets up the infrastructure and offloads output actions for
> >> when a TC flower rule attempts to egress a packet to a LAG port.
> >>
> >> Firstly it adds some of the infrastructure required to the flower app and
> >> to the nfp core. This includes the ability to change the MAC address of a
> >> repr, a function for combining lookup and write to a FW symbol, and the
> >> addition of private data to a repr on a per app basis.
> >>
> >> Patch 6 continues by implementing notifiers that track Linux bonds and
> >> communicates to the FW those which enslave reprs, along with the current
> >> state of reprs within the bond.
> >>
> >> Patch 7 ensures bonds are synchronised with FW by receiving and acting
> >> upon cmsgs sent to the kernel. These may request that a bond message is
> >> retransmitted when FW can process it, or may request a full sync of the
> >> bonds defined in the kernel.
> >>
> >> Patch 8 offloads a flower action when that action requires egressing to a
> >> pre-defined Linux bond.  
> > Does this apply also to non-uplink representors? if yes, what is the use 
> > case?
> >
> > We are looking on supporting uplink lag in sriov switchdev scheme - we 
> > refer to
> > it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are 
> > actually
> > subject to HA and/or LAG - I wasn't sure if/how you limit this series
> > to uplink reprs  
> 
> Also, does this patchset support offloading LAG when using vxlan based 
> tunnels?
> 
> When using OVS offloading with vxlan,  the encap rule that gets offloaded via 
> tc-flower
> has egress port as vxlan device and the decap rule has the in-port as vxlan 
> device, not
> the actual egress port.  How are you addressing this issue?

It is very much on our radar, I think we will send out a related RFC
later today :)

But to be honest I think you can just install an egress callback on the
bond and that will pretty much work today.  You don't have to "own" the
egress device to install a egdev callback on it.


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-24 Thread Jakub Kicinski
On Thu, 24 May 2018 20:04:56 +0300, Or Gerlitz wrote:
> On Thu, May 24, 2018 at 5:22 AM, Jakub Kicinski wrote:
> > Hi!
> >
> > This series from John adds bond offload to the nfp driver.  Patch 5
> > exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
> > hashing matches that of the software LAG.  This may be unnecessarily
> > conservative, let's see what LAG maintainers think :)
> >
> > John says:
> >
> > This patchset sets up the infrastructure and offloads output actions for
> > when a TC flower rule attempts to egress a packet to a LAG port.
> >
> > Firstly it adds some of the infrastructure required to the flower app and
> > to the nfp core. This includes the ability to change the MAC address of a
> > repr, a function for combining lookup and write to a FW symbol, and the
> > addition of private data to a repr on a per app basis.
> >
> > Patch 6 continues by implementing notifiers that track Linux bonds and
> > communicates to the FW those which enslave reprs, along with the current
> > state of reprs within the bond.
> >
> > Patch 7 ensures bonds are synchronised with FW by receiving and acting
> > upon cmsgs sent to the kernel. These may request that a bond message is
> > retransmitted when FW can process it, or may request a full sync of the
> > bonds defined in the kernel.
> >
> > Patch 8 offloads a flower action when that action requires egressing to a
> > pre-defined Linux bond.  
> 
> Does this apply also to non-uplink representors? if yes, what is the use case?
> 
> We are looking on supporting uplink lag in sriov switchdev scheme - we refer 
> to
> it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are actually
> subject to HA and/or LAG - I wasn't sure if/how you limit this series
> to uplink reprs

I don't think we have a limitation on the output port within the LAG.
But keep in mind in our devices all ports belong to the same eswitch/PF
so bonding uplink ports is generally sufficient, I'm not sure VF
bonding adds much HA.  IOW AFAIK we support VF bonding because HW can do
it easily, not because we have a strong use case for it.


Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-24 Thread Samudrala, Sridhar


On 5/24/2018 10:04 AM, Or Gerlitz wrote:

On Thu, May 24, 2018 at 5:22 AM, Jakub Kicinski
 wrote:

Hi!

This series from John adds bond offload to the nfp driver.  Patch 5
exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
hashing matches that of the software LAG.  This may be unnecessarily
conservative, let's see what LAG maintainers think :)

John says:

This patchset sets up the infrastructure and offloads output actions for
when a TC flower rule attempts to egress a packet to a LAG port.

Firstly it adds some of the infrastructure required to the flower app and
to the nfp core. This includes the ability to change the MAC address of a
repr, a function for combining lookup and write to a FW symbol, and the
addition of private data to a repr on a per app basis.

Patch 6 continues by implementing notifiers that track Linux bonds and
communicates to the FW those which enslave reprs, along with the current
state of reprs within the bond.

Patch 7 ensures bonds are synchronised with FW by receiving and acting
upon cmsgs sent to the kernel. These may request that a bond message is
retransmitted when FW can process it, or may request a full sync of the
bonds defined in the kernel.

Patch 8 offloads a flower action when that action requires egressing to a
pre-defined Linux bond.

Does this apply also to non-uplink representors? if yes, what is the use case?

We are looking on supporting uplink lag in sriov switchdev scheme - we refer to
it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are actually
subject to HA and/or LAG - I wasn't sure if/how you limit this series
to uplink reprs


Also, does this patchset support offloading LAG when using vxlan based tunnels?

When using OVS offloading with vxlan,  the encap rule that gets offloaded via 
tc-flower
has egress port as vxlan device and the decap rule has the in-port as vxlan 
device, not
the actual egress port.  How are you addressing this issue?








Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress

2018-05-24 Thread Or Gerlitz
On Thu, May 24, 2018 at 5:22 AM, Jakub Kicinski
 wrote:
> Hi!
>
> This series from John adds bond offload to the nfp driver.  Patch 5
> exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
> hashing matches that of the software LAG.  This may be unnecessarily
> conservative, let's see what LAG maintainers think :)
>
> John says:
>
> This patchset sets up the infrastructure and offloads output actions for
> when a TC flower rule attempts to egress a packet to a LAG port.
>
> Firstly it adds some of the infrastructure required to the flower app and
> to the nfp core. This includes the ability to change the MAC address of a
> repr, a function for combining lookup and write to a FW symbol, and the
> addition of private data to a repr on a per app basis.
>
> Patch 6 continues by implementing notifiers that track Linux bonds and
> communicates to the FW those which enslave reprs, along with the current
> state of reprs within the bond.
>
> Patch 7 ensures bonds are synchronised with FW by receiving and acting
> upon cmsgs sent to the kernel. These may request that a bond message is
> retransmitted when FW can process it, or may request a full sync of the
> bonds defined in the kernel.
>
> Patch 8 offloads a flower action when that action requires egressing to a
> pre-defined Linux bond.

Does this apply also to non-uplink representors? if yes, what is the use case?

We are looking on supporting uplink lag in sriov switchdev scheme - we refer to
it as "vf lag" -- b/c the netdev and rdma devices seen by the VF are actually
subject to HA and/or LAG - I wasn't sure if/how you limit this series
to uplink reprs