Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-09 Thread Fastabend, John R
On 2/4/2016 5:12 AM, Jamal Hadi Salim wrote:
> 
> On 16-02-03 01:48 PM, Fastabend, John R wrote:
> 
> BTW: For the record John, I empathize with you that we need to
> move. Please have patience - we are close; lets just get this resolved
> in Seville. I like your patches a lot and would love to just have
> your patches pushed in, but the challenges with community is being able
> to reach some middle ground. We are not as bad as some of the standards
> organizations. I am sure we'll get this resolved by end of next week
> if not, I am %100 in agreement some form of your patches (And Amir's
> need to go in and then we can refactor as needed)

Agreed although I'm a bit worried we are starting to talk about a
single hardware IR. This discussion has always failed in my experience.

> 
>>> 1) "priorities" for filters and some form of "index" for actions is
>>> is needed. I think index (which tends to be a 32 bit value is what
>>> Amir's patches refered to as "cookie" - or at least some hardware
>>> can be used to query the action with). Priorities maybe implicit in
>>> the order in which they are added. And th idea of appending vs
>>> exclusivity vs replace (which  netlink already supports)
>>> is important to worry about (TCAMS tend to assume an append mode
>>> for example).
>>
>> The code denotes add/del/replace already. I'm not sure why a TCAM
>> would assume an append mode but OK maybe that is some API you have
>> the APIs I use don't have these semantics.
>>
> 
> Basically most hardware (or i should say driver implementations of
> mostly TCAMS) allow you to add exactly the same filter as many times
> as you want. They dont really look at what you want to filter on
> and then scream "conflict". IOW, you (user) are responsible for
> conflict resolution at the filter level. The driver sees this blob
> and requests for some index/key from the hardware then just adds it.
> You can then use this key/index to delete/replace etc.
> This is what i meant by "append" mode.
> However if a classifier implementation cares about filter ambiguity
> resolution, then priorities are used. We need to worry about the
> bigger picture.
> 

Sure in other classifiers its used but its not needed in the set I
planned to added it later.

> 
>> For this series using cls_u32 the handle gives you everything you need
>> to put entries in the right table and row. Namely the ht # and order #
>> from 'tc'.
> 
> True - but with a caveat. There are only 2^12 max tables you can
> have for example and up to 2^12 filters per bucket etc.
> 

This is a software limitation as well right? If it hasn't showed up
as a limitation on the software side why would it be an issue here?
Do you have more than 2^12 tables on your devices? If so I guess we
can tack on another 32bits somewhere.

>> Take a look at u32_change and u32_classify its the handle
>> that places the filter into the list and the handle that is matched in
>> classify. We should place the filters in the hardware in the same order
>> that is used by u32_change.
>>
> 
> I can see some parallels, but:
> The nodeid in itself is insufficent for two reasons:
> You cant have more than 2^12 filters per bucket;
> and the nodeid then takes two meanings: a) it is an id
> b) it specifies the order in which things are looked up.
> 
> I think you need to take the u32 address and map it to something in your
> hardware. But at the same time it is important to have the abstraction
> closely emulate your hardware.
> 

IMO the hardware/interface must preserve the same ordering of
filters/hash_Tables/etc. How it does that mapping should be
a driver concern and it can always abort if it fails.

>> Also ran a few tests and can't see how priority works in u32 maybe you
>> can shed some light but as best I can tell it doesn't have any effect
>> on rule execution.
>>
> 
> True.
> u32 doesnt care because it will give you a nodeid if you dont specify
> one. i.e conflict resolution is mapped to you not specifying exactly
> the same ht:bkt:nodeid more than once. And if you will let the
> kernel do it for you (as i am assumming you are saying your hardware
> will) then no need.

Yep. Faithfully offloading u32 here not changing anything except
I do have to abort on some cases with the simpler devices. fm10k for
example can model hash nodes with divisors > 1.

> 
>>>
>>> 2) I like the u32 approach where it makes sense; but sometimes it
>>> doesnt make sense from a usability pov. I work with some ASICs
>>> that have 10 tuples that are  fixed. Yes, a user can describe a policy
>>> with u32 but flower would be more  usable say with flower (both
>>> programmatic and cli)
>>
>> Sure so create a set of offload hooks for flower we don't need only
>> one hardware classifier any more than we would like a single software
>> classifiers.
> 
> 
> Glad to hear that.
> I was a little concerned that despite my love for u32 it was
> going to be _the_ classifier. It doesnt fit for all offload cases
> and sometimes it is because of human 

Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-09 Thread Fastabend, John R
On 2/4/2016 3:19 PM, Pablo Neira Ayuso wrote:
> On Thu, Feb 04, 2016 at 10:16:56AM +0100, Jiri Pirko wrote:
>> Wed, Feb 03, 2016 at 10:27:32AM CET, john.fastab...@gmail.com wrote:
>>>
>>> Also by adding get_parse_graph and set_parse_graph attributes as
>>> in my previous flow_api work we can build programmable devices
>>> and programmatically learn when rules can or can not be loaded
>>> into the hardware. Again future work.
>>>
>>> Any comments/feedback appreciated.

Sorry if you get this twice it doesn't look like my original response
made it to netdev and the laptop I replied on charger blew up.

>>
>> I like this being thin and elegant solution. However, ~2 years ago when I
>> pushed openvswitch kernel datapath offload patchset, people were yelling
>> at me that it is not generic enough solution, that tc has to be able
>> to use the api (Jamal :)), nftables as well.
>

The other problem with OVS is if you have the capabilities to do
wildcard lookups (e.g. TCAM/SRAM/etc) then offloading the exact
match table in OVS is really inefficient use of the resource. You
really want to load the megaflow table into hardware. I just don't
think its a good scheme for what you want.

> I would be glad to join this debate during NetDev 1.1 too.
>

great.

> I think we should provide a solution that allows people uses both
> tc and nftables, this would require a bit of generic infrastructure on
> top of it so we don't restrict users to one single solution, in other
> words, we allow the user to select its own poison.
>
>> Now this patch is making offload strictly tc-based and nobody seems to
>> care :) I do. I think that we might try to find some generic middle
>> layer.

If we can build the universal model for 'tc' and 'nftable' we should
unify them higher in the stack? It doesn't make sense to me for the
driver folks to try and create the unified model for two subsystems
if we don't think its worthwhile in software as well.

>
> I agree and I'll be happy to help to push this ahead. Let's try to sit
> and get together to resolve this.

Great.

>
> See you soon.
>



Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-09 Thread Jamal Hadi Salim

On 16-02-09 06:24 AM, Fastabend, John R wrote:

On 2/4/2016 5:12 AM, Jamal Hadi Salim wrote:


On 16-02-03 01:48 PM, Fastabend, John R wrote:




Basically most hardware (or i should say driver implementations of
mostly TCAMS) allow you to add exactly the same filter as many times
as you want. They dont really look at what you want to filter on
and then scream "conflict". IOW, you (user) are responsible for
conflict resolution at the filter level. The driver sees this blob
and requests for some index/key from the hardware then just adds it.
You can then use this key/index to delete/replace etc.
This is what i meant by "append" mode.
However if a classifier implementation cares about filter ambiguity
resolution, then priorities are used. We need to worry about the
bigger picture.



Sure in other classifiers its used but its not needed in the set I
planned to added it later.



If you leave it open for some other hardware to use we should be fine.




For this series using cls_u32 the handle gives you everything you need
to put entries in the right table and row. Namely the ht # and order #
from 'tc'.


True - but with a caveat. There are only 2^12 max tables you can
have for example and up to 2^12 filters per bucket etc.



This is a software limitation as well right? If it hasn't showed up
as a limitation on the software side why would it be an issue here?
Do you have more than 2^12 tables on your devices? If so I guess we
can tack on another 32bits somewhere.



That handle is used as an "Address" to the 32 bit filter.
Just beware of the semantics the handle has.
It hasnt shown up as a software limitation because the defaults
are good enough for most people. But if you ever want to install
a million rules that can be looked up at a reasonable pps rate
it will become very obvious quickly. I have a sample setup in the
talk tommorow which shows such an example.


I think you need to take the u32 address and map it to something in your
hardware. But at the same time it is important to have the abstraction
closely emulate your hardware.



IMO the hardware/interface must preserve the same ordering of
filters/hash_Tables/etc. How it does that mapping should be
a driver concern and it can always abort if it fails.



Sure.


Also ran a few tests and can't see how priority works in u32 maybe you
can shed some light but as best I can tell it doesn't have any effect
on rule execution.



True.
u32 doesnt care because it will give you a nodeid if you dont specify
one. i.e conflict resolution is mapped to you not specifying exactly
the same ht:bkt:nodeid more than once. And if you will let the
kernel do it for you (as i am assumming you are saying your hardware
will) then no need.


Yep. Faithfully offloading u32 here not changing anything except
I do have to abort on some cases with the simpler devices. fm10k for
example can model hash nodes with divisors > 1.



I wonder if when we get to capabilities we can do this...







My issue is we can map flower onto u32 that is fine and u32 onto
bpf. But we lose a lot of the power of each classifier when we
do this. flower for example is nice because of its simplicity
presumably this translates into faster updates, u32 is great because
we get full parse graph support and hash tables, ebpf is the biggest
beast of all and lets us load arbitrary functions into the device.
All are nice in their own right.



Did i send you my slides? ;->


cheers,
jamal


Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-04 Thread Pablo Neira Ayuso
On Thu, Feb 04, 2016 at 10:16:56AM +0100, Jiri Pirko wrote:
> Wed, Feb 03, 2016 at 10:27:32AM CET, john.fastab...@gmail.com wrote:
> >
> >Also by adding get_parse_graph and set_parse_graph attributes as
> >in my previous flow_api work we can build programmable devices
> >and programmatically learn when rules can or can not be loaded
> >into the hardware. Again future work.
> >
> >Any comments/feedback appreciated.
> 
> I like this being thin and elegant solution. However, ~2 years ago when I
> pushed openvswitch kernel datapath offload patchset, people were yelling
> at me that it is not generic enough solution, that tc has to be able
> to use the api (Jamal :)), nftables as well.

I would be glad to join this debate during NetDev 1.1 too.

I think we should provide a solution that allows people uses both
tc and nftables, this would require a bit of generic infrastructure on
top of it so we don't restrict users to one single solution, in other
words, we allow the user to select its own poison.

> Now this patch is making offload strictly tc-based and nobody seems to
> care :) I do. I think that we might try to find some generic middle layer.

I agree and I'll be happy to help to push this ahead. Let's try to sit
and get together to resolve this.

See you soon.


Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-04 Thread Jamal Hadi Salim


On 16-02-03 01:48 PM, Fastabend, John R wrote:

BTW: For the record John, I empathize with you that we need to
move. Please have patience - we are close; lets just get this resolved
in Seville. I like your patches a lot and would love to just have
your patches pushed in, but the challenges with community is being able
to reach some middle ground. We are not as bad as some of the standards
organizations. I am sure we'll get this resolved by end of next week
if not, I am %100 in agreement some form of your patches (And Amir's
need to go in and then we can refactor as needed)


1) "priorities" for filters and some form of "index" for actions is
is needed. I think index (which tends to be a 32 bit value is what
Amir's patches refered to as "cookie" - or at least some hardware
can be used to query the action with). Priorities maybe implicit in
the order in which they are added. And th idea of appending vs
exclusivity vs replace (which  netlink already supports)
is important to worry about (TCAMS tend to assume an append mode
for example).


The code denotes add/del/replace already. I'm not sure why a TCAM
would assume an append mode but OK maybe that is some API you have
the APIs I use don't have these semantics.



Basically most hardware (or i should say driver implementations of
mostly TCAMS) allow you to add exactly the same filter as many times
as you want. They dont really look at what you want to filter on
and then scream "conflict". IOW, you (user) are responsible for
conflict resolution at the filter level. The driver sees this blob
and requests for some index/key from the hardware then just adds it.
You can then use this key/index to delete/replace etc.
This is what i meant by "append" mode.
However if a classifier implementation cares about filter ambiguity
resolution, then priorities are used. We need to worry about the
bigger picture.



For this series using cls_u32 the handle gives you everything you need
to put entries in the right table and row. Namely the ht # and order #
from 'tc'.


True - but with a caveat. There are only 2^12 max tables you can
have for example and up to 2^12 filters per bucket etc.


Take a look at u32_change and u32_classify its the handle
that places the filter into the list and the handle that is matched in
classify. We should place the filters in the hardware in the same order
that is used by u32_change.



I can see some parallels, but:
The nodeid in itself is insufficent for two reasons:
You cant have more than 2^12 filters per bucket;
and the nodeid then takes two meanings: a) it is an id
b) it specifies the order in which things are looked up.

I think you need to take the u32 address and map it to something in your
hardware. But at the same time it is important to have the abstraction
closely emulate your hardware.


Also ran a few tests and can't see how priority works in u32 maybe you
can shed some light but as best I can tell it doesn't have any effect
on rule execution.



True.
u32 doesnt care because it will give you a nodeid if you dont specify
one. i.e conflict resolution is mapped to you not specifying exactly
the same ht:bkt:nodeid more than once. And if you will let the
kernel do it for you (as i am assumming you are saying your hardware
will) then no need.



2) I like the u32 approach where it makes sense; but sometimes it
doesnt make sense from a usability pov. I work with some ASICs
that have 10 tuples that are  fixed. Yes, a user can describe a policy
with u32 but flower would be more  usable say with flower (both
programmatic and cli)


Sure so create a set of offload hooks for flower we don't need only
one hardware classifier any more than we would like a single software
classifiers.



Glad to hear that.
I was a little concerned that despite my love for u32 it was
going to be _the_ classifier. It doesnt fit for all offload cases
and sometimes it is because of human operators (the 10 tuple
hardware classifier i mentioned earlier).
BTW: Classifier in this case is very wide ranging (a regex hardware
offload for example qualifies).



Again I'm trying to faithfully implement what we have in software
and load that into the hardware. The handle today gives ingress/egres
hook. If you want an all ports hook we should add it to 'tc' software
first and then push that to the hardware not create magic hardware
bits. See I've drank the cool aid software first than hardware.



;-> No disagreement. It felt like a small sensible
change - thats why i suggested it.


4) Why are we forsaking switchdev John?
This is certainly re-usable beyond NICs and SRIOV.



Sure and switchdev can use it just like they use fdb_add and friends.
I just don't want to require switchdev infrastructure on things that
really are not switches. I think Amir indicated he would take a try
at the switchdev integration. If not I'm willing to do it but it
doesn't block this series in any way imo.



Ok. Makes sense.


5)What happened to being both able to hardware and/or software?



Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-04 Thread Jiri Pirko
Wed, Feb 03, 2016 at 10:27:32AM CET, john.fastab...@gmail.com wrote:
>This extends the setup_tc framework so it can support more than just
>the mqprio offload and push other classifiers and qdiscs into the
>hardware. The series here targets the u32 classifier and ixgbe
>driver. I worked out the u32 classifier because it is protocol
>oblivious and aligns with multiple hardware devices I have access
>to. I did an initial implementation on ixgbe because (a) I have one
>in my box (b) its a stable driver and (c) it is relatively simple
>compared to the other devices I have here but still has enough
>flexibility to exercise the features of cls_u32.
>
>I intentionally limited the scope of this series to the basic
>feature set. Specifically this uses a 'big hammer' feature bit
>to do the offload or not. If the bit is set you get offloaded rules
>if it is not then rules will not be offloaded. If we can agree on
>this patch series there are some more patches on my queue we can
>talk about to make the offload decision per rule using flags similar
>to how we do l2 mac updates. Additionally the error strategy can
>be improved to be hard aborting, log and continue, etc. I think
>these are nice to have improvements but shouldn't block this series.
>
>Also by adding get_parse_graph and set_parse_graph attributes as
>in my previous flow_api work we can build programmable devices
>and programmatically learn when rules can or can not be loaded
>into the hardware. Again future work.
>
>Any comments/feedback appreciated.

I like this being thin and elegant solution. However, ~2 years ago when I
pushed openvswitch kernel datapath offload patchset, people were yelling
at me that it is not generic enough solution, that tc has to be able
to use the api (Jamal :)), nftables as well.

Now this patch is making offload strictly tc-based and nobody seems to
care :) I do. I think that we might try to find some generic middle layer.

Let's discuss this more in person next week.

Thanks!


Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-03 Thread Or Gerlitz

On 2/3/2016 12:21 PM, John Fastabend wrote:

Thanks, we will need at least a v2 to fixup some build errors
with various compile flags caught by build_bot and missed by me.

Hi John,

You didn't mark that as RFC... but we said this direction/approach yet
to be talked @ netdev next-week, so.. can you clarify?

I suggest not to rush and asking pulling this, lets have the tc workshop 
beforehand...


Please add to v2 listing of changes from V0/V1 to assist with the review.

thanks,

Or.


[net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-03 Thread John Fastabend
This extends the setup_tc framework so it can support more than just
the mqprio offload and push other classifiers and qdiscs into the
hardware. The series here targets the u32 classifier and ixgbe
driver. I worked out the u32 classifier because it is protocol
oblivious and aligns with multiple hardware devices I have access
to. I did an initial implementation on ixgbe because (a) I have one
in my box (b) its a stable driver and (c) it is relatively simple
compared to the other devices I have here but still has enough
flexibility to exercise the features of cls_u32.

I intentionally limited the scope of this series to the basic
feature set. Specifically this uses a 'big hammer' feature bit
to do the offload or not. If the bit is set you get offloaded rules
if it is not then rules will not be offloaded. If we can agree on
this patch series there are some more patches on my queue we can
talk about to make the offload decision per rule using flags similar
to how we do l2 mac updates. Additionally the error strategy can
be improved to be hard aborting, log and continue, etc. I think
these are nice to have improvements but shouldn't block this series.

Also by adding get_parse_graph and set_parse_graph attributes as
in my previous flow_api work we can build programmable devices
and programmatically learn when rules can or can not be loaded
into the hardware. Again future work.

Any comments/feedback appreciated.

Thanks,
John

---

John Fastabend (7):
  net: rework ndo tc op to consume additional qdisc handle parameter
  net: rework setup_tc ndo op to consume general tc operand
  net: sched: add cls_u32 offload hooks for netdevs
  net: add tc offload feature flag
  net: tc: helper functions to query action types
  net: ixgbe: add minimal parser details for ixgbe
  net: ixgbe: add support for tc_u32 offload


 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c  |8 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h  |2 
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |2 
 drivers/net/ethernet/broadcom/bnxt/bnxt.c|9 +
 drivers/net/ethernet/intel/fm10k/fm10k_netdev.c  |   11 +
 drivers/net/ethernet/intel/i40e/i40e_main.c  |   10 +
 drivers/net/ethernet/intel/ixgbe/ixgbe.h |3 
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |6 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c|  206 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_model.h   |  112 
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c   |   13 +
 drivers/net/ethernet/sfc/efx.h   |3 
 drivers/net/ethernet/sfc/tx.c|   10 +
 drivers/net/ethernet/ti/netcp_core.c |   14 +
 include/linux/netdev_features.h  |3 
 include/linux/netdevice.h|   24 ++-
 include/net/pkt_cls.h|   33 
 include/net/tc_act/tc_gact.h |   16 ++
 net/core/ethtool.c   |1 
 net/sched/cls_u32.c  |   73 
 net/sched/sch_mqprio.c   |8 +
 21 files changed, 541 insertions(+), 26 deletions(-)
 create mode 100644 drivers/net/ethernet/intel/ixgbe/ixgbe_model.h

--
Signature


Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-03 Thread Amir Vadai"
On Wed, Feb 03, 2016 at 01:27:32AM -0800, John Fastabend wrote:
> This extends the setup_tc framework so it can support more than just
> the mqprio offload and push other classifiers and qdiscs into the
> hardware. The series here targets the u32 classifier and ixgbe
> driver. I worked out the u32 classifier because it is protocol
> oblivious and aligns with multiple hardware devices I have access
> to. I did an initial implementation on ixgbe because (a) I have one
> in my box (b) its a stable driver and (c) it is relatively simple
> compared to the other devices I have here but still has enough
> flexibility to exercise the features of cls_u32.
> 
> I intentionally limited the scope of this series to the basic
> feature set. Specifically this uses a 'big hammer' feature bit
> to do the offload or not. If the bit is set you get offloaded rules
> if it is not then rules will not be offloaded. If we can agree on
> this patch series there are some more patches on my queue we can
> talk about to make the offload decision per rule using flags similar
> to how we do l2 mac updates. Additionally the error strategy can
> be improved to be hard aborting, log and continue, etc. I think
> these are nice to have improvements but shouldn't block this series.
> 
> Also by adding get_parse_graph and set_parse_graph attributes as
> in my previous flow_api work we can build programmable devices
> and programmatically learn when rules can or can not be loaded
> into the hardware. Again future work.
> 
> Any comments/feedback appreciated.
> 
> Thanks,
> John
> 
> ---
> 
> John Fastabend (7):
>   net: rework ndo tc op to consume additional qdisc handle parameter
>   net: rework setup_tc ndo op to consume general tc operand
>   net: sched: add cls_u32 offload hooks for netdevs
>   net: add tc offload feature flag
>   net: tc: helper functions to query action types
>   net: ixgbe: add minimal parser details for ixgbe
>   net: ixgbe: add support for tc_u32 offload
> 

Hi John,

Nice work :)

I will add mlx5 support, and see if can live with u32. If not - will
add flower support too.

Amir


Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-03 Thread John Fastabend
On 16-02-03 02:11 AM, Amir Vadai" wrote:
> On Wed, Feb 03, 2016 at 01:27:32AM -0800, John Fastabend wrote:
>> This extends the setup_tc framework so it can support more than just
>> the mqprio offload and push other classifiers and qdiscs into the
>> hardware. The series here targets the u32 classifier and ixgbe
>> driver. I worked out the u32 classifier because it is protocol
>> oblivious and aligns with multiple hardware devices I have access
>> to. I did an initial implementation on ixgbe because (a) I have one
>> in my box (b) its a stable driver and (c) it is relatively simple
>> compared to the other devices I have here but still has enough
>> flexibility to exercise the features of cls_u32.
>>
>> I intentionally limited the scope of this series to the basic
>> feature set. Specifically this uses a 'big hammer' feature bit
>> to do the offload or not. If the bit is set you get offloaded rules
>> if it is not then rules will not be offloaded. If we can agree on
>> this patch series there are some more patches on my queue we can
>> talk about to make the offload decision per rule using flags similar
>> to how we do l2 mac updates. Additionally the error strategy can
>> be improved to be hard aborting, log and continue, etc. I think
>> these are nice to have improvements but shouldn't block this series.
>>
>> Also by adding get_parse_graph and set_parse_graph attributes as
>> in my previous flow_api work we can build programmable devices
>> and programmatically learn when rules can or can not be loaded
>> into the hardware. Again future work.
>>
>> Any comments/feedback appreciated.
>>
>> Thanks,
>> John
>>
>> ---
>>
>> John Fastabend (7):
>>   net: rework ndo tc op to consume additional qdisc handle parameter
>>   net: rework setup_tc ndo op to consume general tc operand
>>   net: sched: add cls_u32 offload hooks for netdevs
>>   net: add tc offload feature flag
>>   net: tc: helper functions to query action types
>>   net: ixgbe: add minimal parser details for ixgbe
>>   net: ixgbe: add support for tc_u32 offload
>>
> 
> Hi John,
> 
> Nice work :)

Thanks, we will need at least a v2 to fixup some build errors
with various compile flags caught by build_bot and missed by me.

> 
> I will add mlx5 support, and see if can live with u32. If not - will
> add flower support too.

That would be great.

Thanks
.John

> 
> Amir
> 



Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-03 Thread Jamal Hadi Salim

On 16-02-03 05:31 AM, Or Gerlitz wrote:

On 2/3/2016 12:21 PM, John Fastabend wrote:

Thanks, we will need at least a v2 to fixup some build errors
with various compile flags caught by build_bot and missed by me.

Hi John,

You didn't mark that as RFC... but we said this direction/approach yet
to be talked @ netdev next-week, so.. can you clarify?

I suggest not to rush and asking pulling this, lets have the tc workshop
beforehand...



Yes, the tc workshop is a good place for this.
I think we can spill some of it into the switchdev workshop (which is a
nice flow since that happens later).

Some comments:
1) "priorities" for filters and some form of "index" for actions is
is needed. I think index (which tends to be a 32 bit value is what
Amir's patches refered to as "cookie" - or at least some hardware
can be used to query the action with). Priorities maybe implicit in
the order in which they are added. And the idea of appending vs
exclusivity vs replace (which  netlink already supports)
is important to worry about (TCAMS tend to assume an append mode
for example).

2) I like the u32 approach where it makes sense; but sometimes it
doesnt make sense from a usability pov. I work with some ASICs
that have 10 tuples that are  fixed. Yes, a user can describe a policy
with u32 but flower would be more  usable say with flower (both
programmatic and cli)

3) The concept of "hook address" is important to be able to express.
Amir's patches seemed to miss that (and John brought it up in an
email). It could be as simple as ifindex + hookid. With ifindex of
0 meaning all ports and maybe hookid of 0 meaning all hooks.
Hook semantics are as mentioned by John (as it stands right now
in/egress)

4) Why are we forsaking switchdev John?
This is certainly re-usable beyond NICs and SRIOV.

5)What happened to being both able to hardware and/or software?

Anyways, I think Seville would be a blast! Come one, come all.

cheers,
jamal




Re: [net-next PATCH 0/7] tc offload for cls_u32 on ixgbe

2016-02-03 Thread Fastabend, John R
On 2/3/2016 4:21 AM, Jamal Hadi Salim wrote:
> On 16-02-03 05:31 AM, Or Gerlitz wrote:
>> On 2/3/2016 12:21 PM, John Fastabend wrote:
>>> Thanks, we will need at least a v2 to fixup some build errors
>>> with various compile flags caught by build_bot and missed by me.
>> Hi John,
>>
>> You didn't mark that as RFC... but we said this direction/approach yet
>> to be talked @ netdev next-week, so.. can you clarify?

Yeah I think this set of patches is ready and it really is what we've
been talking about doing for two or three conferences now. Also I don't
know where "we" decided to talk about this @ netdev or how that became
a prereq for patches. I think this is the correct approach and I am not
seeing any contentious pieces here so why not consider it for inclusion.
Do you have some issue with the approach? I don't recall any when we
talked about this last time.

>>
>> I suggest not to rush and asking pulling this, lets have the tc workshop
>> beforehand...
>>

rush? we've been talking about it for a year+

> 
> Yes, the tc workshop is a good place for this.
> I think we can spill some of it into the switchdev workshop (which is a
> nice flow since that happens later).
> 

Sure but this patch is really the basic stuff we should move on to
some of the more interesting pieces. I have ~30 or so patches behind
this that do the fun stuff like resource allocation, capababilities,
support for the divisor > 1, some new actions, etc.

> Some comments:
> 1) "priorities" for filters and some form of "index" for actions is
> is needed. I think index (which tends to be a 32 bit value is what
> Amir's patches refered to as "cookie" - or at least some hardware
> can be used to query the action with). Priorities maybe implicit in
> the order in which they are added. And th idea of appending vs
> exclusivity vs replace (which  netlink already supports)
> is important to worry about (TCAMS tend to assume an append mode
> for example).

The code denotes add/del/replace already. I'm not sure why a TCAM
would assume an append mode but OK maybe that is some API you have
the APIs I use don't have these semantics.

For this series using cls_u32 the handle gives you everything you need
to put entries in the right table and row. Namely the ht # and order #
from 'tc'. Take a look at u32_change and u32_classify its the handle
that places the filter into the list and the handle that is matched in
classify. We should place the filters in the hardware in the same order
that is used by u32_change.

Also ran a few tests and can't see how priority works in u32 maybe you
can shed some light but as best I can tell it doesn't have any effect
on rule execution.

> 
> 2) I like the u32 approach where it makes sense; but sometimes it
> doesnt make sense from a usability pov. I work with some ASICs
> that have 10 tuples that are  fixed. Yes, a user can describe a policy
> with u32 but flower would be more  usable say with flower (both
> programmatic and cli)

Sure so create a set of offload hooks for flower we don't need only
one hardware classifier any more than we would like a single software
classifiers. I'll send out my flower patches when I get to a real system
I'm on a corporate laptop at the moment.

> 
> 3) The concept of "hook address" is important to be able to express.
> Amir's patches seemed to miss that (and John brought it up in an
> email). It could be as simple as ifindex + hookid. With ifindex of
> 0 meaning all ports and maybe hookid of 0 meaning all hooks.
> Hook semantics are as mentioned by John (as it stands right now
> in/egress)
> 

Again I'm trying to faithfully implement what we have in software
and load that into the hardware. The handle today gives ingress/egres
hook. If you want an all ports hook we should add it to 'tc' software
first and then push that to the hardware not create magic hardware
bits. See I've drank the cool aid software first than hardware.

> 4) Why are we forsaking switchdev John?
> This is certainly re-usable beyond NICs and SRIOV.
> 

Sure and switchdev can use it just like they use fdb_add and friends.
I just don't want to require switchdev infrastructure on things that
really are not switches. I think Amir indicated he would take a try
at the switchdev integration. If not I'm willing to do it but it
doesn't block this series in any way imo.

> 5)What happened to being both able to hardware and/or software?

Follow up patch once we get the basic infrastructure in place with
the big feature flag bit. I have a patch I'm testing for this now
but again I want to move in logical and somewhat minimal sets.

> 
> Anyways, I think Seville would be a blast! Come one, come all.
> 

I'll be there but lets be sure to follow up with this online I
know folks are following this who wont be at Seville and I don't
see any reason to block these patches and stop the thread for a
week or more.

> cheers,
> jamal
> 
>