Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-25 Thread Jiri Pirko
Thu, Jan 25, 2018 at 04:38:41PM CET, d...@cumulusnetworks.com wrote:
>On 1/25/18 8:24 AM, Jiri Pirko wrote:
> Now I remember. You wrote it independently and but needed iproute2 be a
> delivery vehicle. It uses none of the common infrastructure from
> iproute2. Could we make this more difficult 

 Feel free to rewrite it to use lib/libnetlink.c. Should not be that
 hard. Note that at the time I was pushing devlink userspace, tipc also
 used libmnl as a part of iproute2, so devlink was not the first one.
 That is why I decided not to rewrite.

 As of the rest of the "common infrastructure", what exactly do you
 have in mind?

>>>
>>> This is what I am getting at. Apparently, these resource patches for
>>> devlink require a patched libmnl to work properly. It is wrong for
>> 
>> What do you mean, "work properly". If there is no patched libmnl, only
>> thing what happens is that the extack won't be processed and shown.
>> That is the same behaviour as if you compile iproute2 package without
>> libmnl - all works, used just don't see extack messages.
>> 
>> I don't see any problem in that. Do you?
>
>I have libmnl installed. I build iproute2. I don't get extended ack
>messages on devlink failures. That is a problem. That is building a
>command that is known not to work as it should given the build dependencies.

Known to work, but not for devlink. But nevermind.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-25 Thread David Ahern
On 1/25/18 8:24 AM, Jiri Pirko wrote:
 Now I remember. You wrote it independently and but needed iproute2 be a
 delivery vehicle. It uses none of the common infrastructure from
 iproute2. Could we make this more difficult 
>>>
>>> Feel free to rewrite it to use lib/libnetlink.c. Should not be that
>>> hard. Note that at the time I was pushing devlink userspace, tipc also
>>> used libmnl as a part of iproute2, so devlink was not the first one.
>>> That is why I decided not to rewrite.
>>>
>>> As of the rest of the "common infrastructure", what exactly do you
>>> have in mind?
>>>
>>
>> This is what I am getting at. Apparently, these resource patches for
>> devlink require a patched libmnl to work properly. It is wrong for
> 
> What do you mean, "work properly". If there is no patched libmnl, only
> thing what happens is that the extack won't be processed and shown.
> That is the same behaviour as if you compile iproute2 package without
> libmnl - all works, used just don't see extack messages.
> 
> I don't see any problem in that. Do you?

I have libmnl installed. I build iproute2. I don't get extended ack
messages on devlink failures. That is a problem. That is building a
command that is known not to work as it should given the build dependencies.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-25 Thread Jiri Pirko
Thu, Jan 04, 2018 at 05:58:51PM CET, d...@cumulusnetworks.com wrote:
>On 1/3/18 11:36 AM, Jiri Pirko wrote:
>> Wed, Jan 03, 2018 at 07:29:46PM CET, d...@cumulusnetworks.com wrote:
>>> On 1/3/18 11:17 AM, Jiri Pirko wrote:
 Wed, Jan 03, 2018 at 07:14:16PM CET, d...@cumulusnetworks.com wrote:
> On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
>> As I stated this is a user-space bug which I fixed, and updated my repo
>> so please pull. Devlink uses mnl,and currently mnl does not support
>> extended ack. I added support for this in my local ver of libmnl:
>>
>> https://github.com/arkadis/libmnl.git
>>
>> On branch master, so you can check it out. Besides this bugs, which were
>> userspace, can please specify what are the pending problems from your
>> point of view? Thanks!
>
> devlink is in iproute2 package and it has extack support. See 'git log
> lib/libnetlink.c'

 Dave, devlink uses libmnl.

>>>
>>> Now I remember. You wrote it independently and but needed iproute2 be a
>>> delivery vehicle. It uses none of the common infrastructure from
>>> iproute2. Could we make this more difficult 
>> 
>> Feel free to rewrite it to use lib/libnetlink.c. Should not be that
>> hard. Note that at the time I was pushing devlink userspace, tipc also
>> used libmnl as a part of iproute2, so devlink was not the first one.
>> That is why I decided not to rewrite.
>> 
>> As of the rest of the "common infrastructure", what exactly do you
>> have in mind?
>> 
>
>This is what I am getting at. Apparently, these resource patches for
>devlink require a patched libmnl to work properly. It is wrong for

What do you mean, "work properly". If there is no patched libmnl, only
thing what happens is that the extack won't be processed and shown.
That is the same behaviour as if you compile iproute2 package without
libmnl - all works, used just don't see extack messages.

I don't see any problem in that. Do you?


>iproute2 to accept this patch and to build a devlink command that we
>know does not work. That means iproute2 needs a dependency on a specific
>version of libmnl - a version which does not yet exist because those
>changes have not been accepted and libmnl version released. Adding that
>dependency is going to inconvenience to all current users of the
>iproute2 repo.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-04 Thread David Ahern
On 1/4/18 9:13 AM, Arkadi Sharshevsky wrote:
 Also, it seems like the occ of 0 is wrong since we know from past
 responses that if I set linear to 0 all of networking breaks.

Ok, this was a David bug. I was running ifreload after the devlink
reload command, but all of my connections to the switch are through
breakout ports and the ifreload was not running the devlink port split
command. Doing that before the ifreload and all is fine.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-04 Thread David Miller
From: David Ahern 
Date: Thu, 4 Jan 2018 09:58:51 -0700

> This is what I am getting at. Apparently, these resource patches for
> devlink require a patched libmnl to work properly. It is wrong for
> iproute2 to accept this patch and to build a devlink command that we
> know does not work. That means iproute2 needs a dependency on a specific
> version of libmnl - a version which does not yet exist because those
> changes have not been accepted and libmnl version released. Adding that
> dependency is going to inconvenience to all current users of the
> iproute2 repo.

+1


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-04 Thread David Ahern
On 1/3/18 11:36 AM, Jiri Pirko wrote:
> Wed, Jan 03, 2018 at 07:29:46PM CET, d...@cumulusnetworks.com wrote:
>> On 1/3/18 11:17 AM, Jiri Pirko wrote:
>>> Wed, Jan 03, 2018 at 07:14:16PM CET, d...@cumulusnetworks.com wrote:
 On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
> As I stated this is a user-space bug which I fixed, and updated my repo
> so please pull. Devlink uses mnl,and currently mnl does not support
> extended ack. I added support for this in my local ver of libmnl:
>
> https://github.com/arkadis/libmnl.git
>
> On branch master, so you can check it out. Besides this bugs, which were
> userspace, can please specify what are the pending problems from your
> point of view? Thanks!

 devlink is in iproute2 package and it has extack support. See 'git log
 lib/libnetlink.c'
>>>
>>> Dave, devlink uses libmnl.
>>>
>>
>> Now I remember. You wrote it independently and but needed iproute2 be a
>> delivery vehicle. It uses none of the common infrastructure from
>> iproute2. Could we make this more difficult 
> 
> Feel free to rewrite it to use lib/libnetlink.c. Should not be that
> hard. Note that at the time I was pushing devlink userspace, tipc also
> used libmnl as a part of iproute2, so devlink was not the first one.
> That is why I decided not to rewrite.
> 
> As of the rest of the "common infrastructure", what exactly do you
> have in mind?
> 

This is what I am getting at. Apparently, these resource patches for
devlink require a patched libmnl to work properly. It is wrong for
iproute2 to accept this patch and to build a devlink command that we
know does not work. That means iproute2 needs a dependency on a specific
version of libmnl - a version which does not yet exist because those
changes have not been accepted and libmnl version released. Adding that
dependency is going to inconvenience to all current users of the
iproute2 repo.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-04 Thread David Ahern
On 1/4/18 9:13 AM, Arkadi Sharshevsky wrote:
> 
> 
> On 01/04/2018 05:58 PM, David Ahern wrote:
>> On 1/4/18 2:24 AM, Arkadi Sharshevsky wrote:
 Again, my comments all stem from user experience.

 Can you explain what "double_word" means for a unit? I would expect a
 units to be kB or count (or items or entries).

>>>
>>> Double word is 64 bit, dont understand why this is confusing.
>>
>> As Andrew pointed out, double word can have a range of sizes.
>>
>> To my point here, a 'unit' for a number should not be the number of bits
>> it is represented by. I believe all of the kvd sizes are in entries
>> (ie., a linear size of 98304 means I can have 98,304 entries in that
>> resource).
>>
>>>
 $ devlink resource show pci/:03:00.0
 pci/:03:00.0:
   name kvd size 245760 unit double_word size_valid true
   resources:
 name linear size 98304 occ 0 unit double_word
 name hash_double size 60416 unit double_word
 name hash_single size 87040 unit double_word

 While that is confusing here from the userspace command it goes hand in
 hand with patch 2 and:

 +enum devlink_resource_unit {
 +  DEVLINK_RESOURCE_UNIT_DOUBLE_WORD,
 +};


 Also, it seems like the occ of 0 is wrong since we know from past
 responses that if I set linear to 0 all of networking breaks.
>>>
>>> You are clearly misunderstanding what is occupancy of the resource
>>> and what kvd linear is good for. As I mentioned in the last response
>>> kvd linear is mapped to adjacency table. So in case its 0 no nexthop
>>> routes could be configured, this information is provided by the
>>> dpipe<-->resource.
>>>
>>> Occupancy means how much of the resource is used right now, why is
>>> this wrong? and how its related to the size 0 exactly?
>>
>> The summary line above shows the current kvd/linear occupancy is '0'.
>> That suggests my L3 only deployment is not using any kvd/linear which
>> means I can set its allocation to 0 and devote more kvd resources to the
>> hash regions.
>>
>> But, as I pointed out in previous responses I can not set linear to 0.
>> If I do all of networking breaks. That suggests that kvd/linear is used
>> by more networking entities than you are currently reporting. Hence,
>> telling me the kvd/linear occupancy is 0 is wrong.
>>
>> I believe the stems from the how you are determining occupancy and the
>> fact that not all tables have been implemented. Showing the current
>> occupancy of a resource is very helpful, so it should be part of the API.
>>
>> However, until mlxsw shows a proper value (either by implementing all
>> dpipe tables or changing how it is calculated) it should not be
>> returning that attribute. As it stands you are giving a user invalid data.
>>
> 
> Wait, the occupancy is very precise it goes directly to the linear
> allocator inside the driver and gives exact current usage. You assumed
> it goes to the dpipe tables which is incorrect.

I am trying *guess* based on this discussion as to why it is 0. In past
responses you push that the tables expose how the resource is used.

> 
> The linear kvd memory is used by:
> 1. The adjacency table is responsible for the ECMP and nexthop
>resolution.
> 2. ACL flexible actions blocks (dpipe will contain table for this).
> 
> I dont know what you configured but in case you got some remote
> routes with a nexthop the occ should not be zero.
> 
> If your router contains only local routes and dont use ACL you can
> shrink it to zero, no problem.

My current setup is an L3 only deployment, no ACLs and no bridges.
We are meeting in 15 minutes to discuss this design; let's use some of
the time for debugging this problem.


> 
>>>



 How does a user learn the granularity of a resource:

 $ devlink resource set pci/:03:00.0 path /kvd/hash_double size 5
 Error: mlxsw_spectrum: resource set with wrong granularity.

 Try again with 51000 and then 52000 and ... Why not export the
 granularity read-only? I don't see it in the proposed UAPI.

>>>
>>> I would like more adding the granularity size to the extack string
>>> instead of adding this to UAPI. The user will try to update once
>>> and will get the required granularity in the error message.
>>
>> A user should not have to run a command to get an error message to know
>> essential data to running a command with the right value. Further, do
>> not assume 'user' is a person or the devlink command.
>>
>> Since granularity is essential to running a valid command, it should be
>> an attribute for each resource.
>>
> 
> This is also your approach towards the min/max for resource sizes?
> Am I correct?

yes.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-04 Thread Arkadi Sharshevsky


On 01/04/2018 05:58 PM, David Ahern wrote:
> On 1/4/18 2:24 AM, Arkadi Sharshevsky wrote:
>>> Again, my comments all stem from user experience.
>>>
>>> Can you explain what "double_word" means for a unit? I would expect a
>>> units to be kB or count (or items or entries).
>>>
>>
>> Double word is 64 bit, dont understand why this is confusing.
> 
> As Andrew pointed out, double word can have a range of sizes.
> 
> To my point here, a 'unit' for a number should not be the number of bits
> it is represented by. I believe all of the kvd sizes are in entries
> (ie., a linear size of 98304 means I can have 98,304 entries in that
> resource).
> 
>>
>>> $ devlink resource show pci/:03:00.0
>>> pci/:03:00.0:
>>>   name kvd size 245760 unit double_word size_valid true
>>>   resources:
>>> name linear size 98304 occ 0 unit double_word
>>> name hash_double size 60416 unit double_word
>>> name hash_single size 87040 unit double_word
>>>
>>> While that is confusing here from the userspace command it goes hand in
>>> hand with patch 2 and:
>>>
>>> +enum devlink_resource_unit {
>>> +   DEVLINK_RESOURCE_UNIT_DOUBLE_WORD,
>>> +};
>>>
>>>
>>> Also, it seems like the occ of 0 is wrong since we know from past
>>> responses that if I set linear to 0 all of networking breaks.
>>
>> You are clearly misunderstanding what is occupancy of the resource
>> and what kvd linear is good for. As I mentioned in the last response
>> kvd linear is mapped to adjacency table. So in case its 0 no nexthop
>> routes could be configured, this information is provided by the
>> dpipe<-->resource.
>>
>> Occupancy means how much of the resource is used right now, why is
>> this wrong? and how its related to the size 0 exactly?
> 
> The summary line above shows the current kvd/linear occupancy is '0'.
> That suggests my L3 only deployment is not using any kvd/linear which
> means I can set its allocation to 0 and devote more kvd resources to the
> hash regions.
> 
> But, as I pointed out in previous responses I can not set linear to 0.
> If I do all of networking breaks. That suggests that kvd/linear is used
> by more networking entities than you are currently reporting. Hence,
> telling me the kvd/linear occupancy is 0 is wrong.
> 
> I believe the stems from the how you are determining occupancy and the
> fact that not all tables have been implemented. Showing the current
> occupancy of a resource is very helpful, so it should be part of the API.
> 
> However, until mlxsw shows a proper value (either by implementing all
> dpipe tables or changing how it is calculated) it should not be
> returning that attribute. As it stands you are giving a user invalid data.
> 

Wait, the occupancy is very precise it goes directly to the linear
allocator inside the driver and gives exact current usage. You assumed
it goes to the dpipe tables which is incorrect.

The linear kvd memory is used by:
1. The adjacency table is responsible for the ECMP and nexthop
   resolution.
2. ACL flexible actions blocks (dpipe will contain table for this).

I dont know what you configured but in case you got some remote
routes with a nexthop the occ should not be zero.

If your router contains only local routes and dont use ACL you can
shrink it to zero, no problem.

>>
>>>
>>>
>>>
>>> How does a user learn the granularity of a resource:
>>>
>>> $ devlink resource set pci/:03:00.0 path /kvd/hash_double size 5
>>> Error: mlxsw_spectrum: resource set with wrong granularity.
>>>
>>> Try again with 51000 and then 52000 and ... Why not export the
>>> granularity read-only? I don't see it in the proposed UAPI.
>>>
>>
>> I would like more adding the granularity size to the extack string
>> instead of adding this to UAPI. The user will try to update once
>> and will get the required granularity in the error message.
> 
> A user should not have to run a command to get an error message to know
> essential data to running a command with the right value. Further, do
> not assume 'user' is a person or the devlink command.
> 
> Since granularity is essential to running a valid command, it should be
> an attribute for each resource.
> 

This is also your approach towards the min/max for resource sizes?
Am I correct?

> 
>>
>>>
>>> And then on the reload:
>>>
>>> $ devlink reload pci/:03:00.0
>>> Error: devlink: resources size validation failed.
>>>
>>> Since the reload is not doing any resource sizing that error message is
>>> confusing. Maybe something like "Sum of the resource components exceeds
>>> total size."
>>>
>>
>> No problem, sounds better.
>>
>>>
>>> Finally, I still contend a 1-line description of each of the resources
>>> goes a long way to improving the user friendliness of this set.
>>>
>>
>> Strongly against it.
>>
> 


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-04 Thread David Ahern
On 1/4/18 2:24 AM, Arkadi Sharshevsky wrote:
>> Again, my comments all stem from user experience.
>>
>> Can you explain what "double_word" means for a unit? I would expect a
>> units to be kB or count (or items or entries).
>>
> 
> Double word is 64 bit, dont understand why this is confusing.

As Andrew pointed out, double word can have a range of sizes.

To my point here, a 'unit' for a number should not be the number of bits
it is represented by. I believe all of the kvd sizes are in entries
(ie., a linear size of 98304 means I can have 98,304 entries in that
resource).

> 
>> $ devlink resource show pci/:03:00.0
>> pci/:03:00.0:
>>   name kvd size 245760 unit double_word size_valid true
>>   resources:
>> name linear size 98304 occ 0 unit double_word
>> name hash_double size 60416 unit double_word
>> name hash_single size 87040 unit double_word
>>
>> While that is confusing here from the userspace command it goes hand in
>> hand with patch 2 and:
>>
>> +enum devlink_resource_unit {
>> +DEVLINK_RESOURCE_UNIT_DOUBLE_WORD,
>> +};
>>
>>
>> Also, it seems like the occ of 0 is wrong since we know from past
>> responses that if I set linear to 0 all of networking breaks.
> 
> You are clearly misunderstanding what is occupancy of the resource
> and what kvd linear is good for. As I mentioned in the last response
> kvd linear is mapped to adjacency table. So in case its 0 no nexthop
> routes could be configured, this information is provided by the
> dpipe<-->resource.
> 
> Occupancy means how much of the resource is used right now, why is
> this wrong? and how its related to the size 0 exactly?

The summary line above shows the current kvd/linear occupancy is '0'.
That suggests my L3 only deployment is not using any kvd/linear which
means I can set its allocation to 0 and devote more kvd resources to the
hash regions.

But, as I pointed out in previous responses I can not set linear to 0.
If I do all of networking breaks. That suggests that kvd/linear is used
by more networking entities than you are currently reporting. Hence,
telling me the kvd/linear occupancy is 0 is wrong.

I believe the stems from the how you are determining occupancy and the
fact that not all tables have been implemented. Showing the current
occupancy of a resource is very helpful, so it should be part of the API.

However, until mlxsw shows a proper value (either by implementing all
dpipe tables or changing how it is calculated) it should not be
returning that attribute. As it stands you are giving a user invalid data.

> 
>>
>>
>>
>> How does a user learn the granularity of a resource:
>>
>> $ devlink resource set pci/:03:00.0 path /kvd/hash_double size 5
>> Error: mlxsw_spectrum: resource set with wrong granularity.
>>
>> Try again with 51000 and then 52000 and ... Why not export the
>> granularity read-only? I don't see it in the proposed UAPI.
>>
> 
> I would like more adding the granularity size to the extack string
> instead of adding this to UAPI. The user will try to update once
> and will get the required granularity in the error message.

A user should not have to run a command to get an error message to know
essential data to running a command with the right value. Further, do
not assume 'user' is a person or the devlink command.

Since granularity is essential to running a valid command, it should be
an attribute for each resource.


> 
>>
>> And then on the reload:
>>
>> $ devlink reload pci/:03:00.0
>> Error: devlink: resources size validation failed.
>>
>> Since the reload is not doing any resource sizing that error message is
>> confusing. Maybe something like "Sum of the resource components exceeds
>> total size."
>>
> 
> No problem, sounds better.
> 
>>
>> Finally, I still contend a 1-line description of each of the resources
>> goes a long way to improving the user friendliness of this set.
>>
> 
> Strongly against it.
> 



Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-04 Thread Andrew Lunn
> Double word is 64 bit, dont understand why this is confusing.

In an ASCI, the definition of a word can be quite flexible. I've seen
designs using 14 bit words, since 14 bits was all that was needed to
represent the data to be held. I've also seen a 16 bit word used to
hold a signed value, with the binary point before the last nibble, so
it could hold -2047.9375 to +2047.9375. I might have that wrong. It
gave me a headache at the time, but the synthesizer had no such
problems.

Why not use u8, u16, u14, u32, u64, which we can all understand
without confusion.

Andrew


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-04 Thread Arkadi Sharshevsky


On 01/04/2018 04:28 AM, David Ahern wrote:
> On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
>>
>>
>> On 01/02/2018 08:05 PM, David Ahern wrote:
>>> On 1/1/18 7:58 AM, Arkadi Sharshevsky wrote:

 Just to summarize the current fixes required:

 1. ERIF dpipe table size is reporting wrong size. More precisely the
ERIF table does not take rifs, so it should not be linked to the rif
bank resource (is not part of this patchset, future extension).
 2. Extended ACK user-space bug.
 3. ABI documentation- Not sure we agreed upon it, Jiri?

 If I missed something please respond. Nothing of the fixes mentioned
 above is relevant for this patchset actually.

>>>
>>> Can you fix the userspace command and then we come back to what else is
>>> needed? Right now, it is hard to tell what is a user space bug and what
>>> is a kernel space bug.
>>>
>>> For example:
>>> $ devlink resource set pci/:03:00.0 path /kvd/linear size 1
>>> $ devlink resource show pci/:03:00.0
>>> pci/:03:00.0:
>>>   name kvd size 245760 size_valid true
>>>   resources:
>>> name linear size 98304 occ 0
>>> name hash_double size 60416
>>> name hash_single size 87040
>>>
>>> The set command did not fail, yet there is no size_new arg in the output
>>> like there is for this change:
>>>
>>> $ devlink resource set pci/:03:00.0 path /kvd/linear size 0
>>> $ devlink resource show pci/:03:00.0
>>> pci/:03:00.0:
>>>   name kvd size 245760 size_valid true
>>>   resources:
>>> name linear size 98304 size_new 0 occ 0
>>> name hash_double size 60416
>>> name hash_single size 87040
>>>
>>
>> As I stated this is a user-space bug which I fixed, and updated my repo
>> so please pull. Devlink uses mnl,and currently mnl does not support
>> extended ack. I added support for this in my local ver of libmnl:
>>
>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Farkadis%2Flibmnl.git=02%7C01%7Carkadis%40mellanox.com%7C9a369b54cdec48a5e1d208d5531adbdb%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636506297202356822=6KOkBz5PHqu6z8nlexSdggZj42LE4geiYVFA%2BgcgaPE%3D=0
>>
>> On branch master, so you can check it out. Besides this bugs, which were
>> userspace, can please specify what are the pending problems from your
>> point of view? Thanks!
>>
> 
> Again, my comments all stem from user experience.
> 
> Can you explain what "double_word" means for a unit? I would expect a
> units to be kB or count (or items or entries).
>

Double word is 64 bit, dont understand why this is confusing.

> $ devlink resource show pci/:03:00.0
> pci/:03:00.0:
>   name kvd size 245760 unit double_word size_valid true
>   resources:
> name linear size 98304 occ 0 unit double_word
> name hash_double size 60416 unit double_word
> name hash_single size 87040 unit double_word
> 
> While that is confusing here from the userspace command it goes hand in
> hand with patch 2 and:
> 
> +enum devlink_resource_unit {
> + DEVLINK_RESOURCE_UNIT_DOUBLE_WORD,
> +};
> 
> 
> Also, it seems like the occ of 0 is wrong since we know from past
> responses that if I set linear to 0 all of networking breaks.

You are clearly misunderstanding what is occupancy of the resource
and what kvd linear is good for. As I mentioned in the last response
kvd linear is mapped to adjacency table. So in case its 0 no nexthop
routes could be configured, this information is provided by the
dpipe<-->resource.

Occupancy means how much of the resource is used right now, why is
this wrong? and how its related to the size 0 exactly?

> 
> 
> 
> How does a user learn the granularity of a resource:
> 
> $ devlink resource set pci/:03:00.0 path /kvd/hash_double size 5
> Error: mlxsw_spectrum: resource set with wrong granularity.
> 
> Try again with 51000 and then 52000 and ... Why not export the
> granularity read-only? I don't see it in the proposed UAPI.
> 

I would like more adding the granularity size to the extack string
instead of adding this to UAPI. The user will try to update once
and will get the required granularity in the error message.

> 
> And then on the reload:
> 
> $ devlink reload pci/:03:00.0
> Error: devlink: resources size validation failed.
> 
> Since the reload is not doing any resource sizing that error message is
> confusing. Maybe something like "Sum of the resource components exceeds
> total size."
> 

No problem, sounds better.

> 
> Finally, I still contend a 1-line description of each of the resources
> goes a long way to improving the user friendliness of this set.
>

Strongly against it.



Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-03 Thread David Ahern
On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
> 
> 
> On 01/02/2018 08:05 PM, David Ahern wrote:
>> On 1/1/18 7:58 AM, Arkadi Sharshevsky wrote:
>>>
>>> Just to summarize the current fixes required:
>>>
>>> 1. ERIF dpipe table size is reporting wrong size. More precisely the
>>>ERIF table does not take rifs, so it should not be linked to the rif
>>>bank resource (is not part of this patchset, future extension).
>>> 2. Extended ACK user-space bug.
>>> 3. ABI documentation- Not sure we agreed upon it, Jiri?
>>>
>>> If I missed something please respond. Nothing of the fixes mentioned
>>> above is relevant for this patchset actually.
>>>
>>
>> Can you fix the userspace command and then we come back to what else is
>> needed? Right now, it is hard to tell what is a user space bug and what
>> is a kernel space bug.
>>
>> For example:
>> $ devlink resource set pci/:03:00.0 path /kvd/linear size 1
>> $ devlink resource show pci/:03:00.0
>> pci/:03:00.0:
>>   name kvd size 245760 size_valid true
>>   resources:
>> name linear size 98304 occ 0
>> name hash_double size 60416
>> name hash_single size 87040
>>
>> The set command did not fail, yet there is no size_new arg in the output
>> like there is for this change:
>>
>> $ devlink resource set pci/:03:00.0 path /kvd/linear size 0
>> $ devlink resource show pci/:03:00.0
>> pci/:03:00.0:
>>   name kvd size 245760 size_valid true
>>   resources:
>> name linear size 98304 size_new 0 occ 0
>> name hash_double size 60416
>> name hash_single size 87040
>>
> 
> As I stated this is a user-space bug which I fixed, and updated my repo
> so please pull. Devlink uses mnl,and currently mnl does not support
> extended ack. I added support for this in my local ver of libmnl:
> 
> https://github.com/arkadis/libmnl.git
> 
> On branch master, so you can check it out. Besides this bugs, which were
> userspace, can please specify what are the pending problems from your
> point of view? Thanks!
> 

Again, my comments all stem from user experience.

Can you explain what "double_word" means for a unit? I would expect a
units to be kB or count (or items or entries).

$ devlink resource show pci/:03:00.0
pci/:03:00.0:
  name kvd size 245760 unit double_word size_valid true
  resources:
name linear size 98304 occ 0 unit double_word
name hash_double size 60416 unit double_word
name hash_single size 87040 unit double_word

While that is confusing here from the userspace command it goes hand in
hand with patch 2 and:

+enum devlink_resource_unit {
+   DEVLINK_RESOURCE_UNIT_DOUBLE_WORD,
+};


Also, it seems like the occ of 0 is wrong since we know from past
responses that if I set linear to 0 all of networking breaks.



How does a user learn the granularity of a resource:

$ devlink resource set pci/:03:00.0 path /kvd/hash_double size 5
Error: mlxsw_spectrum: resource set with wrong granularity.

Try again with 51000 and then 52000 and ... Why not export the
granularity read-only? I don't see it in the proposed UAPI.


And then on the reload:

$ devlink reload pci/:03:00.0
Error: devlink: resources size validation failed.

Since the reload is not doing any resource sizing that error message is
confusing. Maybe something like "Sum of the resource components exceeds
total size."


Finally, I still contend a 1-line description of each of the resources
goes a long way to improving the user friendliness of this set.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-03 Thread Arkadi Sharshevsky


On 01/03/2018 08:29 PM, David Ahern wrote:
> On 1/3/18 11:17 AM, Jiri Pirko wrote:
>> Wed, Jan 03, 2018 at 07:14:16PM CET, d...@cumulusnetworks.com wrote:
>>> On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
 As I stated this is a user-space bug which I fixed, and updated my repo
 so please pull. Devlink uses mnl,and currently mnl does not support
 extended ack. I added support for this in my local ver of libmnl:

 https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Farkadis%2Flibmnl.git=02%7C01%7Carkadis%40mellanox.com%7C5c86b6240eb84459c6ae08d552d7f9a4%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636506009929977440=sgrNzMhPwe63BIVxexZTjl%2FXqW51kpuRiHVhTDNaa70%3D=0

 On branch master, so you can check it out. Besides this bugs, which were
 userspace, can please specify what are the pending problems from your
 point of view? Thanks!
>>>
>>> devlink is in iproute2 package and it has extack support. See 'git log
>>> lib/libnetlink.c'
>>
>> Dave, devlink uses libmnl.
>>
> 
> Now I remember. You wrote it independently and but needed iproute2 be a
> delivery vehicle. It uses none of the common infrastructure from
> iproute2. Could we make this more difficult 
> 
> Sometime in the next day I will jump through the hoops to get a proper
> devlink command.
> 

This actually was very confusing, I think the extack should be
handled by libmnl and iproute should use mnl_cb_run() routines
and not to implement its own. That way we could both benefit
from that.

You actually do use libmnl in libnetlink.c only for parsing
the headers, and its a dependency for extack handling.

I see this as a completely independent user space issue, which
doesn't have to do anything with this patchset. Not to mention
that everything is working right now.











Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-03 Thread Jiri Pirko
Wed, Jan 03, 2018 at 07:29:46PM CET, d...@cumulusnetworks.com wrote:
>On 1/3/18 11:17 AM, Jiri Pirko wrote:
>> Wed, Jan 03, 2018 at 07:14:16PM CET, d...@cumulusnetworks.com wrote:
>>> On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
 As I stated this is a user-space bug which I fixed, and updated my repo
 so please pull. Devlink uses mnl,and currently mnl does not support
 extended ack. I added support for this in my local ver of libmnl:

 https://github.com/arkadis/libmnl.git

 On branch master, so you can check it out. Besides this bugs, which were
 userspace, can please specify what are the pending problems from your
 point of view? Thanks!
>>>
>>> devlink is in iproute2 package and it has extack support. See 'git log
>>> lib/libnetlink.c'
>> 
>> Dave, devlink uses libmnl.
>> 
>
>Now I remember. You wrote it independently and but needed iproute2 be a
>delivery vehicle. It uses none of the common infrastructure from
>iproute2. Could we make this more difficult 

Feel free to rewrite it to use lib/libnetlink.c. Should not be that
hard. Note that at the time I was pushing devlink userspace, tipc also
used libmnl as a part of iproute2, so devlink was not the first one.
That is why I decided not to rewrite.

As of the rest of the "common infrastructure", what exactly do you
have in mind?

>
>Sometime in the next day I will jump through the hoops to get a proper
>devlink command.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-03 Thread David Ahern
On 1/3/18 11:17 AM, Jiri Pirko wrote:
> Wed, Jan 03, 2018 at 07:14:16PM CET, d...@cumulusnetworks.com wrote:
>> On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
>>> As I stated this is a user-space bug which I fixed, and updated my repo
>>> so please pull. Devlink uses mnl,and currently mnl does not support
>>> extended ack. I added support for this in my local ver of libmnl:
>>>
>>> https://github.com/arkadis/libmnl.git
>>>
>>> On branch master, so you can check it out. Besides this bugs, which were
>>> userspace, can please specify what are the pending problems from your
>>> point of view? Thanks!
>>
>> devlink is in iproute2 package and it has extack support. See 'git log
>> lib/libnetlink.c'
> 
> Dave, devlink uses libmnl.
> 

Now I remember. You wrote it independently and but needed iproute2 be a
delivery vehicle. It uses none of the common infrastructure from
iproute2. Could we make this more difficult 

Sometime in the next day I will jump through the hoops to get a proper
devlink command.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-03 Thread Jiri Pirko
Wed, Jan 03, 2018 at 07:14:16PM CET, d...@cumulusnetworks.com wrote:
>On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
>> As I stated this is a user-space bug which I fixed, and updated my repo
>> so please pull. Devlink uses mnl,and currently mnl does not support
>> extended ack. I added support for this in my local ver of libmnl:
>> 
>> https://github.com/arkadis/libmnl.git
>> 
>> On branch master, so you can check it out. Besides this bugs, which were
>> userspace, can please specify what are the pending problems from your
>> point of view? Thanks!
>
>devlink is in iproute2 package and it has extack support. See 'git log
>lib/libnetlink.c'

Dave, devlink uses libmnl.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-03 Thread David Ahern
On 1/3/18 11:05 AM, Arkadi Sharshevsky wrote:
> As I stated this is a user-space bug which I fixed, and updated my repo
> so please pull. Devlink uses mnl,and currently mnl does not support
> extended ack. I added support for this in my local ver of libmnl:
> 
> https://github.com/arkadis/libmnl.git
> 
> On branch master, so you can check it out. Besides this bugs, which were
> userspace, can please specify what are the pending problems from your
> point of view? Thanks!

devlink is in iproute2 package and it has extack support. See 'git log
lib/libnetlink.c'

You do not need to modify libmnl to get extack output in devlink. Can
you update the command accordingly? Once it works I will come back to
this topic and take another look.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-03 Thread Arkadi Sharshevsky


On 01/02/2018 08:05 PM, David Ahern wrote:
> On 1/1/18 7:58 AM, Arkadi Sharshevsky wrote:
>>
>> Just to summarize the current fixes required:
>>
>> 1. ERIF dpipe table size is reporting wrong size. More precisely the
>>ERIF table does not take rifs, so it should not be linked to the rif
>>bank resource (is not part of this patchset, future extension).
>> 2. Extended ACK user-space bug.
>> 3. ABI documentation- Not sure we agreed upon it, Jiri?
>>
>> If I missed something please respond. Nothing of the fixes mentioned
>> above is relevant for this patchset actually.
>>
> 
> Can you fix the userspace command and then we come back to what else is
> needed? Right now, it is hard to tell what is a user space bug and what
> is a kernel space bug.
> 
> For example:
> $ devlink resource set pci/:03:00.0 path /kvd/linear size 1
> $ devlink resource show pci/:03:00.0
> pci/:03:00.0:
>   name kvd size 245760 size_valid true
>   resources:
> name linear size 98304 occ 0
> name hash_double size 60416
> name hash_single size 87040
> 
> The set command did not fail, yet there is no size_new arg in the output
> like there is for this change:
> 
> $ devlink resource set pci/:03:00.0 path /kvd/linear size 0
> $ devlink resource show pci/:03:00.0
> pci/:03:00.0:
>   name kvd size 245760 size_valid true
>   resources:
> name linear size 98304 size_new 0 occ 0
> name hash_double size 60416
> name hash_single size 87040
> 

As I stated this is a user-space bug which I fixed, and updated my repo
so please pull. Devlink uses mnl,and currently mnl does not support
extended ack. I added support for this in my local ver of libmnl:

https://github.com/arkadis/libmnl.git

On branch master, so you can check it out. Besides this bugs, which were
userspace, can please specify what are the pending problems from your
point of view? Thanks!












Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-02 Thread David Ahern
On 1/1/18 7:58 AM, Arkadi Sharshevsky wrote:
> 
> Just to summarize the current fixes required:
> 
> 1. ERIF dpipe table size is reporting wrong size. More precisely the
>ERIF table does not take rifs, so it should not be linked to the rif
>bank resource (is not part of this patchset, future extension).
> 2. Extended ACK user-space bug.
> 3. ABI documentation- Not sure we agreed upon it, Jiri?
> 
> If I missed something please respond. Nothing of the fixes mentioned
> above is relevant for this patchset actually.
> 

Can you fix the userspace command and then we come back to what else is
needed? Right now, it is hard to tell what is a user space bug and what
is a kernel space bug.

For example:
$ devlink resource set pci/:03:00.0 path /kvd/linear size 1
$ devlink resource show pci/:03:00.0
pci/:03:00.0:
  name kvd size 245760 size_valid true
  resources:
name linear size 98304 occ 0
name hash_double size 60416
name hash_single size 87040

The set command did not fail, yet there is no size_new arg in the output
like there is for this change:

$ devlink resource set pci/:03:00.0 path /kvd/linear size 0
$ devlink resource show pci/:03:00.0
pci/:03:00.0:
  name kvd size 245760 size_valid true
  resources:
name linear size 98304 size_new 0 occ 0
name hash_double size 60416
name hash_single size 87040



Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-02 Thread Jiri Pirko
Tue, Jan 02, 2018 at 02:41:13PM CET, and...@lunn.ch wrote:
>> Question is where to put it. It is mlxsw-specific thing, moreover,
>> Spectrum-specific thing, same as dpipe tables etc. Not sure. Perhaps
>> Documentation/networking/mlxsw.txt ?
>
>Hi Jiri
>
>Documentation/ABI seems like the correct place. There is nothing in
>the README which says it is limited to files. You could use an name
>like devlink-driver-mlxsw.

Okay. Sounds sane. Thanks.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-02 Thread Andrew Lunn
> Question is where to put it. It is mlxsw-specific thing, moreover,
> Spectrum-specific thing, same as dpipe tables etc. Not sure. Perhaps
> Documentation/networking/mlxsw.txt ?

Hi Jiri

Documentation/ABI seems like the correct place. There is nothing in
the README which says it is limited to files. You could use an name
like devlink-driver-mlxsw.

 Andrew


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-02 Thread Jiri Pirko
Mon, Jan 01, 2018 at 03:58:33PM CET, arka...@mellanox.com wrote:
>
>
>On 12/26/2017 01:23 PM, Jiri Pirko wrote:
>> From: Jiri Pirko 
>> 
>> Many of the ASIC's internal resources are limited and are shared between
>> several hardware procedures. For example, unified hash-based memory can
>> be used for many lookup purposes, like FDB and LPM. In many cases the user
>> can provide a partitioning scheme for such a resource in order to perform
>> fine tuning for his application. In such cases performing driver reload is
>> needed for the changes to take place, thus this patchset also adds support
>> for hot reload.
>> 
>> Such an abstraction can be coupled with devlink's dpipe interface, which
>> models the ASIC's pipeline as a graph of match/action tables. By modeling
>> the hardware resource object, and by coupling it to several dpipe tables,
>> further visibility can be achieved in order to debug ASIC-wide issues.
>> 
>> The proposed interface will provide the user the ability to understand the
>> limitations of the hardware, and receive notification regarding its 
>> occupancy.
>> Furthermore, monitoring the resource occupancy can be done in real-time and
>> can be useful in many cases.
>> ---
>> Userspace part prototype can be found at 
>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Farkadis%2Fiproute2%2F=02%7C01%7Carkadis%40mellanox.com%7C1ae3d8b4854a454e21e008d54c5329e3%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636498842440762657=7MC2BFQFxjnmHqy2sOOL9VEa4ZGq6e5Z2n2WvuNgyFk%3D=0
>> at resource_dev branch.
>> 
>> v1->v2
>> - Add resource size attribute.
>> - Fix split bug.
>> 
>
>Just to summarize the current fixes required:
>
>1. ERIF dpipe table size is reporting wrong size. More precisely the
>   ERIF table does not take rifs, so it should not be linked to the rif
>   bank resource (is not part of this patchset, future extension).
>2. Extended ACK user-space bug.
>3. ABI documentation- Not sure we agreed upon it, Jiri?

Question is where to put it. It is mlxsw-specific thing, moreover,
Spectrum-specific thing, same as dpipe tables etc. Not sure. Perhaps
Documentation/networking/mlxsw.txt ?


>
>If I missed something please respond. Nothing of the fixes mentioned
>above is relevant for this patchset actually.
>
>Couple of key-points:
>
>1. Constrains\trade off about setting the sizes - this can be obtained
>   trivially from the resource tree nested structure.
>2. Dpipe provides the mapping of hardware processes to resources.
>3. Units - each resource specifies his units, if dpipe table's size is
>   X and its related to some resource its size is normalized to that
>   resources basic unit.
>
>IMO this is the most hardware exact interaction, and this is the way it
>should be exported from the kernel, if something is not presented in
>'user' convenient way some utilities can be implemented in userspace
>to easily do it. Furthermore, some examples will be provided for the
>whole kvd tree partition for different cases (IPv6 heavy etc..).
>Advanced user will be able to tweak it as they like.
>
>Regarding the 'switchdev' layer I think that kernel's software tables
>like nexthops/neigh/routes should be mapped to dpipe tables and not
>to resources directly:

Sure. dpipe table -> resource mapping is the only one that makes sense.

>
>kernel_fdb--> dpipe_fdb -->/kvd/hash_single.
>
>> Arkadi Sharshevsky (10):
>>   devlink: Add per devlink instance lock
>>   devlink: Add support for resource abstraction
>>   devlink: Add support for reload
>>   devlink: Add relation between dpipe and resource
>>   mlxsw: pci: Add support for performing bus reset
>>   mlxsw: spectrum: Register KVD resources with devlink
>>   mlxsw: spectrum_dpipe: Connect dpipe tables to resources
>>   mlxsw: spectrum: Add support for getting kvdl occupancy
>>   mlxsw: pci: Add support for getting resource through devlink
>>   mlxsw: core: Add support for reload
>> 
>>  drivers/net/ethernet/mellanox/mlxsw/core.c |  85 ++-
>>  drivers/net/ethernet/mellanox/mlxsw/core.h |  16 +-
>>  drivers/net/ethernet/mellanox/mlxsw/i2c.c  |   5 +-
>>  drivers/net/ethernet/mellanox/mlxsw/pci.c  |  98 ++--
>>  drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 205 
>>  drivers/net/ethernet/mellanox/mlxsw/spectrum.h |  13 +
>>  .../net/ethernet/mellanox/mlxsw/spectrum_dpipe.c   |  72 ++-
>>  .../net/ethernet/mellanox/mlxsw/spectrum_kvdl.c|  26 +
>>  include/net/devlink.h  |  97 
>>  include/uapi/linux/devlink.h   |  21 +
>>  net/core/devlink.c | 573 
>> ++---
>>  11 files changed, 1079 insertions(+), 132 deletions(-)
>> 


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-01 Thread Arkadi Sharshevsky


On 12/26/2017 01:23 PM, Jiri Pirko wrote:
> From: Jiri Pirko 
> 
> Many of the ASIC's internal resources are limited and are shared between
> several hardware procedures. For example, unified hash-based memory can
> be used for many lookup purposes, like FDB and LPM. In many cases the user
> can provide a partitioning scheme for such a resource in order to perform
> fine tuning for his application. In such cases performing driver reload is
> needed for the changes to take place, thus this patchset also adds support
> for hot reload.
> 
> Such an abstraction can be coupled with devlink's dpipe interface, which
> models the ASIC's pipeline as a graph of match/action tables. By modeling
> the hardware resource object, and by coupling it to several dpipe tables,
> further visibility can be achieved in order to debug ASIC-wide issues.
> 
> The proposed interface will provide the user the ability to understand the
> limitations of the hardware, and receive notification regarding its occupancy.
> Furthermore, monitoring the resource occupancy can be done in real-time and
> can be useful in many cases.
> ---
> Userspace part prototype can be found at 
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Farkadis%2Fiproute2%2F=02%7C01%7Carkadis%40mellanox.com%7C1ae3d8b4854a454e21e008d54c5329e3%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636498842440762657=7MC2BFQFxjnmHqy2sOOL9VEa4ZGq6e5Z2n2WvuNgyFk%3D=0
> at resource_dev branch.
> 
> v1->v2
> - Add resource size attribute.
> - Fix split bug.
> 

Just to summarize the current fixes required:

1. ERIF dpipe table size is reporting wrong size. More precisely the
   ERIF table does not take rifs, so it should not be linked to the rif
   bank resource (is not part of this patchset, future extension).
2. Extended ACK user-space bug.
3. ABI documentation- Not sure we agreed upon it, Jiri?

If I missed something please respond. Nothing of the fixes mentioned
above is relevant for this patchset actually.

Couple of key-points:

1. Constrains\trade off about setting the sizes - this can be obtained
   trivially from the resource tree nested structure.
2. Dpipe provides the mapping of hardware processes to resources.
3. Units - each resource specifies his units, if dpipe table's size is
   X and its related to some resource its size is normalized to that
   resources basic unit.

IMO this is the most hardware exact interaction, and this is the way it
should be exported from the kernel, if something is not presented in
'user' convenient way some utilities can be implemented in userspace
to easily do it. Furthermore, some examples will be provided for the
whole kvd tree partition for different cases (IPv6 heavy etc..).
Advanced user will be able to tweak it as they like.

Regarding the 'switchdev' layer I think that kernel's software tables
like nexthops/neigh/routes should be mapped to dpipe tables and not
to resources directly:

kernel_fdb--> dpipe_fdb -->/kvd/hash_single.

> Arkadi Sharshevsky (10):
>   devlink: Add per devlink instance lock
>   devlink: Add support for resource abstraction
>   devlink: Add support for reload
>   devlink: Add relation between dpipe and resource
>   mlxsw: pci: Add support for performing bus reset
>   mlxsw: spectrum: Register KVD resources with devlink
>   mlxsw: spectrum_dpipe: Connect dpipe tables to resources
>   mlxsw: spectrum: Add support for getting kvdl occupancy
>   mlxsw: pci: Add support for getting resource through devlink
>   mlxsw: core: Add support for reload
> 
>  drivers/net/ethernet/mellanox/mlxsw/core.c |  85 ++-
>  drivers/net/ethernet/mellanox/mlxsw/core.h |  16 +-
>  drivers/net/ethernet/mellanox/mlxsw/i2c.c  |   5 +-
>  drivers/net/ethernet/mellanox/mlxsw/pci.c  |  98 ++--
>  drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 205 
>  drivers/net/ethernet/mellanox/mlxsw/spectrum.h |  13 +
>  .../net/ethernet/mellanox/mlxsw/spectrum_dpipe.c   |  72 ++-
>  .../net/ethernet/mellanox/mlxsw/spectrum_kvdl.c|  26 +
>  include/net/devlink.h  |  97 
>  include/uapi/linux/devlink.h   |  21 +
>  net/core/devlink.c | 573 
> ++---
>  11 files changed, 1079 insertions(+), 132 deletions(-)
> 


Re: [patch net-next v2 00/10] Add support for resource abstraction

2018-01-01 Thread Arkadi Sharshevsky


On 12/31/2017 05:46 PM, David Ahern wrote:
> On 12/31/17 3:52 AM, Arkadi Sharshevsky wrote:
>>> [1] This is allowed by the current patch set and perhaps it should not be:
>>>
>>> $ ip ro ls vrf vrf1101
>>> unreachable default metric 8192
>>> 11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
>>> 11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
>>> 11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
>>> 11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
>>> 11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
>>> 11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
>>> 11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
>>> 11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
>>>
>>> $ devlink resource set pci/:03:00.0 path /kvd/linear size 0
>>
>> This line actually did nothing, because size zero is not acceptable
>> see patch 6. This is pure userpsace problem that error is not shown.
> 
> Then perhaps you have a kernel side bug. After the reload I get this:
> 
> $ devlink resource show pci/:03:00.0
> pci/:03:00.0:
>   name kvd size 245760 size_valid true
>   resources:
> name linear size 0 occ 0
> name hash_double size 60416
> name hash_single size 87040
> 

Actually no bug here, the linear can be zero. This implies no nexthop
routes. The adj table uses it as you can see.

> 
>>
>> You can verify it by dumping the resources and see that there is no
>> pending change (only size and not size_new).
>>
>>> $ devlink reload pci/:03:00.0
>>> $ ip ro ls vrf vrf1101
>>> unreachable default metric 8192
>>>
>>
>> So you just performed full reload of the driver which includes
>> unregistration of all the netdevs and full init. KVD update requires
>> full teardown of the driver.
> 
> you are right, I forgot to do a networking reload. Because of the above
> (0 was actually taken) all kinds of errors are spewed on 'ifreload -av'
> and there is no change to the ro ls:
> 
> $  ip ro ls vrf vrf1101
> unreachable default metric 8192
> 
>>
>> The system will not get back to the same state after reloading,
>> It's should be done on init. But it doesn't have to be like this
>> this, each driver provides his own reload devlink op implementation
>> so in our case full blown reset is required.
>>
>>
>>> [2] Same exact result for setting hash_double to 0:
>>> $ ip ro ls vrf vrf1101
>>> unreachable default metric 8192
>>> 11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
>>> 11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
>>> 11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
>>> 11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
>>> 11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
>>> 11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
>>> 11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
>>> 11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
>>>
>>> $ devlink resource set pci/:03:00.0 path /kvd/hash_double size 0
> 
> On this command you are correct, 0 is not taken:
> $ devlink resource set pci/:03:00.0 path /kvd/hash_double size 0
> $ devlink resource show pci/:03:00.0
> pci/:03:00.0:
>   name kvd size 245760 size_valid true
>   resources:
> name linear size 0 occ 0
> name hash_double size 60416
> name hash_single size 87040
> 
> but the 'set' command did not fail with a proper extack based error
> message, so consider this another a bug report.
> 

Yeah, this is a bug, but a userspace one. I actually sniffed the nl
messages with nlmon and saw the extended ack packet with the required
string.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-31 Thread David Ahern
On 12/31/17 3:52 AM, Arkadi Sharshevsky wrote:
>> [1] This is allowed by the current patch set and perhaps it should not be:
>>
>> $ ip ro ls vrf vrf1101
>> unreachable default metric 8192
>> 11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
>> 11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
>> 11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
>> 11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
>> 11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
>> 11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
>> 11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
>> 11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
>>
>> $ devlink resource set pci/:03:00.0 path /kvd/linear size 0
> 
> This line actually did nothing, because size zero is not acceptable
> see patch 6. This is pure userpsace problem that error is not shown.

Then perhaps you have a kernel side bug. After the reload I get this:

$ devlink resource show pci/:03:00.0
pci/:03:00.0:
  name kvd size 245760 size_valid true
  resources:
name linear size 0 occ 0
name hash_double size 60416
name hash_single size 87040


> 
> You can verify it by dumping the resources and see that there is no
> pending change (only size and not size_new).
> 
>> $ devlink reload pci/:03:00.0
>> $ ip ro ls vrf vrf1101
>> unreachable default metric 8192
>>
> 
> So you just performed full reload of the driver which includes
> unregistration of all the netdevs and full init. KVD update requires
> full teardown of the driver.

you are right, I forgot to do a networking reload. Because of the above
(0 was actually taken) all kinds of errors are spewed on 'ifreload -av'
and there is no change to the ro ls:

$  ip ro ls vrf vrf1101
unreachable default metric 8192

> 
> The system will not get back to the same state after reloading,
> It's should be done on init. But it doesn't have to be like this
> this, each driver provides his own reload devlink op implementation
> so in our case full blown reset is required.
> 
> 
>> [2] Same exact result for setting hash_double to 0:
>> $ ip ro ls vrf vrf1101
>> unreachable default metric 8192
>> 11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
>> 11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
>> 11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
>> 11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
>> 11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
>> 11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
>> 11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
>> 11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
>>
>> $ devlink resource set pci/:03:00.0 path /kvd/hash_double size 0

On this command you are correct, 0 is not taken:
$ devlink resource set pci/:03:00.0 path /kvd/hash_double size 0
$ devlink resource show pci/:03:00.0
pci/:03:00.0:
  name kvd size 245760 size_valid true
  resources:
name linear size 0 occ 0
name hash_double size 60416
name hash_single size 87040

but the 'set' command did not fail with a proper extack based error
message, so consider this another a bug report.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-31 Thread Arkadi Sharshevsky


On 12/30/2017 11:15 PM, David Ahern wrote:
> On 12/28/17 1:21 AM, Yuval Mintz wrote:
>> I think it goes the other way around. The dpipe tables are the ones that
>> can be translated to functionality; The resources are internal and 
>> HW-specific
>> representing the possible internal division of resources -
>> but a given resource sn't necessarily mapped to a single networking feature.
>> [It might be in some cases, but not in the general case]
> 
> This is what I am getting at -- a single resource /kvd/linear is used
> for multiple networking features, and those networking features do map
> to well known entities -- fdb entries, ACL entries, ipv4/v6 host
> entries, LPM entries, etc.
> 
> Nothing about the output from devlink helps the user in any way to
> understand how to change the resource values. Saying that these

The current patchset adds the following dpipe table <--> resource
relation

host4 -- hash single
host6 -- hash double
adj -- linear

By dumping the resources via the 'resource show' you can the tree like
structure, you can see that you have a tradeoff between those subparts.
So for example if a user would like to increase the number of nexthops
with the expense of neighbors, it is pretty clear. As more dpipe table
will be introduced this relations will be more complete and the user
will get the complete view of the ASIC.

Just to summarize, the user gets the following info
1. Constrains\trade off about setting the sizes -  this you get
   by the tree structure.
2. Each hardware process which use this resource is mapped to it

By combining those two you can get the most accurate information
about what your change will do. Partitioning of the KVD is very delicate
process, because the hardware is complex. Many hardware processes are
pointing to this memory and size changes effect the whole ASIC, as I
mentioned as more of the pipeline will be exposed via dpipe the user
will get a more precise vision of the hardware.

We will provide some recommended and tested configuration of the whole
mlxsw resource tree for different user scenarios. A more experienced
user can do it for himself, if he got some very special scenario.


> resources, what they mean and how they are used is MLX proprietary and
> is known only to MLX employees and those with MLX agreements is not
> acceptable. Likewise, requiring some network admin to deep dive into the
> mlxsw driver to piece together how kvd/linear (for example) is used is
> not acceptable.
> 
> The cover letter touts "Many of the ASIC's internal resources are
> limited and are shared between several hardware procedures. For example,
> unified hash-based memory can be used for many lookup purposes, like FDB
> and LPM. In many cases the user can provide a partitioning scheme for
> such a resource in order to perform fine tuning for his application."
> 
> Great, now give the user some indication of how to do that. Is setting
> /kvd/linear to 0 acceptable? If not, why? What functionality is lost?
> (Apparently, everything [1].)
>
> The dpipe tables list some correlation between the kvd resources and
> tables but that is not a complete list and again there is nothing to
> tell a user that it is only a partial list of how a kvd resource is

This is work in progress, the LPM block will be exposed as the last
L3 part. Then we will start the l2 part of the ASIC.

> used. For example, it shows ipv4 host is in /kvd/hash_single and that is
> all it shows. So if I have an ipv6 only deployment can I conclude that I
> can set /kvd/hash_single to 0? Or the reverse, can I set hash_double to
> 0 for an ipv4 only deployment? From the limited information given, it is
> reasonable for a user to assume yes and has to learn through trial and
> error what can be done. [2]
> 

So you want to add min/max size attribute? I think this its not needed.

> -
> 
> [1] This is allowed by the current patch set and perhaps it should not be:
> 
> $ ip ro ls vrf vrf1101
> unreachable default metric 8192
> 11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
> 11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
> 11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
> 11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
> 11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
> 11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
> 11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
> 11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
> 
> $ devlink resource set pci/:03:00.0 path /kvd/linear size 0

This line actually did nothing, because size zero is not acceptable
see patch 6. This is pure userpsace problem that error is not shown.

You can verify it by dumping the resources and see that there is no
pending change (only size and not size_new).

> $ devlink reload pci/:03:00.0
> $ ip ro ls vrf vrf1101
> unreachable default 

Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-30 Thread David Ahern
On 12/28/17 1:21 AM, Yuval Mintz wrote:
> I think it goes the other way around. The dpipe tables are the ones that
> can be translated to functionality; The resources are internal and HW-specific
> representing the possible internal division of resources -
> but a given resource sn't necessarily mapped to a single networking feature.
> [It might be in some cases, but not in the general case]

This is what I am getting at -- a single resource /kvd/linear is used
for multiple networking features, and those networking features do map
to well known entities -- fdb entries, ACL entries, ipv4/v6 host
entries, LPM entries, etc.

Nothing about the output from devlink helps the user in any way to
understand how to change the resource values. Saying that these
resources, what they mean and how they are used is MLX proprietary and
is known only to MLX employees and those with MLX agreements is not
acceptable. Likewise, requiring some network admin to deep dive into the
mlxsw driver to piece together how kvd/linear (for example) is used is
not acceptable.

The cover letter touts "Many of the ASIC's internal resources are
limited and are shared between several hardware procedures. For example,
unified hash-based memory can be used for many lookup purposes, like FDB
and LPM. In many cases the user can provide a partitioning scheme for
such a resource in order to perform fine tuning for his application."

Great, now give the user some indication of how to do that. Is setting
/kvd/linear to 0 acceptable? If not, why? What functionality is lost?
(Apparently, everything [1].)

The dpipe tables list some correlation between the kvd resources and
tables but that is not a complete list and again there is nothing to
tell a user that it is only a partial list of how a kvd resource is
used. For example, it shows ipv4 host is in /kvd/hash_single and that is
all it shows. So if I have an ipv6 only deployment can I conclude that I
can set /kvd/hash_single to 0? Or the reverse, can I set hash_double to
0 for an ipv4 only deployment? From the limited information given, it is
reasonable for a user to assume yes and has to learn through trial and
error what can be done. [2]

-

[1] This is allowed by the current patch set and perhaps it should not be:

$ ip ro ls vrf vrf1101
unreachable default metric 8192
11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload

$ devlink resource set pci/:03:00.0 path /kvd/linear size 0
$ devlink reload pci/:03:00.0
$ ip ro ls vrf vrf1101
unreachable default metric 8192

[2] Same exact result for setting hash_double to 0:
$ ip ro ls vrf vrf1101
unreachable default metric 8192
11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload

$ devlink resource set pci/:03:00.0 path /kvd/hash_double size 0
$ devlink reload pci/:03:00.0
$ ip ro ls vrf vrf1101
unreachable default metric 8192


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-30 Thread Andrew Lunn
> In my opinion it should not change. Unless there is a bug (like the one
> DaveA found in mlxsw erif table). Existing tables and resources should
> be only added. It is the driver's maintainer responsibility to not to
> break user scripts.

So we agree with is ABI. Great.

   Andrew


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-30 Thread Jiri Pirko
Sat, Dec 30, 2017 at 11:18:50AM CET, and...@lunn.ch wrote:
>> 1. Both dpipe and devlink resource are abstraction models for
>> hardware entities, and as a result they true to provide generic objects.
>> Each driver/ASIC should register his own and it absolutely proprietary
>> implementation. There is absolutely NO industry standard here, the only
>> thing that resembles a standard is that dpipe looks a bit like P4 only
>> because its proved to be useful for describing packet forwarding
>> pipelines. The host4 table is just a hardware process in the mellanox
>> spectrum ASIC pipeline and it should not be part of ABI, sorry I clearly
>> don't understand how this even came up.
>
>Why should it not be part of the ABI? Are you saying it will change
>from version to version? Are you trying to warning people not to write
>scripts using it, because its meaning will change and what works now
>will break in the next kernel version?

In my opinion it should not change. Unless there is a bug (like the one
DaveA found in mlxsw erif table). Existing tables and resources should
be only added. It is the driver's maintainer responsibility to not to
break user scripts.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-30 Thread Andrew Lunn
> 1. Both dpipe and devlink resource are abstraction models for
> hardware entities, and as a result they true to provide generic objects.
> Each driver/ASIC should register his own and it absolutely proprietary
> implementation. There is absolutely NO industry standard here, the only
> thing that resembles a standard is that dpipe looks a bit like P4 only
> because its proved to be useful for describing packet forwarding
> pipelines. The host4 table is just a hardware process in the mellanox
> spectrum ASIC pipeline and it should not be part of ABI, sorry I clearly
> don't understand how this even came up.

Why should it not be part of the ABI? Are you saying it will change
from version to version? Are you trying to warning people not to write
scripts using it, because its meaning will change and what works now
will break in the next kernel version?

 Andrew


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-29 Thread Arkadi Sharshevsky


On 12/28/2017 06:33 PM, David Ahern wrote:
> On 12/28/17 10:23 AM, Jiri Pirko wrote:
>>> So there are 4 tables exported to userspace:
>>> 
>>> 1. mlxsw_erif table which is not in any of the kvd regions (no
>>> resource path is given) and it has a size of 1000. Does
>>> mlxsw_erif mean a rif as in Router Interfaces? So the switch
>>> supports up to 1000 router interfaces.
>>> 
>>> 2. mlxsw_host4 in /kvd/hash_single with a size of 62. Based on
>>> the
>> Size tells you the actual size. It cannot give you max size. The
>> reason is simple. The resources are shared among multiple tables.
>> That is exactly what this resource patch makes visible.
>> 
>> 
> 
> In the erif table, the 1000 is the max not current usage. I do not
> have 1000 interfaces:
> 
> $ ip -br li sh | wc -l 597
> 
> 
> $ devlink dpipe table dump pci/:03:00.0 name mlxsw_erif ... index
> 503 match_value: type field_exact header mlxsw_meta field erif_port
> mapping ifindex mapping_value 601 value 503 action_value: type
> field_modify header mlxsw_meta field l3_forward value 1
> 
> 
> The host4 table it is current size with no maximum.
> 
> The meaning of table size needs to be consistent across tables.
> 

You are right the egress RIF table size is not correct, I will
definitely fix it, but it is not what you think it should be. So in
order to clarify this point, just a reminder:

1. Both dpipe and devlink resource are abstraction models for
hardware entities, and as a result they true to provide generic objects.
Each driver/ASIC should register his own and it absolutely proprietary
implementation. There is absolutely NO industry standard here, the only
thing that resembles a standard is that dpipe looks a bit like P4 only
because its proved to be useful for describing packet forwarding
pipelines. The host4 table is just a hardware process in the mellanox
spectrum ASIC pipeline and it should not be part of ABI, sorry I clearly
don't understand how this even came up.

2. Dpipe table is a single hardware process, most of the time it uses
some resources (for example LPM algorithm uses hash memory).

3. ERIF table is a table that is located in the end of the L3 pipeline.
The current dpipe description is not complete and that why it caused
confusion. The table performs match on rif index and packet type
(UC/MC/BC) and performs forward/drop decision. As you can see, for each
rif the table can have several entries, which provide different
statistics for different traffic types per rif, currently only the UC
is exposed with forward.

4. ASICs use shared resource for many processes, this is exactly the
behavior we want to expose!

Again, the size of the ERIF table should NOT provide the number of
rifs which are in use, simply because dpipe tables do not describe
hardware resources.

In the future the RIF bank will be exported as resource object with size
of 1000, and in order to observe how much are in use you should check
its occupancy. This is the whole reason of this interface.



Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-28 Thread Jiri Pirko
Thu, Dec 28, 2017 at 05:33:58PM CET, d...@cumulusnetworks.com wrote:
>On 12/28/17 10:23 AM, Jiri Pirko wrote:
>>> So there are 4 tables exported to userspace:
>>>
>>> 1. mlxsw_erif table which is not in any of the kvd regions (no resource
>>> path is given) and it has a size of 1000. Does mlxsw_erif mean a rif as
>>> in Router Interfaces? So the switch supports up to 1000 router interfaces.
>>>
>>> 2. mlxsw_host4 in /kvd/hash_single with a size of 62. Based on the
>> Size tells you the actual size. It cannot give you max size. The reason
>> is simple. The resources are shared among multiple tables. That is
>> exactly what this resource patch makes visible.
>> 
>> 
>
>In the erif table, the 1000 is the max not current usage. I do not have

I believe that is a bug in erif dpipe implementation
(mlxsw_sp_dpipe_table_erif_size_get) We'll fix that. Thanks!



>1000 interfaces:
>
>$ ip -br li sh | wc -l
>597
>
>
>$ devlink dpipe table dump pci/:03:00.0 name mlxsw_erif
>...
>  index 503
>  match_value:
>type field_exact header mlxsw_meta field erif_port mapping ifindex
>mapping_value 601 value 503
>  action_value:
>type field_modify header mlxsw_meta field l3_forward value 1
>
>
>The host4 table it is current size with no maximum.
>
>The meaning of table size needs to be consistent across tables.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-28 Thread David Ahern
On 12/28/17 10:23 AM, Jiri Pirko wrote:
>> So there are 4 tables exported to userspace:
>>
>> 1. mlxsw_erif table which is not in any of the kvd regions (no resource
>> path is given) and it has a size of 1000. Does mlxsw_erif mean a rif as
>> in Router Interfaces? So the switch supports up to 1000 router interfaces.
>>
>> 2. mlxsw_host4 in /kvd/hash_single with a size of 62. Based on the
> Size tells you the actual size. It cannot give you max size. The reason
> is simple. The resources are shared among multiple tables. That is
> exactly what this resource patch makes visible.
> 
> 

In the erif table, the 1000 is the max not current usage. I do not have
1000 interfaces:

$ ip -br li sh | wc -l
597


$ devlink dpipe table dump pci/:03:00.0 name mlxsw_erif
...
  index 503
  match_value:
type field_exact header mlxsw_meta field erif_port mapping ifindex
mapping_value 601 value 503
  action_value:
type field_modify header mlxsw_meta field l3_forward value 1


The host4 table it is current size with no maximum.

The meaning of table size needs to be consistent across tables.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-28 Thread Jiri Pirko
Thu, Dec 28, 2017 at 05:09:09PM CET, d...@cumulusnetworks.com wrote:
>On 12/28/17 2:25 AM, Yuval Mintz wrote:
> Again, I have no objections to kvd, linear, hash, etc terms as they do
> relate to Mellanox products. But kvd/linear, for example, does correlate
> to industry standard concepts in some way. My request is that the
> resource listing guide the user in some way, stating what these
> resources mean.

 So the showed relation to dpipe table would be enougn or you would still
 like to see some description? I don't like the description concept here
 as the relations to dpipe table should tell user exactly what he needs
 to know.
>>>
>>> I believe it is useful to have a 1-line, short description that gives
>>> the user some memory jogger as to what the resource is used for. It does
>>> not have to be an exhaustive list, but the user should not have to do
>>> mental jumping jacks running a bunch of commands to understand the
>>> resources for vendor specific asics.
>> 
>> Perhaps we can simply have devlink utility output the dpipe
>> table[s] associated with the resource when showing the resource?
>> It would contain live information as well as prevent the need for
>> 'mental jumping jacks'.
>> 
>
>My primary contention for this static partitioning is that the proposal
>is not giving the user the information they need to make decisions.
>
>As I mentioned earlier, the resource show command gives this:
>$ devlink resource show pci/:03:00.0
>pci/:03:00.0:
>  name kvd size 245760 size_valid true
>  resources:
>name linear size 98304 occ 0
>name hash_double size 60416
>name hash_single size 87040
>
>the paths /kvd/linear, /kvd/hash_single and /kvd/hash_double are
>essentially random names (nothing related to industry standard names)

Of course. There is no industry standard for internal ASIC
implementations. This is the same as for dpipe. There is no industry
standard for ASIC pipeline. dpipe visualizes it. This resource patch
visualizes the internal ASIC resources and their mapping to the
individual dpipe tables.


>and the listed sizes are random numbers (no units)[1]. There is nothing
>there to tell a user what they can adjust or why they would want to make
>an adjustment.
>
>
>Looking at 'dpipe table show':
>
>$ devlink dpipe table show pci/:03:00.0
>pci/:03:00.0:
>  name mlxsw_erif size 1000 counters_enabled false
>  match:
>type field_exact header mlxsw_meta field erif_port mapping ifindex
>  action:
>type field_modify header mlxsw_meta field l3_forward
>type field_modify header mlxsw_meta field l3_drop
>
>  resource_path /kvd/hash_single name mlxsw_host4 size 62
>counters_enabled false
>  match:
>type field_exact header mlxsw_meta field erif_port mapping ifindex
>type field_exact header ipv4 field destination ip
>  action:
>type field_modify header ethernet field destination mac
>
>  resource_path /kvd/hash_double name mlxsw_host6 size 0
>counters_enabled false
>  match:
>type field_exact header mlxsw_meta field erif_port mapping ifindex
>type field_exact header ipv6 field destination ip
>  action:
>type field_modify header ethernet field destination mac
>
>  resource_path /kvd/linear name mlxsw_adj size 0 counters_enabled false
>  match:
>type field_exact header mlxsw_meta field adj_index
>type field_exact header mlxsw_meta field adj_size
>type field_exact header mlxsw_meta field adj_hash_index
>  action:
>type field_modify header ethernet field destination mac
>type field_modify header mlxsw_meta field erif_port mapping ifindex
>
>
>So there are 4 tables exported to userspace:
>
>1. mlxsw_erif table which is not in any of the kvd regions (no resource
>path is given) and it has a size of 1000. Does mlxsw_erif mean a rif as
>in Router Interfaces? So the switch supports up to 1000 router interfaces.
>
>2. mlxsw_host4 in /kvd/hash_single with a size of 62. Based on the

Size tells you the actual size. It cannot give you max size. The reason
is simple. The resources are shared among multiple tables. That is
exactly what this resource patch makes visible.


>fields mlxsw_host4 means IPv4 host entries (neighbor entries). Why is
>the size set at 62? Seems really low.
>
>3. mlxsw_host6 in /kvd/hash_double with a size of 0. Based on the fields
>mlxsw_host6 means IPv6 host entries (neighbor entries). The size of 0 is
>concerning. I guess the switch is not configured to do IPv6?
>
>4. mlxsw_adj in /kvd/linear with a size of 0. Based on the fields I am
>going to guess it is an fdb entry


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-28 Thread David Ahern
On 12/28/17 2:25 AM, Yuval Mintz wrote:
 Again, I have no objections to kvd, linear, hash, etc terms as they do
 relate to Mellanox products. But kvd/linear, for example, does correlate
 to industry standard concepts in some way. My request is that the
 resource listing guide the user in some way, stating what these
 resources mean.
>>>
>>> So the showed relation to dpipe table would be enougn or you would still
>>> like to see some description? I don't like the description concept here
>>> as the relations to dpipe table should tell user exactly what he needs
>>> to know.
>>
>> I believe it is useful to have a 1-line, short description that gives
>> the user some memory jogger as to what the resource is used for. It does
>> not have to be an exhaustive list, but the user should not have to do
>> mental jumping jacks running a bunch of commands to understand the
>> resources for vendor specific asics.
> 
> Perhaps we can simply have devlink utility output the dpipe
> table[s] associated with the resource when showing the resource?
> It would contain live information as well as prevent the need for
> 'mental jumping jacks'.
> 

My primary contention for this static partitioning is that the proposal
is not giving the user the information they need to make decisions.

As I mentioned earlier, the resource show command gives this:
$ devlink resource show pci/:03:00.0
pci/:03:00.0:
  name kvd size 245760 size_valid true
  resources:
name linear size 98304 occ 0
name hash_double size 60416
name hash_single size 87040

the paths /kvd/linear, /kvd/hash_single and /kvd/hash_double are
essentially random names (nothing related to industry standard names)
and the listed sizes are random numbers (no units)[1]. There is nothing
there to tell a user what they can adjust or why they would want to make
an adjustment.


Looking at 'dpipe table show':

$ devlink dpipe table show pci/:03:00.0
pci/:03:00.0:
  name mlxsw_erif size 1000 counters_enabled false
  match:
type field_exact header mlxsw_meta field erif_port mapping ifindex
  action:
type field_modify header mlxsw_meta field l3_forward
type field_modify header mlxsw_meta field l3_drop

  resource_path /kvd/hash_single name mlxsw_host4 size 62
counters_enabled false
  match:
type field_exact header mlxsw_meta field erif_port mapping ifindex
type field_exact header ipv4 field destination ip
  action:
type field_modify header ethernet field destination mac

  resource_path /kvd/hash_double name mlxsw_host6 size 0
counters_enabled false
  match:
type field_exact header mlxsw_meta field erif_port mapping ifindex
type field_exact header ipv6 field destination ip
  action:
type field_modify header ethernet field destination mac

  resource_path /kvd/linear name mlxsw_adj size 0 counters_enabled false
  match:
type field_exact header mlxsw_meta field adj_index
type field_exact header mlxsw_meta field adj_size
type field_exact header mlxsw_meta field adj_hash_index
  action:
type field_modify header ethernet field destination mac
type field_modify header mlxsw_meta field erif_port mapping ifindex


So there are 4 tables exported to userspace:

1. mlxsw_erif table which is not in any of the kvd regions (no resource
path is given) and it has a size of 1000. Does mlxsw_erif mean a rif as
in Router Interfaces? So the switch supports up to 1000 router interfaces.

2. mlxsw_host4 in /kvd/hash_single with a size of 62. Based on the
fields mlxsw_host4 means IPv4 host entries (neighbor entries). Why is
the size set at 62? Seems really low.

3. mlxsw_host6 in /kvd/hash_double with a size of 0. Based on the fields
mlxsw_host6 means IPv6 host entries (neighbor entries). The size of 0 is
concerning. I guess the switch is not configured to do IPv6?

4. mlxsw_adj in /kvd/linear with a size of 0. Based on the fields I am
going to guess it is an fdb entry


So if I combine the output of "resource show" and "dpipe table show", I
can conclude:

1. /kvd/linear is only used for fdb entries. So if I want an L3 only use
case I can set /kvd/linear 0. Is that correct? I believe the answer is
no, but based on the information given it seems that way.

2. /kvd/hash_double has a size of 60416, but it is only used for
mlxsw_host6 table which has a size of 0. Now, I am confused.

3. /kvd/hash_single has a size of 87040, but it is only used for
mlxsw_host4 table which has a size of 62. Again confused.


What is a user to make of this output? How is it useful for making
decisions on whether to increase or decrease some resource?



[1] In a response, Jiri mentioned units are added by this patch set but
all I see in the patches is this:

@@ -245,4 +256,8 @@ enum devlink_dpipe_header_id {
DEVLINK_DPIPE_HEADER_IPV6,
 };

+enum devlink_resource_unit {
+   DEVLINK_RESOURCE_UNIT_DOUBLE_WORD,
+};
+
 #endif /* _UAPI_LINUX_DEVLINK_H_ */

DEVLINK_RESOURCE_UNIT_DOUBLE_WORD means what???


RE: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-28 Thread Yuval Mintz
> >>> Many of the ASIC's internal resources are limited and are shared
> between
> >>> several hardware procedures. For example, unified hash-based memory
> can
> >>> be used for many lookup purposes, like FDB and LPM. In many cases the
> user
> >>> can provide a partitioning scheme for such a resource in order to
> perform
> >>> fine tuning for his application. In such cases performing driver reload is
> >>> needed for the changes to take place, thus this patchset also adds support
> >>> for hot reload.
> >>>
> >>> Such an abstraction can be coupled with devlink's dpipe interface, which
> >>> models the ASIC's pipeline as a graph of match/action tables. By
> modeling
> >>> the hardware resource object, and by coupling it to several dpipe tables,
> >>> further visibility can be achieved in order to debug ASIC-wide issues.
> >>>
> >>> The proposed interface will provide the user the ability to understand the
> >>> limitations of the hardware, and receive notification regarding its
> occupancy.
> >>> Furthermore, monitoring the resource occupancy can be done in real-
> time and
> >>> can be useful in many cases.
> >>
> >> In the last RFC (not v1, but RFC) I asked for some kind of description
> >> for each resource, and you and Arkadi have pushed back. Let's walk
> >> through an example to see what I mean:
> >>
> >> $ devlink resource show pci/:03:00.0
> >> pci/:03:00.0:
> >>  name kvd size 245760 size_valid true
> >>  resources:
> >>name linear size 98304 occ 0
> >>name hash_double size 60416
> >>name hash_single size 87040
> >>
> >> So this 2700 has 3 resources that can be managed -- some table or
> >> resource or something named 'kvd' with linear, hash_double and
> >> hash_single sub-resources. What are these names referring too? The
> above
> >> output gives no description, and 'kvd' is not an industry term. Further,
> >
> > This are internal resources specific to the ASIC. Would you like some
> > description to each or something like that?
> 
> devlink has some nice self-documenting capabilities. What's missing here
> is a description of what the resource is used for in standard terms --
> ipv4 host routes, fdb, nexthops, rifs, etc. Even if the description is a
> short list versus an exhaustive list of everything it is used for. e.g.,
> Why would a user decrease linear and increase hash_single or vice versa?
> 
> >
> >
> >> what are these sizes that a user can control? The output contains no
> >> units, no description, nothing. In short, the above output provides
> >> random numbers associated with random names.
> >
> > Units are now exposed from kernel, just this version of iproute2 patch
> > does not display it.
> 
> please provide an iproute2 patch that does so the full context if this
> patch set can be reviewed from a user perspective.
> 
> >
> >
> >>
> >> I can see dpipe tables exported by this device:
> >>
> >> $ devlink dpipe header show pci/:03:00.0
> >>
> >> pci/:03:00.0:
> >>  name mlxsw_meta
> >>  field:
> >>name erif_port bitwidth 32 mapping_type ifindex
> >>name l3_forward bitwidth 1
> >>name l3_drop bitwidth 1
> >>name adj_index bitwidth 32
> >>name adj_size bitwidth 32
> >>name adj_hash_index bitwidth 32
> >>
> >>  name ipv6
> >>  field:
> >>name destination ip bitwidth 128
> >>
> >>  name ipv4
> >>  field:
> >>name destination ip bitwidth 32
> >>
> >>  name ethernet
> >>  field:
> >>name destination mac bitwidth 48
> >>
> >> but none mention 'kvd' or 'linear' or 'hash" and none of the other
> >> various devlink options:
> >>
> >> $ devlink
> >> Usage: devlink [ OPTIONS ] OBJECT { COMMAND | help }
> >> where  OBJECT := { dev | port | sb | monitor | dpipe }
> >>
> >> seem to related to resources.
> >>
> >> So how does a user know what they are controlling by this 'resource'
> >> option? Is the user expected to have a PRM or user guide on hand for the
> >> specific device model that is being configured?
> >
> > The relation of specific dpipe table to specific resource is exposed by
> > the kernel as well. Probably the iproute2 patch just does not display
> > it.
> 
> please provide an iproute2 patch that does so the full context if this
> patch set can be reviewed from a user perspective.
> 
> >
> >
> >>
> >> Again, I have no objections to kvd, linear, hash, etc terms as they do
> >> relate to Mellanox products. But kvd/linear, for example, does correlate
> >> to industry standard concepts in some way. My request is that the
> >> resource listing guide the user in some way, stating what these
> >> resources mean.
> >
> > So the showed relation to dpipe table would be enougn or you would still
> > like to see some description? I don't like the description concept here
> > as the relations to dpipe table should tell user exactly what he needs
> > to know.
> 
> I believe it is useful to have a 1-line, short description that gives
> the user some memory jogger as to what the resource is used for. It does
> not have to be an 

RE: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-28 Thread Yuval Mintz
>  Many of the ASIC's internal resources are limited and are shared
> between
>  several hardware procedures. For example, unified hash-based memory
> can
>  be used for many lookup purposes, like FDB and LPM. In many cases the
> user
>  can provide a partitioning scheme for such a resource in order to
> perform
>  fine tuning for his application. In such cases performing driver reload 
>  is
>  needed for the changes to take place, thus this patchset also adds
> support
>  for hot reload.
> 
>  Such an abstraction can be coupled with devlink's dpipe interface, which
>  models the ASIC's pipeline as a graph of match/action tables. By
> modeling
>  the hardware resource object, and by coupling it to several dpipe tables,
>  further visibility can be achieved in order to debug ASIC-wide issues.
> 
>  The proposed interface will provide the user the ability to understand
> the
>  limitations of the hardware, and receive notification regarding its
> occupancy.
>  Furthermore, monitoring the resource occupancy can be done in real-
> time and
>  can be useful in many cases.
> >>>
> >>> In the last RFC (not v1, but RFC) I asked for some kind of description
> >>> for each resource, and you and Arkadi have pushed back. Let's walk
> >>> through an example to see what I mean:
> >>>
> >>> $ devlink resource show pci/:03:00.0
> >>> pci/:03:00.0:
> >>>  name kvd size 245760 size_valid true
> >>>  resources:
> >>>name linear size 98304 occ 0
> >>>name hash_double size 60416
> >>>name hash_single size 87040
> >>>
> >>> So this 2700 has 3 resources that can be managed -- some table or
> >>> resource or something named 'kvd' with linear, hash_double and
> >>> hash_single sub-resources. What are these names referring too? The
> above
> >>> output gives no description, and 'kvd' is not an industry term. Further,
> >>
> >> This are internal resources specific to the ASIC. Would you like some
> >> description to each or something like that?
> >
> > devlink has some nice self-documenting capabilities. What's missing here
> > is a description of what the resource is used for in standard terms --
> > ipv4 host routes, fdb, nexthops, rifs, etc. Even if the description is a
> > short list versus an exhaustive list of everything it is used for. e.g.,
> > Why would a user decrease linear and increase hash_single or vice versa?
> 
> 
> Arkadi, on what david says above, can the resource names and ids not
> be driver specific, but moved up to the switchdev layer and just map
> to fdb, host routes, nexthops table sizes etc ?.
> Can these generic networking resources then in-turn be mapped to kvd
> sizes by the driver ?

I think it goes the other way around. The dpipe tables are the ones that
can be translated to functionality; The resources are internal and HW-specific
representing the possible internal division of resources -
but a given resource sn't necessarily mapped to a single networking feature.
[It might be in some cases, but not in the general case]

You could always move to a structured approach where each resource
in the hierarchy is further split to sub-resources until each leaf represents
a single network concepts - but that would stop be an abstraction of the
HW resources and become a SW implementation instead, as SW would
have to be the one to maintain and enforce the resource distribution.
And that's not what we're trying to achieve here.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread David Ahern
On 12/27/17 1:31 PM, Andrew Lunn wrote:
>> Hmm. That documents mainly sysfs. No mention of Netlink at all. But
>> maybe I missed it. Also, that defines the interface as is. However we
>> are talking about the data exchanged over the interface, not the
>> interface itself. I don't see how ASIC/HW specific thing, like for
>> example KVD in our case could be part of kernel ABI.
> 
> You need to be very careful here. As soon as somebody starts using it,
> it might become an ABI. Or you need to clearly document it is not ABI,
> there is no guarantee it will not disappear or change its meaning in
> the next kernel, and it should be used with extreme caution.
> 

+1

Once the names go in, people can write scripts that invoke devlink at
boot to partition resources. With the proposed patch set, the name
(e.g., kvd/linear) becomes part of the ABI.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Arkadi Sharshevsky


On 12/27/2017 06:34 PM, David Ahern wrote:
> On 12/27/17 2:09 AM, Jiri Pirko wrote:
>> Wed, Dec 27, 2017 at 05:05:09AM CET, d...@cumulusnetworks.com wrote:
>>> On 12/26/17 5:23 AM, Jiri Pirko wrote:
 From: Jiri Pirko 

 Many of the ASIC's internal resources are limited and are shared between
 several hardware procedures. For example, unified hash-based memory can
 be used for many lookup purposes, like FDB and LPM. In many cases the user
 can provide a partitioning scheme for such a resource in order to perform
 fine tuning for his application. In such cases performing driver reload is
 needed for the changes to take place, thus this patchset also adds support
 for hot reload.

 Such an abstraction can be coupled with devlink's dpipe interface, which
 models the ASIC's pipeline as a graph of match/action tables. By modeling
 the hardware resource object, and by coupling it to several dpipe tables,
 further visibility can be achieved in order to debug ASIC-wide issues.

 The proposed interface will provide the user the ability to understand the
 limitations of the hardware, and receive notification regarding its 
 occupancy.
 Furthermore, monitoring the resource occupancy can be done in real-time and
 can be useful in many cases.
>>>
>>> In the last RFC (not v1, but RFC) I asked for some kind of description
>>> for each resource, and you and Arkadi have pushed back. Let's walk
>>> through an example to see what I mean:
>>>
>>> $ devlink resource show pci/:03:00.0
>>> pci/:03:00.0:
>>>  name kvd size 245760 size_valid true
>>>  resources:
>>>name linear size 98304 occ 0
>>>name hash_double size 60416
>>>name hash_single size 87040
>>>
>>> So this 2700 has 3 resources that can be managed -- some table or
>>> resource or something named 'kvd' with linear, hash_double and
>>> hash_single sub-resources. What are these names referring too? The above
>>> output gives no description, and 'kvd' is not an industry term. Further,
>>
>> This are internal resources specific to the ASIC. Would you like some
>> description to each or something like that?
> 
> devlink has some nice self-documenting capabilities. What's missing here
> is a description of what the resource is used for in standard terms --
> ipv4 host routes, fdb, nexthops, rifs, etc. Even if the description is a
> short list versus an exhaustive list of everything it is used for. e.g.,
> Why would a user decrease linear and increase hash_single or vice versa?
> 
>>
>>
>>> what are these sizes that a user can control? The output contains no
>>> units, no description, nothing. In short, the above output provides
>>> random numbers associated with random names.
>>
>> Units are now exposed from kernel, just this version of iproute2 patch
>> does not display it.
> 
> please provide an iproute2 patch that does so the full context if this
> patch set can be reviewed from a user perspective.
> 
>>
>>
>>>
>>> I can see dpipe tables exported by this device:
>>>
>>> $ devlink dpipe header show pci/:03:00.0
>>>
>>> pci/:03:00.0:
>>>  name mlxsw_meta
>>>  field:
>>>name erif_port bitwidth 32 mapping_type ifindex
>>>name l3_forward bitwidth 1
>>>name l3_drop bitwidth 1
>>>name adj_index bitwidth 32
>>>name adj_size bitwidth 32
>>>name adj_hash_index bitwidth 32
>>>
>>>  name ipv6
>>>  field:
>>>name destination ip bitwidth 128
>>>
>>>  name ipv4
>>>  field:
>>>name destination ip bitwidth 32
>>>
>>>  name ethernet
>>>  field:
>>>name destination mac bitwidth 48
>>>
>>> but none mention 'kvd' or 'linear' or 'hash" and none of the other
>>> various devlink options:
>>>
>>> $ devlink
>>> Usage: devlink [ OPTIONS ] OBJECT { COMMAND | help }
>>> where  OBJECT := { dev | port | sb | monitor | dpipe }
>>>
>>> seem to related to resources.
>>>
>>> So how does a user know what they are controlling by this 'resource'
>>> option? Is the user expected to have a PRM or user guide on hand for the
>>> specific device model that is being configured?
>>
>> The relation of specific dpipe table to specific resource is exposed by
>> the kernel as well. Probably the iproute2 patch just does not display
>> it.
> 
> please provide an iproute2 patch that does so the full context if this
> patch set can be reviewed from a user perspective.
>

As Yuval stated you are using the wrong command here.
You are printing the headers not the tables. On each dpipe
table you can see the resource  it is using (the resource
path aka host table uses /kvd/hash_single for example).

This is already working. Just try it.

>>
>>
>>>
>>> Again, I have no objections to kvd, linear, hash, etc terms as they do
>>> relate to Mellanox products. But kvd/linear, for example, does correlate
>>> to industry standard concepts in some way. My request is that the
>>> resource listing guide the user in some way, stating what these
>>> resources mean.
>>
>> So the 

Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Roopa Prabhu
On Wed, Dec 27, 2017 at 8:34 AM, David Ahern  wrote:
> On 12/27/17 2:09 AM, Jiri Pirko wrote:
>> Wed, Dec 27, 2017 at 05:05:09AM CET, d...@cumulusnetworks.com wrote:
>>> On 12/26/17 5:23 AM, Jiri Pirko wrote:
 From: Jiri Pirko 

 Many of the ASIC's internal resources are limited and are shared between
 several hardware procedures. For example, unified hash-based memory can
 be used for many lookup purposes, like FDB and LPM. In many cases the user
 can provide a partitioning scheme for such a resource in order to perform
 fine tuning for his application. In such cases performing driver reload is
 needed for the changes to take place, thus this patchset also adds support
 for hot reload.

 Such an abstraction can be coupled with devlink's dpipe interface, which
 models the ASIC's pipeline as a graph of match/action tables. By modeling
 the hardware resource object, and by coupling it to several dpipe tables,
 further visibility can be achieved in order to debug ASIC-wide issues.

 The proposed interface will provide the user the ability to understand the
 limitations of the hardware, and receive notification regarding its 
 occupancy.
 Furthermore, monitoring the resource occupancy can be done in real-time and
 can be useful in many cases.
>>>
>>> In the last RFC (not v1, but RFC) I asked for some kind of description
>>> for each resource, and you and Arkadi have pushed back. Let's walk
>>> through an example to see what I mean:
>>>
>>> $ devlink resource show pci/:03:00.0
>>> pci/:03:00.0:
>>>  name kvd size 245760 size_valid true
>>>  resources:
>>>name linear size 98304 occ 0
>>>name hash_double size 60416
>>>name hash_single size 87040
>>>
>>> So this 2700 has 3 resources that can be managed -- some table or
>>> resource or something named 'kvd' with linear, hash_double and
>>> hash_single sub-resources. What are these names referring too? The above
>>> output gives no description, and 'kvd' is not an industry term. Further,
>>
>> This are internal resources specific to the ASIC. Would you like some
>> description to each or something like that?
>
> devlink has some nice self-documenting capabilities. What's missing here
> is a description of what the resource is used for in standard terms --
> ipv4 host routes, fdb, nexthops, rifs, etc. Even if the description is a
> short list versus an exhaustive list of everything it is used for. e.g.,
> Why would a user decrease linear and increase hash_single or vice versa?


Arkadi, on what david says above, can the resource names and ids not
be driver specific, but moved up to the switchdev layer and just map
to fdb, host routes, nexthops table sizes etc ?.
Can these generic networking resources then in-turn be mapped to kvd
sizes by the driver ?


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Andrew Lunn
> Hmm. That documents mainly sysfs. No mention of Netlink at all. But
> maybe I missed it. Also, that defines the interface as is. However we
> are talking about the data exchanged over the interface, not the
> interface itself. I don't see how ASIC/HW specific thing, like for
> example KVD in our case could be part of kernel ABI.

You need to be very careful here. As soon as somebody starts using it,
it might become an ABI. Or you need to clearly document it is not ABI,
there is no guarantee it will not disappear or change its meaning in
the next kernel, and it should be used with extreme caution.

Personally, if DSA drivers were to expose such settings, i would
consider them ABI, which people can rely on to remain stable.

Andrew


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread David Ahern
On 12/27/17 7:15 AM, Jiri Pirko wrote:
> Wed, Dec 27, 2017 at 02:08:03PM CET, and...@lunn.ch wrote:
>>> This is misunderstanding I believe. This is not about ABI. That is well
>>> defined by the netlink attributes. This is about meaning of particular
>>> ASIC-specific internal resources.
>>
>> I would agree that the netlink attributed are clearly defined. But the
>> meta information, what this ASIC specific internal resource means when
>> you combine these attributes, is unclear. This meta information is
>> also part of the ABI, and documenting giving users a hit what it
>> means, and why they should change it, would be good practice.
>>
>> Look at sysfs. open/read/write are clearly defined, which is the
>> equivalent of the netlink attributes. The meta information we document
>> in Documentation/ABI/, what a file name means, what a value means,
>> what other values it can take, etc.
> 
> Hmm. That documents mainly sysfs. No mention of Netlink at all. But
> maybe I missed it. Also, that defines the interface as is. However we
> are talking about the data exchanged over the interface, not the
> interface itself. I don't see how ASIC/HW specific thing, like for
> example KVD in our case could be part of kernel ABI. That makes 0 sense
> to me, sorry.
> 

IMO, kernel documentation is not much better than vendor docs. A general
networking admin may not have access to a kernel tree or have vendor
docs at their finger tips. As I mentioned in the previous response,
devlink attributes have self-documenting capabilities making that
information available inline. That to me is the right place for a short,
but reasonably sized description. For in-depth details users can
cross-reference vendor docs.



Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread David Ahern
On 12/27/17 2:09 AM, Jiri Pirko wrote:
> Wed, Dec 27, 2017 at 05:05:09AM CET, d...@cumulusnetworks.com wrote:
>> On 12/26/17 5:23 AM, Jiri Pirko wrote:
>>> From: Jiri Pirko 
>>>
>>> Many of the ASIC's internal resources are limited and are shared between
>>> several hardware procedures. For example, unified hash-based memory can
>>> be used for many lookup purposes, like FDB and LPM. In many cases the user
>>> can provide a partitioning scheme for such a resource in order to perform
>>> fine tuning for his application. In such cases performing driver reload is
>>> needed for the changes to take place, thus this patchset also adds support
>>> for hot reload.
>>>
>>> Such an abstraction can be coupled with devlink's dpipe interface, which
>>> models the ASIC's pipeline as a graph of match/action tables. By modeling
>>> the hardware resource object, and by coupling it to several dpipe tables,
>>> further visibility can be achieved in order to debug ASIC-wide issues.
>>>
>>> The proposed interface will provide the user the ability to understand the
>>> limitations of the hardware, and receive notification regarding its 
>>> occupancy.
>>> Furthermore, monitoring the resource occupancy can be done in real-time and
>>> can be useful in many cases.
>>
>> In the last RFC (not v1, but RFC) I asked for some kind of description
>> for each resource, and you and Arkadi have pushed back. Let's walk
>> through an example to see what I mean:
>>
>> $ devlink resource show pci/:03:00.0
>> pci/:03:00.0:
>>  name kvd size 245760 size_valid true
>>  resources:
>>name linear size 98304 occ 0
>>name hash_double size 60416
>>name hash_single size 87040
>>
>> So this 2700 has 3 resources that can be managed -- some table or
>> resource or something named 'kvd' with linear, hash_double and
>> hash_single sub-resources. What are these names referring too? The above
>> output gives no description, and 'kvd' is not an industry term. Further,
> 
> This are internal resources specific to the ASIC. Would you like some
> description to each or something like that?

devlink has some nice self-documenting capabilities. What's missing here
is a description of what the resource is used for in standard terms --
ipv4 host routes, fdb, nexthops, rifs, etc. Even if the description is a
short list versus an exhaustive list of everything it is used for. e.g.,
Why would a user decrease linear and increase hash_single or vice versa?

> 
> 
>> what are these sizes that a user can control? The output contains no
>> units, no description, nothing. In short, the above output provides
>> random numbers associated with random names.
> 
> Units are now exposed from kernel, just this version of iproute2 patch
> does not display it.

please provide an iproute2 patch that does so the full context if this
patch set can be reviewed from a user perspective.

> 
> 
>>
>> I can see dpipe tables exported by this device:
>>
>> $ devlink dpipe header show pci/:03:00.0
>>
>> pci/:03:00.0:
>>  name mlxsw_meta
>>  field:
>>name erif_port bitwidth 32 mapping_type ifindex
>>name l3_forward bitwidth 1
>>name l3_drop bitwidth 1
>>name adj_index bitwidth 32
>>name adj_size bitwidth 32
>>name adj_hash_index bitwidth 32
>>
>>  name ipv6
>>  field:
>>name destination ip bitwidth 128
>>
>>  name ipv4
>>  field:
>>name destination ip bitwidth 32
>>
>>  name ethernet
>>  field:
>>name destination mac bitwidth 48
>>
>> but none mention 'kvd' or 'linear' or 'hash" and none of the other
>> various devlink options:
>>
>> $ devlink
>> Usage: devlink [ OPTIONS ] OBJECT { COMMAND | help }
>> where  OBJECT := { dev | port | sb | monitor | dpipe }
>>
>> seem to related to resources.
>>
>> So how does a user know what they are controlling by this 'resource'
>> option? Is the user expected to have a PRM or user guide on hand for the
>> specific device model that is being configured?
> 
> The relation of specific dpipe table to specific resource is exposed by
> the kernel as well. Probably the iproute2 patch just does not display
> it.

please provide an iproute2 patch that does so the full context if this
patch set can be reviewed from a user perspective.

> 
> 
>>
>> Again, I have no objections to kvd, linear, hash, etc terms as they do
>> relate to Mellanox products. But kvd/linear, for example, does correlate
>> to industry standard concepts in some way. My request is that the
>> resource listing guide the user in some way, stating what these
>> resources mean.
> 
> So the showed relation to dpipe table would be enougn or you would still
> like to see some description? I don't like the description concept here
> as the relations to dpipe table should tell user exactly what he needs
> to know.

I believe it is useful to have a 1-line, short description that gives
the user some memory jogger as to what the resource is used for. It does
not have to be an exhaustive list, but the user should not have to do

Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Jiri Pirko
Wed, Dec 27, 2017 at 02:08:03PM CET, and...@lunn.ch wrote:
>> This is misunderstanding I believe. This is not about ABI. That is well
>> defined by the netlink attributes. This is about meaning of particular
>> ASIC-specific internal resources.
>
>I would agree that the netlink attributed are clearly defined. But the
>meta information, what this ASIC specific internal resource means when
>you combine these attributes, is unclear. This meta information is
>also part of the ABI, and documenting giving users a hit what it
>means, and why they should change it, would be good practice.
>
>Look at sysfs. open/read/write are clearly defined, which is the
>equivalent of the netlink attributes. The meta information we document
>in Documentation/ABI/, what a file name means, what a value means,
>what other values it can take, etc.

Hmm. That documents mainly sysfs. No mention of Netlink at all. But
maybe I missed it. Also, that defines the interface as is. However we
are talking about the data exchanged over the interface, not the
interface itself. I don't see how ASIC/HW specific thing, like for
example KVD in our case could be part of kernel ABI. That makes 0 sense
to me, sorry.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Andrew Lunn
> This is misunderstanding I believe. This is not about ABI. That is well
> defined by the netlink attributes. This is about meaning of particular
> ASIC-specific internal resources.

I would agree that the netlink attributed are clearly defined. But the
meta information, what this ASIC specific internal resource means when
you combine these attributes, is unclear. This meta information is
also part of the ABI, and documenting giving users a hit what it
means, and why they should change it, would be good practice.

Look at sysfs. open/read/write are clearly defined, which is the
equivalent of the netlink attributes. The meta information we document
in Documentation/ABI/, what a file name means, what a value means,
what other values it can take, etc.

 Andrew



Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Jiri Pirko
Wed, Dec 27, 2017 at 09:23:31AM CET, and...@lunn.ch wrote:
>> >$ devlink resource show pci/:03:00.0
>> >pci/:03:00.0:
>> >  name kvd size 245760 size_valid true
>> >  resources:
>> >name linear size 98304 occ 0
>> >name hash_double size 60416
>> >name hash_single size 87040
>> >
>> >So this 2700 has 3 resources that can be managed -- some table or
>> >resource or something named 'kvd' with linear, hash_double and
>> >hash_single sub-resources. What are these names referring too? The above
>> >output gives no description, and 'kvd' is not an industry term. Further,
>> 
>> This are internal resources specific to the ASIC. Would you like some
>> description to each or something like that?
>
>The fact you have decided to expose them means you expect people to
>change them. So yes, they need to be documented. Maybe add something
>to Documentation/ABI/stable/
>
>> So the showed relation to dpipe table would be enougn or you would still
>> like to see some description? I don't like the description concept here
>> as the relations to dpipe table should tell user exactly what he needs
>> to know.
>
>Documenting the ABI is good practice.

This is misunderstanding I believe. This is not about ABI. That is well
defined by the netlink attributes. This is about meaning of particular
ASIC-specific internal resources.


>
>   Andrew
>


RE: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Yuval Mintz
> >> Many of the ASIC's internal resources are limited and are shared between
> >> several hardware procedures. For example, unified hash-based memory
> can
> >> be used for many lookup purposes, like FDB and LPM. In many cases the
> user
> >> can provide a partitioning scheme for such a resource in order to perform
> >> fine tuning for his application. In such cases performing driver reload is
> >> needed for the changes to take place, thus this patchset also adds support
> >> for hot reload.
> >>
> >> Such an abstraction can be coupled with devlink's dpipe interface, which
> >> models the ASIC's pipeline as a graph of match/action tables. By modeling
> >> the hardware resource object, and by coupling it to several dpipe tables,
> >> further visibility can be achieved in order to debug ASIC-wide issues.
> >>
> >> The proposed interface will provide the user the ability to understand the
> >> limitations of the hardware, and receive notification regarding its
> occupancy.
> >> Furthermore, monitoring the resource occupancy can be done in real-time
> and
> >> can be useful in many cases.
> >
> >In the last RFC (not v1, but RFC) I asked for some kind of description
> >for each resource, and you and Arkadi have pushed back. Let's walk
> >through an example to see what I mean:
> >
> >$ devlink resource show pci/:03:00.0
> >pci/:03:00.0:
> >  name kvd size 245760 size_valid true
> >  resources:
> >name linear size 98304 occ 0
> >name hash_double size 60416
> >name hash_single size 87040
> >
> >So this 2700 has 3 resources that can be managed -- some table or
> >resource or something named 'kvd' with linear, hash_double and
> >hash_single sub-resources. What are these names referring too? The above
> >output gives no description, and 'kvd' is not an industry term. Further,
> 
> This are internal resources specific to the ASIC. Would you like some
> description to each or something like that?
> 
> 
> >what are these sizes that a user can control? The output contains no
> >units, no description, nothing. In short, the above output provides
> >random numbers associated with random names.
> 
> Units are now exposed from kernel, just this version of iproute2 patch
> does not display it.
> 
> 
> >
> >I can see dpipe tables exported by this device:
> >
> >$ devlink dpipe header show pci/:03:00.0
> >
> >pci/:03:00.0:
> >  name mlxsw_meta
> >  field:
> >name erif_port bitwidth 32 mapping_type ifindex
> >name l3_forward bitwidth 1
> >name l3_drop bitwidth 1
> >name adj_index bitwidth 32
> >name adj_size bitwidth 32
> >name adj_hash_index bitwidth 32
> >
> >  name ipv6
> >  field:
> >name destination ip bitwidth 128
> >
> >  name ipv4
> >  field:
> >name destination ip bitwidth 32
> >
> >  name ethernet
> >  field:
> >name destination mac bitwidth 48
> >
> >but none mention 'kvd' or 'linear' or 'hash" and none of the other
> >various devlink options:
> >
> >$ devlink
> >Usage: devlink [ OPTIONS ] OBJECT { COMMAND | help }
> >where  OBJECT := { dev | port | sb | monitor | dpipe }
> >
> >seem to related to resources.
> >
> >So how does a user know what they are controlling by this 'resource'
> >option? Is the user expected to have a PRM or user guide on hand for the
> >specific device model that is being configured?
> 
> The relation of specific dpipe table to specific resource is exposed by
> the kernel as well. Probably the iproute2 patch just does not display
> it.

It does, just under 'table' and not under 'header'. E.g.:

# ./devlink/devlink dpipe -j -p table show pci/:03:00.0
...
},{
"resource_path": "/kvd/hash_single",
"name": "mlxsw_host4",
"size": 0,
"counters_enabled": "false",
"match": [{
"type": "field_exact",
"header": "mlxsw_meta",
"field": "erif_port",
"mapping": "ifindex"
},{
"type": "field_exact",
"header": "ipv4",
"field": "destination ip"
}
],
"action": [{
"type": "field_modify",
"header": "ethernet",
"field": "destination mac"
}
]
},{
...

> 
> 
> >
> >Again, I have no objections to kvd, linear, hash, etc terms as they do
> >relate to Mellanox products. But kvd/linear, for example, does correlate
> >to industry standard concepts in some way. My request is that the
> >resource listing guide the user in some way, stating what these
> >resources mean.
> 
> So the showed relation to dpipe table would be enougn or you would still
> like to see some description? I don't like the description concept here
> as the relations to dpipe table should tell user exactly what he needs
> to know.
> 
> 
> >
> 

Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Andrew Lunn
> >$ devlink resource show pci/:03:00.0
> >pci/:03:00.0:
> >  name kvd size 245760 size_valid true
> >  resources:
> >name linear size 98304 occ 0
> >name hash_double size 60416
> >name hash_single size 87040
> >
> >So this 2700 has 3 resources that can be managed -- some table or
> >resource or something named 'kvd' with linear, hash_double and
> >hash_single sub-resources. What are these names referring too? The above
> >output gives no description, and 'kvd' is not an industry term. Further,
> 
> This are internal resources specific to the ASIC. Would you like some
> description to each or something like that?

The fact you have decided to expose them means you expect people to
change them. So yes, they need to be documented. Maybe add something
to Documentation/ABI/stable/

> So the showed relation to dpipe table would be enougn or you would still
> like to see some description? I don't like the description concept here
> as the relations to dpipe table should tell user exactly what he needs
> to know.

Documenting the ABI is good practice.

   Andrew



Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-27 Thread Jiri Pirko
Wed, Dec 27, 2017 at 05:05:09AM CET, d...@cumulusnetworks.com wrote:
>On 12/26/17 5:23 AM, Jiri Pirko wrote:
>> From: Jiri Pirko 
>> 
>> Many of the ASIC's internal resources are limited and are shared between
>> several hardware procedures. For example, unified hash-based memory can
>> be used for many lookup purposes, like FDB and LPM. In many cases the user
>> can provide a partitioning scheme for such a resource in order to perform
>> fine tuning for his application. In such cases performing driver reload is
>> needed for the changes to take place, thus this patchset also adds support
>> for hot reload.
>> 
>> Such an abstraction can be coupled with devlink's dpipe interface, which
>> models the ASIC's pipeline as a graph of match/action tables. By modeling
>> the hardware resource object, and by coupling it to several dpipe tables,
>> further visibility can be achieved in order to debug ASIC-wide issues.
>> 
>> The proposed interface will provide the user the ability to understand the
>> limitations of the hardware, and receive notification regarding its 
>> occupancy.
>> Furthermore, monitoring the resource occupancy can be done in real-time and
>> can be useful in many cases.
>
>In the last RFC (not v1, but RFC) I asked for some kind of description
>for each resource, and you and Arkadi have pushed back. Let's walk
>through an example to see what I mean:
>
>$ devlink resource show pci/:03:00.0
>pci/:03:00.0:
>  name kvd size 245760 size_valid true
>  resources:
>name linear size 98304 occ 0
>name hash_double size 60416
>name hash_single size 87040
>
>So this 2700 has 3 resources that can be managed -- some table or
>resource or something named 'kvd' with linear, hash_double and
>hash_single sub-resources. What are these names referring too? The above
>output gives no description, and 'kvd' is not an industry term. Further,

This are internal resources specific to the ASIC. Would you like some
description to each or something like that?


>what are these sizes that a user can control? The output contains no
>units, no description, nothing. In short, the above output provides
>random numbers associated with random names.

Units are now exposed from kernel, just this version of iproute2 patch
does not display it.


>
>I can see dpipe tables exported by this device:
>
>$ devlink dpipe header show pci/:03:00.0
>
>pci/:03:00.0:
>  name mlxsw_meta
>  field:
>name erif_port bitwidth 32 mapping_type ifindex
>name l3_forward bitwidth 1
>name l3_drop bitwidth 1
>name adj_index bitwidth 32
>name adj_size bitwidth 32
>name adj_hash_index bitwidth 32
>
>  name ipv6
>  field:
>name destination ip bitwidth 128
>
>  name ipv4
>  field:
>name destination ip bitwidth 32
>
>  name ethernet
>  field:
>name destination mac bitwidth 48
>
>but none mention 'kvd' or 'linear' or 'hash" and none of the other
>various devlink options:
>
>$ devlink
>Usage: devlink [ OPTIONS ] OBJECT { COMMAND | help }
>where  OBJECT := { dev | port | sb | monitor | dpipe }
>
>seem to related to resources.
>
>So how does a user know what they are controlling by this 'resource'
>option? Is the user expected to have a PRM or user guide on hand for the
>specific device model that is being configured?

The relation of specific dpipe table to specific resource is exposed by
the kernel as well. Probably the iproute2 patch just does not display
it.


>
>Again, I have no objections to kvd, linear, hash, etc terms as they do
>relate to Mellanox products. But kvd/linear, for example, does correlate
>to industry standard concepts in some way. My request is that the
>resource listing guide the user in some way, stating what these
>resources mean.

So the showed relation to dpipe table would be enougn or you would still
like to see some description? I don't like the description concept here
as the relations to dpipe table should tell user exactly what he needs
to know.


>
>IMO the above output is not user friendly and having to keep a PRM on
>hand for each device model is not a realistic solution.


Re: [patch net-next v2 00/10] Add support for resource abstraction

2017-12-26 Thread David Ahern
On 12/26/17 5:23 AM, Jiri Pirko wrote:
> From: Jiri Pirko 
> 
> Many of the ASIC's internal resources are limited and are shared between
> several hardware procedures. For example, unified hash-based memory can
> be used for many lookup purposes, like FDB and LPM. In many cases the user
> can provide a partitioning scheme for such a resource in order to perform
> fine tuning for his application. In such cases performing driver reload is
> needed for the changes to take place, thus this patchset also adds support
> for hot reload.
> 
> Such an abstraction can be coupled with devlink's dpipe interface, which
> models the ASIC's pipeline as a graph of match/action tables. By modeling
> the hardware resource object, and by coupling it to several dpipe tables,
> further visibility can be achieved in order to debug ASIC-wide issues.
> 
> The proposed interface will provide the user the ability to understand the
> limitations of the hardware, and receive notification regarding its occupancy.
> Furthermore, monitoring the resource occupancy can be done in real-time and
> can be useful in many cases.

In the last RFC (not v1, but RFC) I asked for some kind of description
for each resource, and you and Arkadi have pushed back. Let's walk
through an example to see what I mean:

$ devlink resource show pci/:03:00.0
pci/:03:00.0:
  name kvd size 245760 size_valid true
  resources:
name linear size 98304 occ 0
name hash_double size 60416
name hash_single size 87040

So this 2700 has 3 resources that can be managed -- some table or
resource or something named 'kvd' with linear, hash_double and
hash_single sub-resources. What are these names referring too? The above
output gives no description, and 'kvd' is not an industry term. Further,
what are these sizes that a user can control? The output contains no
units, no description, nothing. In short, the above output provides
random numbers associated with random names.

I can see dpipe tables exported by this device:

$ devlink dpipe header show pci/:03:00.0

pci/:03:00.0:
  name mlxsw_meta
  field:
name erif_port bitwidth 32 mapping_type ifindex
name l3_forward bitwidth 1
name l3_drop bitwidth 1
name adj_index bitwidth 32
name adj_size bitwidth 32
name adj_hash_index bitwidth 32

  name ipv6
  field:
name destination ip bitwidth 128

  name ipv4
  field:
name destination ip bitwidth 32

  name ethernet
  field:
name destination mac bitwidth 48

but none mention 'kvd' or 'linear' or 'hash" and none of the other
various devlink options:

$ devlink
Usage: devlink [ OPTIONS ] OBJECT { COMMAND | help }
where  OBJECT := { dev | port | sb | monitor | dpipe }

seem to related to resources.

So how does a user know what they are controlling by this 'resource'
option? Is the user expected to have a PRM or user guide on hand for the
specific device model that is being configured?

Again, I have no objections to kvd, linear, hash, etc terms as they do
relate to Mellanox products. But kvd/linear, for example, does correlate
to industry standard concepts in some way. My request is that the
resource listing guide the user in some way, stating what these
resources mean.

IMO the above output is not user friendly and having to keep a PRM on
hand for each device model is not a realistic solution.


[patch net-next v2 00/10] Add support for resource abstraction

2017-12-26 Thread Jiri Pirko
From: Jiri Pirko 

Many of the ASIC's internal resources are limited and are shared between
several hardware procedures. For example, unified hash-based memory can
be used for many lookup purposes, like FDB and LPM. In many cases the user
can provide a partitioning scheme for such a resource in order to perform
fine tuning for his application. In such cases performing driver reload is
needed for the changes to take place, thus this patchset also adds support
for hot reload.

Such an abstraction can be coupled with devlink's dpipe interface, which
models the ASIC's pipeline as a graph of match/action tables. By modeling
the hardware resource object, and by coupling it to several dpipe tables,
further visibility can be achieved in order to debug ASIC-wide issues.

The proposed interface will provide the user the ability to understand the
limitations of the hardware, and receive notification regarding its occupancy.
Furthermore, monitoring the resource occupancy can be done in real-time and
can be useful in many cases.
---
Userspace part prototype can be found at https://github.com/arkadis/iproute2/
at resource_dev branch.

v1->v2
- Add resource size attribute.
- Fix split bug.

Arkadi Sharshevsky (10):
  devlink: Add per devlink instance lock
  devlink: Add support for resource abstraction
  devlink: Add support for reload
  devlink: Add relation between dpipe and resource
  mlxsw: pci: Add support for performing bus reset
  mlxsw: spectrum: Register KVD resources with devlink
  mlxsw: spectrum_dpipe: Connect dpipe tables to resources
  mlxsw: spectrum: Add support for getting kvdl occupancy
  mlxsw: pci: Add support for getting resource through devlink
  mlxsw: core: Add support for reload

 drivers/net/ethernet/mellanox/mlxsw/core.c |  85 ++-
 drivers/net/ethernet/mellanox/mlxsw/core.h |  16 +-
 drivers/net/ethernet/mellanox/mlxsw/i2c.c  |   5 +-
 drivers/net/ethernet/mellanox/mlxsw/pci.c  |  98 ++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 205 
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h |  13 +
 .../net/ethernet/mellanox/mlxsw/spectrum_dpipe.c   |  72 ++-
 .../net/ethernet/mellanox/mlxsw/spectrum_kvdl.c|  26 +
 include/net/devlink.h  |  97 
 include/uapi/linux/devlink.h   |  21 +
 net/core/devlink.c | 573 ++---
 11 files changed, 1079 insertions(+), 132 deletions(-)

-- 
2.9.5