Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
On 08.02.2019 13:36, Kevin Traynor wrote: > Hi Ilya, > > Thanks for raising this and explaining in detail so it can be discussed. Thanks for sharing your thoughts. > Comments below, > > On 02/01/2019 01:11 PM, Ilya Maximets wrote: >> This patch deprecates the usage of current configuration knobs for >> DPDK EAL arguments: >> >> - dpdk-alloc-mem >> - dpdk-socket-mem >> - dpdk-socket-limit >> - dpdk-lcore-mask >> - dpdk-hugepage-dir >> - dpdk-extra >> >> All above configuration options replaced with single 'dpdk-options' >> which has same format as old 'dpdk-extra', i.e. just a string with >> all the DPDK EAL arguments. >> >> All the documentation about old configuration knobs removed. Users >> could still use an old interface, but the deprecation warning will be >> printed. In case 'dpdk-options' provided, values of old options will >> be ignored. New users will start using new 'dpdk-options' as this is >> the only documented way providing EAL arguments. >> >> Rationale: >> >> From release to release DPDK becomes more complex. New EAL arguments >> appears, old arguments becomes deprecated. Some changes their meaning >> and behaviour. It's not easy task to manage all this and current >> code in OVS that tries to manage DPDK options is not flexible/extendable >> enough even to track all the dependencies between options in DPDK 18.11. >> For example, '--socket-mem' changed its meaning, '--legacy-mem' and >> '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit' >> could not be used at the same time and, also, we want to provide >> good default value for '--socket-limit' to keep the consistent >> behaviour of OVS memory usage. And this will be worse in the future. >> > > I think it is an exception that shows it is quite stable. There's > probably been something like 10 DPDK releases (didn't check exactly) > since these params were introduced and this seems to be first time that > there was an issue with an argument. > > It was also a very rare change in DPDK where the memory mgmt was > re-designed and re-written from the ground up. That meant an extra flag > was needed to keep the same behavior. Perhaps if the existing flag was > not reused, there wouldn't have been an issue at all (or at least it > would have been much clearer). I want to clarify that the 'socket-mem' related issue is not fully solved in current OVS. It's still possible for user to specify the 'dpdk-socket-limit' knob and use '--legacy-mem' in 'dpdk-extra'. OVS is not able to track this in current parsing code without introducing nasty workarounds. > >> Another point is that even now we have mutually exclusive database >> configs in OVS and we have to handle them. i.e we're providing >> 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally, >> user allowed to configure same options in 'dpdk-extra'. So, we have >> similar/equal options in a three places in ovsdb and we need to >> choose which one to use. It's pretty easy for user to get into >> inconsistent database configuration, because users could update >> one field without removing others, and OVS has to resolve all the >> conflicts somehow constructing the EAL args. But we do not know in >> practice, if the resulted arguments are the arguments that user wanted >> to see or just forget to remove the higher priority knob. >> > > That's true about dpdk-alloc-mem and dpdk-socket-mem. I'm not sure if > the docs give some direction on that. IIRC, it is documented and checked > about precedence when using dpdk-extra's. > >> Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not >> so user-friendly as '-l', but we have no option for it. Will we create >> additional 'dpdk-lcore-list' ? I guess, no, because it's not really >> important thing. But users will stuck with this not so user-friendly >> masks. >> > > I'm not sure which is more user-friendly but I agree adding > dpdk-lcore-list is not needed, and probably could add confusion > (especially as dpdk-lcore-mask doesn't need '0x' prefix). IMHO, numbers are more visual than masks. But yes, this will add more confusion. > >> Another thing is that OVS is not really need to check the consistency >> of the EAL arguments because DPDK could check it instead. DPDK will >> always check the args and it will do it better. There is no real >> need to duplicate this functionality. >> >> Keeping all the options in a single string 'dpdk-options' allows: >> >> * To have only one place for all the configs, i.e. it will be harder >> for user to forget to remove something from the single string while >> updating it. >> * Not to check the consistency between different configs, DPDK could >> check the cmdline itself. >> * Not to document every DPDK EAL argument in OVS. We can just >> describe few of them and point to DPDK docs for more information. >> * OVS still able to provide some minimal default arguments. >> Thanks to DPDK 18.11 we only need to specify an
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
On 07.02.2019 19:56, Flavio Leitner wrote: > On Wed, Feb 06, 2019 at 11:33:04AM +0300, Ilya Maximets wrote: >> On 05.02.2019 23:19, Flavio Leitner wrote: >>> On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote: Hi Flavio. Thanks for taking a look. On 05.02.2019 15:38, Flavio Leitner wrote: > > Hi Ilya, > > Thanks for the patch and I think we knew about that pain when we > exposed the very first parameter. I still remember Aaron struggling > to find the best path forward. Time flies and here we are. > > The problem is that this change is not friendly from the community > perspective to its users. That is like an exposed API which we should > avoid break as much as possible. > > For instance, there are users (OpenStack) with support life cycle of > 5+ years that cannot keep up with this kind of change but they expect > to be able to update to newer OVS. Sure, there are users that wants stable API that will never change. But this is really impossible in practice. I'm working with OpenStack too and will have to fixup deployment tools with these changes. BTW, from the whole OpenStack only few deployment projects (like TripleO) will need to make changes and these changes are not very big. >>> >>> That's only part of the work. There will be work on QA, documentation >>> and even migration path from one to another. And we can't change the >>> past for existing deployments. >> >> Sure. But incompatible API changes are almost unavoidable for a young >> projects that wants to be better. > > My point here is that the associated impact and cost of this change is > far from trivial. I agree. > >>> I don't think we should remove the docs if the parameters are there as >>> a first step. I mean, assume an existing deployment, there is a parameter >>> which might be in use but there is no documentation available. That >>> doesn't sound like a good user experience to me. >> >> Maybe we could save a man pages while removing the guides. There is no much >> information in Documentation/intro/install/dpdk.rst anyway. > > Agreed. > >>> On another hand, you could introduce the new interface and update the >>> docs to recommend using the new one because the old one will be removed >>> in the future. Warning messages come next, and then finally its removal. >> >> I'd prefer to have warning messages to be there right from the start to >> push users to migrate to the new interfaces as early as possible. > > If it's a WARN level message then I can tell you it will break > environments over here. If that is a INFO level message, then it might > be okay. Though I still think there could be a period of time where > both could co-exist before we announce/warn about the deprecation of > the current interface. Yeah, most testing tools wants to have no warning at all. INFO level messages could be fine for the period when we're still supporting old knobs. I'm not sure about co-existence without any messages at all. I think that we need something that will push users to migrate to the new interface. > >> How about this: >> >> First stage (apply now, release in 2.12): >> - Introduce new interface 'dpdk-options'. >> - Rewrite installation guide with new interface fully removing the old one. >> - Add new interface to man pages (vswitch.xml) and mark all the old options >> as deprecated by adding something like: "Deprecated. 'dpdk-options' >> should >> be used instead. Will be ignored in the future." >> - Add a runtime deprecation warning if old interface is in use. > > ^^^ this is bad as an initial step as explained above. Sure. Could be changed to INFO. > >> - Ignore values of old knobs if 'dpdk-options' provided. > > Sounds like a good transition barrier. > > >> Second stage (release in 2.13 or 2.14, could wait longer if required): >> - Remove old interfaces wile keeping the warnings. (i.e. values always >> ignored) >> - Remove old knobs from the man pages. >> >> Third stage (optional): >> - Remove warnings. >> >> So the main difference from the current patches is delaying removal of the >> man >> pages to the second stage. > [...] We're keeping few sane defaults like providing default lcore and setting the socket-limit if needed. And we'll try to do that in the future. The thing this patch tries to eliminate is the dependency tracking between different EAL arguments, i.e. duplicating the work with rte_eal_init() and having too many configuration knobs with similar meanings what does not work at the same time. Right now there are no critical arguments that user must provide. So, IMHO, having special configs for them is really unnecessary. >>> >>> Do you think the defaults work for the majority of DPDK users? >> >> My personal experience is opposite. Most of my deployments and deployments >> that >> I had to work with had massive
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
Hi Ilya, Thanks for raising this and explaining in detail so it can be discussed. Comments below, On 02/01/2019 01:11 PM, Ilya Maximets wrote: > This patch deprecates the usage of current configuration knobs for > DPDK EAL arguments: > > - dpdk-alloc-mem > - dpdk-socket-mem > - dpdk-socket-limit > - dpdk-lcore-mask > - dpdk-hugepage-dir > - dpdk-extra > > All above configuration options replaced with single 'dpdk-options' > which has same format as old 'dpdk-extra', i.e. just a string with > all the DPDK EAL arguments. > > All the documentation about old configuration knobs removed. Users > could still use an old interface, but the deprecation warning will be > printed. In case 'dpdk-options' provided, values of old options will > be ignored. New users will start using new 'dpdk-options' as this is > the only documented way providing EAL arguments. > > Rationale: > > From release to release DPDK becomes more complex. New EAL arguments > appears, old arguments becomes deprecated. Some changes their meaning > and behaviour. It's not easy task to manage all this and current > code in OVS that tries to manage DPDK options is not flexible/extendable > enough even to track all the dependencies between options in DPDK 18.11. > For example, '--socket-mem' changed its meaning, '--legacy-mem' and > '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit' > could not be used at the same time and, also, we want to provide > good default value for '--socket-limit' to keep the consistent > behaviour of OVS memory usage. And this will be worse in the future. > I think it is an exception that shows it is quite stable. There's probably been something like 10 DPDK releases (didn't check exactly) since these params were introduced and this seems to be first time that there was an issue with an argument. It was also a very rare change in DPDK where the memory mgmt was re-designed and re-written from the ground up. That meant an extra flag was needed to keep the same behavior. Perhaps if the existing flag was not reused, there wouldn't have been an issue at all (or at least it would have been much clearer). > Another point is that even now we have mutually exclusive database > configs in OVS and we have to handle them. i.e we're providing > 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally, > user allowed to configure same options in 'dpdk-extra'. So, we have > similar/equal options in a three places in ovsdb and we need to > choose which one to use. It's pretty easy for user to get into > inconsistent database configuration, because users could update > one field without removing others, and OVS has to resolve all the > conflicts somehow constructing the EAL args. But we do not know in > practice, if the resulted arguments are the arguments that user wanted > to see or just forget to remove the higher priority knob. > That's true about dpdk-alloc-mem and dpdk-socket-mem. I'm not sure if the docs give some direction on that. IIRC, it is documented and checked about precedence when using dpdk-extra's. > Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not > so user-friendly as '-l', but we have no option for it. Will we create > additional 'dpdk-lcore-list' ? I guess, no, because it's not really > important thing. But users will stuck with this not so user-friendly > masks. > I'm not sure which is more user-friendly but I agree adding dpdk-lcore-list is not needed, and probably could add confusion (especially as dpdk-lcore-mask doesn't need '0x' prefix). > Another thing is that OVS is not really need to check the consistency > of the EAL arguments because DPDK could check it instead. DPDK will > always check the args and it will do it better. There is no real > need to duplicate this functionality. > > Keeping all the options in a single string 'dpdk-options' allows: > > * To have only one place for all the configs, i.e. it will be harder > for user to forget to remove something from the single string while > updating it. > * Not to check the consistency between different configs, DPDK could > check the cmdline itself. > * Not to document every DPDK EAL argument in OVS. We can just > describe few of them and point to DPDK docs for more information. > * OVS still able to provide some minimal default arguments. > Thanks to DPDK 18.11 we only need to specify an lcore. All other > arguments are not necessary. (We still also providing default > --socket-limit in case --socket-mem passed without it and without > --legacy-mem) > It would seem wrong to tell the OVS user to fill in the DPDK EAL arguments in a type of passthrough string to DPDK but then also insert defaults to that. > Usage example: > > ovs-vsctl set Open_vSwitch . \ > other_config:dpdk-options="-l 1 -n 4 --socket-mem 1024,1024" > It is possible to do that now. > Signed-off-by: Ilya Maximets What I like about the current scheme is
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
On Wed, Feb 06, 2019 at 11:33:04AM +0300, Ilya Maximets wrote: > On 05.02.2019 23:19, Flavio Leitner wrote: > > On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote: > >> Hi Flavio. > >> Thanks for taking a look. > >> > >> On 05.02.2019 15:38, Flavio Leitner wrote: > >>> > >>> Hi Ilya, > >>> > >>> Thanks for the patch and I think we knew about that pain when we > >>> exposed the very first parameter. I still remember Aaron struggling > >>> to find the best path forward. Time flies and here we are. > >>> > >>> The problem is that this change is not friendly from the community > >>> perspective to its users. That is like an exposed API which we should > >>> avoid break as much as possible. > >>> > >>> For instance, there are users (OpenStack) with support life cycle of > >>> 5+ years that cannot keep up with this kind of change but they expect > >>> to be able to update to newer OVS. > >> > >> Sure, there are users that wants stable API that will never change. > >> But this is really impossible in practice. I'm working with OpenStack > >> too and will have to fixup deployment tools with these changes. BTW, from > >> the whole OpenStack only few deployment projects (like TripleO) will need > >> to make changes and these changes are not very big. > > > > That's only part of the work. There will be work on QA, documentation > > and even migration path from one to another. And we can't change the > > past for existing deployments. > > Sure. But incompatible API changes are almost unavoidable for a young > projects that wants to be better. My point here is that the associated impact and cost of this change is far from trivial. > > I don't think we should remove the docs if the parameters are there as > > a first step. I mean, assume an existing deployment, there is a parameter > > which might be in use but there is no documentation available. That > > doesn't sound like a good user experience to me. > > Maybe we could save a man pages while removing the guides. There is no much > information in Documentation/intro/install/dpdk.rst anyway. Agreed. > > On another hand, you could introduce the new interface and update the > > docs to recommend using the new one because the old one will be removed > > in the future. Warning messages come next, and then finally its removal. > > I'd prefer to have warning messages to be there right from the start to > push users to migrate to the new interfaces as early as possible. If it's a WARN level message then I can tell you it will break environments over here. If that is a INFO level message, then it might be okay. Though I still think there could be a period of time where both could co-exist before we announce/warn about the deprecation of the current interface. > How about this: > > First stage (apply now, release in 2.12): > - Introduce new interface 'dpdk-options'. > - Rewrite installation guide with new interface fully removing the old one. > - Add new interface to man pages (vswitch.xml) and mark all the old options > as deprecated by adding something like: "Deprecated. 'dpdk-options' should > be used instead. Will be ignored in the future." > - Add a runtime deprecation warning if old interface is in use. ^^^ this is bad as an initial step as explained above. > - Ignore values of old knobs if 'dpdk-options' provided. Sounds like a good transition barrier. > Second stage (release in 2.13 or 2.14, could wait longer if required): > - Remove old interfaces wile keeping the warnings. (i.e. values always > ignored) > - Remove old knobs from the man pages. > > Third stage (optional): > - Remove warnings. > > So the main difference from the current patches is delaying removal of the man > pages to the second stage. [...] > >> We're keeping few sane defaults like providing default lcore and setting > >> the > >> socket-limit if needed. And we'll try to do that in the future. The thing > >> this patch tries to eliminate is the dependency tracking between different > >> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too > >> many configuration knobs with similar meanings what does not work at the > >> same time. > >> > >> Right now there are no critical arguments that user must provide. > >> So, IMHO, having special configs for them is really unnecessary. > > > > Do you think the defaults work for the majority of DPDK users? > > My personal experience is opposite. Most of my deployments and deployments > that > I had to work with had massive 'dpdk-extra' arguments. These are mostly memory > tuning options, pci whitelists and various virtual devices. And these > 'dpdk-extra' > strings only grows over time with '--socket-limit', '--legacy-mem', > '--single-file-segments', '--in-memory', '--file-prefix' and so on. That is indeed a problem, but moving to 'dpdk-options' will not solve it. > Defaults are OK if your application is the only application you want to see on > a system. Otherwise, you need
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
On 06.02.2019 18:05, Aaron Conole wrote: > Ilya Maximets writes: > >> On 05.02.2019 23:19, Flavio Leitner wrote: >>> On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote: Hi Flavio. Thanks for taking a look. On 05.02.2019 15:38, Flavio Leitner wrote: > > Hi Ilya, > > Thanks for the patch and I think we knew about that pain when we > exposed the very first parameter. I still remember Aaron struggling > to find the best path forward. Time flies and here we are. > > The problem is that this change is not friendly from the community > perspective to its users. That is like an exposed API which we should > avoid break as much as possible. > > For instance, there are users (OpenStack) with support life cycle of > 5+ years that cannot keep up with this kind of change but they expect > to be able to update to newer OVS. Sure, there are users that wants stable API that will never change. But this is really impossible in practice. I'm working with OpenStack too and will have to fixup deployment tools with these changes. BTW, from the whole OpenStack only few deployment projects (like TripleO) will need to make changes and these changes are not very big. >>> >>> That's only part of the work. There will be work on QA, documentation >>> and even migration path from one to another. And we can't change the >>> past for existing deployments. >> >> Sure. But incompatible API changes are almost unavoidable for a young >> projects that wants to be better. >> >>> > One idea is to freeze whatever we have now and update the documentation > to recommend the new way. We give like a couple OVS releases and only > then ignore (or remove?) those parameters. Yes, In cover letter I proposed these patches to be applied one per release. And current (first) patch does not remove the functionality, only docs. Users still will be able to use old interface, but will have warnings in the log. In the next release cycle we'll start ignore the values while still printing the warnings. This should give enough time for adaptation. If you feel that we need more time, we could apply the second patch to 2.14 (or whatever number will be in 2 releases from now). >>> >>> I don't think we should remove the docs if the parameters are there as >>> a first step. I mean, assume an existing deployment, there is a parameter >>> which might be in use but there is no documentation available. That >>> doesn't sound like a good user experience to me. >> >> Maybe we could save a man pages while removing the guides. There is no much >> information in Documentation/intro/install/dpdk.rst anyway. >> >>> >>> On another hand, you could introduce the new interface and update the >>> docs to recommend using the new one because the old one will be removed >>> in the future. Warning messages come next, and then finally its removal. >> >> I'd prefer to have warning messages to be there right from the start to >> push users to migrate to the new interfaces as early as possible. >> >> How about this: >> >> First stage (apply now, release in 2.12): >> - Introduce new interface 'dpdk-options'. >> - Rewrite installation guide with new interface fully removing the old one. >> - Add new interface to man pages (vswitch.xml) and mark all the old options >> as deprecated by adding something like: "Deprecated. 'dpdk-options' >> should >> be used instead. Will be ignored in the future." >> - Add a runtime deprecation warning if old interface is in use. >> - Ignore values of old knobs if 'dpdk-options' provided. >> >> Second stage (release in 2.13 or 2.14, could wait longer if required): >> - Remove old interfaces wile keeping the warnings. (i.e. values always >> ignored) >> - Remove old knobs from the man pages. >> >> Third stage (optional): >> - Remove warnings. >> >> So the main difference from the current patches is delaying removal of the >> man >> pages to the second stage. >> >>> >>> > IMO in the end we are moving the problem from one place to another > because even with a single string, OVS users will be caught off guard > when DPDK changes. Yes, less pain to OVS community in the sense that > we don't have to add/remove/deprecate stuff, but it is still a bad > user experience regardless, which is not what OVS is known for. Unfortunately, DPDK was never user-friendly enough. It tries, but still not. >>> >>> Agreed. >>> We're keeping few sane defaults like providing default lcore and setting the socket-limit if needed. And we'll try to do that in the future. The thing this patch tries to eliminate is the dependency tracking between different EAL arguments, i.e. duplicating the work with rte_eal_init() and having too many configuration knobs with similar meanings what does not work at the same time.
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
Ilya Maximets writes: > On 05.02.2019 23:19, Flavio Leitner wrote: >> On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote: >>> Hi Flavio. >>> Thanks for taking a look. >>> >>> On 05.02.2019 15:38, Flavio Leitner wrote: Hi Ilya, Thanks for the patch and I think we knew about that pain when we exposed the very first parameter. I still remember Aaron struggling to find the best path forward. Time flies and here we are. The problem is that this change is not friendly from the community perspective to its users. That is like an exposed API which we should avoid break as much as possible. For instance, there are users (OpenStack) with support life cycle of 5+ years that cannot keep up with this kind of change but they expect to be able to update to newer OVS. >>> >>> Sure, there are users that wants stable API that will never change. >>> But this is really impossible in practice. I'm working with OpenStack >>> too and will have to fixup deployment tools with these changes. BTW, from >>> the whole OpenStack only few deployment projects (like TripleO) will need >>> to make changes and these changes are not very big. >> >> That's only part of the work. There will be work on QA, documentation >> and even migration path from one to another. And we can't change the >> past for existing deployments. > > Sure. But incompatible API changes are almost unavoidable for a young > projects that wants to be better. > >> One idea is to freeze whatever we have now and update the documentation to recommend the new way. We give like a couple OVS releases and only then ignore (or remove?) those parameters. >>> >>> Yes, In cover letter I proposed these patches to be applied one per release. >>> And current (first) patch does not remove the functionality, only docs. >>> Users still will be able to use old interface, but will have warnings >>> in the log. In the next release cycle we'll start ignore the values >>> while still printing the warnings. This should give enough time for >>> adaptation. >>> If you feel that we need more time, we could apply the second patch to 2.14 >>> (or whatever number will be in 2 releases from now). >> >> I don't think we should remove the docs if the parameters are there as >> a first step. I mean, assume an existing deployment, there is a parameter >> which might be in use but there is no documentation available. That >> doesn't sound like a good user experience to me. > > Maybe we could save a man pages while removing the guides. There is no much > information in Documentation/intro/install/dpdk.rst anyway. > >> >> On another hand, you could introduce the new interface and update the >> docs to recommend using the new one because the old one will be removed >> in the future. Warning messages come next, and then finally its removal. > > I'd prefer to have warning messages to be there right from the start to > push users to migrate to the new interfaces as early as possible. > > How about this: > > First stage (apply now, release in 2.12): > - Introduce new interface 'dpdk-options'. > - Rewrite installation guide with new interface fully removing the old one. > - Add new interface to man pages (vswitch.xml) and mark all the old options > as deprecated by adding something like: "Deprecated. 'dpdk-options' should > be used instead. Will be ignored in the future." > - Add a runtime deprecation warning if old interface is in use. > - Ignore values of old knobs if 'dpdk-options' provided. > > Second stage (release in 2.13 or 2.14, could wait longer if required): > - Remove old interfaces wile keeping the warnings. (i.e. values always > ignored) > - Remove old knobs from the man pages. > > Third stage (optional): > - Remove warnings. > > So the main difference from the current patches is delaying removal of the man > pages to the second stage. > >> >> IMO in the end we are moving the problem from one place to another because even with a single string, OVS users will be caught off guard when DPDK changes. Yes, less pain to OVS community in the sense that we don't have to add/remove/deprecate stuff, but it is still a bad user experience regardless, which is not what OVS is known for. >>> >>> Unfortunately, DPDK was never user-friendly enough. It tries, but still not. >> >> Agreed. >> >>> We're keeping few sane defaults like providing default lcore and setting the >>> socket-limit if needed. And we'll try to do that in the future. The thing >>> this patch tries to eliminate is the dependency tracking between different >>> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too >>> many configuration knobs with similar meanings what does not work at the >>> same time. >>> >>> Right now there are no critical arguments that user must provide. >>> So, IMHO, having special configs for them is really unnecessary. >> >> Do you think the
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
On 05.02.2019 23:19, Flavio Leitner wrote: > On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote: >> Hi Flavio. >> Thanks for taking a look. >> >> On 05.02.2019 15:38, Flavio Leitner wrote: >>> >>> Hi Ilya, >>> >>> Thanks for the patch and I think we knew about that pain when we >>> exposed the very first parameter. I still remember Aaron struggling >>> to find the best path forward. Time flies and here we are. >>> >>> The problem is that this change is not friendly from the community >>> perspective to its users. That is like an exposed API which we should >>> avoid break as much as possible. >>> >>> For instance, there are users (OpenStack) with support life cycle of >>> 5+ years that cannot keep up with this kind of change but they expect >>> to be able to update to newer OVS. >> >> Sure, there are users that wants stable API that will never change. >> But this is really impossible in practice. I'm working with OpenStack >> too and will have to fixup deployment tools with these changes. BTW, from >> the whole OpenStack only few deployment projects (like TripleO) will need >> to make changes and these changes are not very big. > > That's only part of the work. There will be work on QA, documentation > and even migration path from one to another. And we can't change the > past for existing deployments. Sure. But incompatible API changes are almost unavoidable for a young projects that wants to be better. > >>> One idea is to freeze whatever we have now and update the documentation >>> to recommend the new way. We give like a couple OVS releases and only >>> then ignore (or remove?) those parameters. >> >> Yes, In cover letter I proposed these patches to be applied one per release. >> And current (first) patch does not remove the functionality, only docs. >> Users still will be able to use old interface, but will have warnings >> in the log. In the next release cycle we'll start ignore the values >> while still printing the warnings. This should give enough time for >> adaptation. >> If you feel that we need more time, we could apply the second patch to 2.14 >> (or whatever number will be in 2 releases from now). > > I don't think we should remove the docs if the parameters are there as > a first step. I mean, assume an existing deployment, there is a parameter > which might be in use but there is no documentation available. That > doesn't sound like a good user experience to me. Maybe we could save a man pages while removing the guides. There is no much information in Documentation/intro/install/dpdk.rst anyway. > > On another hand, you could introduce the new interface and update the > docs to recommend using the new one because the old one will be removed > in the future. Warning messages come next, and then finally its removal. I'd prefer to have warning messages to be there right from the start to push users to migrate to the new interfaces as early as possible. How about this: First stage (apply now, release in 2.12): - Introduce new interface 'dpdk-options'. - Rewrite installation guide with new interface fully removing the old one. - Add new interface to man pages (vswitch.xml) and mark all the old options as deprecated by adding something like: "Deprecated. 'dpdk-options' should be used instead. Will be ignored in the future." - Add a runtime deprecation warning if old interface is in use. - Ignore values of old knobs if 'dpdk-options' provided. Second stage (release in 2.13 or 2.14, could wait longer if required): - Remove old interfaces wile keeping the warnings. (i.e. values always ignored) - Remove old knobs from the man pages. Third stage (optional): - Remove warnings. So the main difference from the current patches is delaying removal of the man pages to the second stage. > > >>> IMO in the end we are moving the problem from one place to another >>> because even with a single string, OVS users will be caught off guard >>> when DPDK changes. Yes, less pain to OVS community in the sense that >>> we don't have to add/remove/deprecate stuff, but it is still a bad >>> user experience regardless, which is not what OVS is known for. >> >> Unfortunately, DPDK was never user-friendly enough. It tries, but still not. > > Agreed. > >> We're keeping few sane defaults like providing default lcore and setting the >> socket-limit if needed. And we'll try to do that in the future. The thing >> this patch tries to eliminate is the dependency tracking between different >> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too >> many configuration knobs with similar meanings what does not work at the >> same time. >> >> Right now there are no critical arguments that user must provide. >> So, IMHO, having special configs for them is really unnecessary. > > Do you think the defaults work for the majority of DPDK users? My personal experience is opposite. Most of my deployments and deployments that I had to work with had massive
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote: > Hi Flavio. > Thanks for taking a look. > > On 05.02.2019 15:38, Flavio Leitner wrote: > > > > Hi Ilya, > > > > Thanks for the patch and I think we knew about that pain when we > > exposed the very first parameter. I still remember Aaron struggling > > to find the best path forward. Time flies and here we are. > > > > The problem is that this change is not friendly from the community > > perspective to its users. That is like an exposed API which we should > > avoid break as much as possible. > > > > For instance, there are users (OpenStack) with support life cycle of > > 5+ years that cannot keep up with this kind of change but they expect > > to be able to update to newer OVS. > > Sure, there are users that wants stable API that will never change. > But this is really impossible in practice. I'm working with OpenStack > too and will have to fixup deployment tools with these changes. BTW, from > the whole OpenStack only few deployment projects (like TripleO) will need > to make changes and these changes are not very big. That's only part of the work. There will be work on QA, documentation and even migration path from one to another. And we can't change the past for existing deployments. > > One idea is to freeze whatever we have now and update the documentation > > to recommend the new way. We give like a couple OVS releases and only > > then ignore (or remove?) those parameters. > > Yes, In cover letter I proposed these patches to be applied one per release. > And current (first) patch does not remove the functionality, only docs. > Users still will be able to use old interface, but will have warnings > in the log. In the next release cycle we'll start ignore the values > while still printing the warnings. This should give enough time for > adaptation. > If you feel that we need more time, we could apply the second patch to 2.14 > (or whatever number will be in 2 releases from now). I don't think we should remove the docs if the parameters are there as a first step. I mean, assume an existing deployment, there is a parameter which might be in use but there is no documentation available. That doesn't sound like a good user experience to me. On another hand, you could introduce the new interface and update the docs to recommend using the new one because the old one will be removed in the future. Warning messages come next, and then finally its removal. > > IMO in the end we are moving the problem from one place to another > > because even with a single string, OVS users will be caught off guard > > when DPDK changes. Yes, less pain to OVS community in the sense that > > we don't have to add/remove/deprecate stuff, but it is still a bad > > user experience regardless, which is not what OVS is known for. > > Unfortunately, DPDK was never user-friendly enough. It tries, but still not. Agreed. > We're keeping few sane defaults like providing default lcore and setting the > socket-limit if needed. And we'll try to do that in the future. The thing > this patch tries to eliminate is the dependency tracking between different > EAL arguments, i.e. duplicating the work with rte_eal_init() and having too > many configuration knobs with similar meanings what does not work at the > same time. > > Right now there are no critical arguments that user must provide. > So, IMHO, having special configs for them is really unnecessary. Do you think the defaults work for the majority of DPDK users? If we expose only the necessary things to consume DPDK it means we have tight control (less combinations to worry about), otherwise if the user can pass anything, a can of worms is opened and we have no control of all things the user can do. fbl ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
Hi Flavio. Thanks for taking a look. On 05.02.2019 15:38, Flavio Leitner wrote: > > Hi Ilya, > > Thanks for the patch and I think we knew about that pain when we > exposed the very first parameter. I still remember Aaron struggling > to find the best path forward. Time flies and here we are. > > The problem is that this change is not friendly from the community > perspective to its users. That is like an exposed API which we should > avoid break as much as possible. > > For instance, there are users (OpenStack) with support life cycle of > 5+ years that cannot keep up with this kind of change but they expect > to be able to update to newer OVS. Sure, there are users that wants stable API that will never change. But this is really impossible in practice. I'm working with OpenStack too and will have to fixup deployment tools with these changes. BTW, from the whole OpenStack only few deployment projects (like TripleO) will need to make changes and these changes are not very big. > > One idea is to freeze whatever we have now and update the documentation > to recommend the new way. We give like a couple OVS releases and only > then ignore (or remove?) those parameters. Yes, In cover letter I proposed these patches to be applied one per release. And current (first) patch does not remove the functionality, only docs. Users still will be able to use old interface, but will have warnings in the log. In the next release cycle we'll start ignore the values while still printing the warnings. This should give enough time for adaptation. If you feel that we need more time, we could apply the second patch to 2.14 (or whatever number will be in 2 releases from now). > > IMO in the end we are moving the problem from one place to another > because even with a single string, OVS users will be caught off guard > when DPDK changes. Yes, less pain to OVS community in the sense that > we don't have to add/remove/deprecate stuff, but it is still a bad > user experience regardless, which is not what OVS is known for. Unfortunately, DPDK was never user-friendly enough. It tries, but still not. We're keeping few sane defaults like providing default lcore and setting the socket-limit if needed. And we'll try to do that in the future. The thing this patch tries to eliminate is the dependency tracking between different EAL arguments, i.e. duplicating the work with rte_eal_init() and having too many configuration knobs with similar meanings what does not work at the same time. Right now there are no critical arguments that user must provide. So, IMHO, having special configs for them is really unnecessary. > > Thanks, > fbl > > > On Fri, Feb 01, 2019 at 04:11:12PM +0300, Ilya Maximets wrote: >> This patch deprecates the usage of current configuration knobs for >> DPDK EAL arguments: >> >> - dpdk-alloc-mem >> - dpdk-socket-mem >> - dpdk-socket-limit >> - dpdk-lcore-mask >> - dpdk-hugepage-dir >> - dpdk-extra >> >> All above configuration options replaced with single 'dpdk-options' >> which has same format as old 'dpdk-extra', i.e. just a string with >> all the DPDK EAL arguments. >> >> All the documentation about old configuration knobs removed. Users >> could still use an old interface, but the deprecation warning will be >> printed. In case 'dpdk-options' provided, values of old options will >> be ignored. New users will start using new 'dpdk-options' as this is >> the only documented way providing EAL arguments. >> >> Rationale: >> >> From release to release DPDK becomes more complex. New EAL arguments >> appears, old arguments becomes deprecated. Some changes their meaning >> and behaviour. It's not easy task to manage all this and current >> code in OVS that tries to manage DPDK options is not flexible/extendable >> enough even to track all the dependencies between options in DPDK 18.11. >> For example, '--socket-mem' changed its meaning, '--legacy-mem' and >> '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit' >> could not be used at the same time and, also, we want to provide >> good default value for '--socket-limit' to keep the consistent >> behaviour of OVS memory usage. And this will be worse in the future. >> >> Another point is that even now we have mutually exclusive database >> configs in OVS and we have to handle them. i.e we're providing >> 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally, >> user allowed to configure same options in 'dpdk-extra'. So, we have >> similar/equal options in a three places in ovsdb and we need to >> choose which one to use. It's pretty easy for user to get into >> inconsistent database configuration, because users could update >> one field without removing others, and OVS has to resolve all the >> conflicts somehow constructing the EAL args. But we do not know in >> practice, if the resulted arguments are the arguments that user wanted >> to see or just forget to remove the higher priority knob. >> >> Next point is that we
Re: [ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
Hi Ilya, Thanks for the patch and I think we knew about that pain when we exposed the very first parameter. I still remember Aaron struggling to find the best path forward. Time flies and here we are. The problem is that this change is not friendly from the community perspective to its users. That is like an exposed API which we should avoid break as much as possible. For instance, there are users (OpenStack) with support life cycle of 5+ years that cannot keep up with this kind of change but they expect to be able to update to newer OVS. One idea is to freeze whatever we have now and update the documentation to recommend the new way. We give like a couple OVS releases and only then ignore (or remove?) those parameters. IMO in the end we are moving the problem from one place to another because even with a single string, OVS users will be caught off guard when DPDK changes. Yes, less pain to OVS community in the sense that we don't have to add/remove/deprecate stuff, but it is still a bad user experience regardless, which is not what OVS is known for. Thanks, fbl On Fri, Feb 01, 2019 at 04:11:12PM +0300, Ilya Maximets wrote: > This patch deprecates the usage of current configuration knobs for > DPDK EAL arguments: > > - dpdk-alloc-mem > - dpdk-socket-mem > - dpdk-socket-limit > - dpdk-lcore-mask > - dpdk-hugepage-dir > - dpdk-extra > > All above configuration options replaced with single 'dpdk-options' > which has same format as old 'dpdk-extra', i.e. just a string with > all the DPDK EAL arguments. > > All the documentation about old configuration knobs removed. Users > could still use an old interface, but the deprecation warning will be > printed. In case 'dpdk-options' provided, values of old options will > be ignored. New users will start using new 'dpdk-options' as this is > the only documented way providing EAL arguments. > > Rationale: > > From release to release DPDK becomes more complex. New EAL arguments > appears, old arguments becomes deprecated. Some changes their meaning > and behaviour. It's not easy task to manage all this and current > code in OVS that tries to manage DPDK options is not flexible/extendable > enough even to track all the dependencies between options in DPDK 18.11. > For example, '--socket-mem' changed its meaning, '--legacy-mem' and > '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit' > could not be used at the same time and, also, we want to provide > good default value for '--socket-limit' to keep the consistent > behaviour of OVS memory usage. And this will be worse in the future. > > Another point is that even now we have mutually exclusive database > configs in OVS and we have to handle them. i.e we're providing > 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally, > user allowed to configure same options in 'dpdk-extra'. So, we have > similar/equal options in a three places in ovsdb and we need to > choose which one to use. It's pretty easy for user to get into > inconsistent database configuration, because users could update > one field without removing others, and OVS has to resolve all the > conflicts somehow constructing the EAL args. But we do not know in > practice, if the resulted arguments are the arguments that user wanted > to see or just forget to remove the higher priority knob. > > Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not > so user-friendly as '-l', but we have no option for it. Will we create > additional 'dpdk-lcore-list' ? I guess, no, because it's not really > important thing. But users will stuck with this not so user-friendly > masks. > > Another thing is that OVS is not really need to check the consistency > of the EAL arguments because DPDK could check it instead. DPDK will > always check the args and it will do it better. There is no real > need to duplicate this functionality. > > Keeping all the options in a single string 'dpdk-options' allows: > > * To have only one place for all the configs, i.e. it will be harder > for user to forget to remove something from the single string while > updating it. > * Not to check the consistency between different configs, DPDK could > check the cmdline itself. > * Not to document every DPDK EAL argument in OVS. We can just > describe few of them and point to DPDK docs for more information. > * OVS still able to provide some minimal default arguments. > Thanks to DPDK 18.11 we only need to specify an lcore. All other > arguments are not necessary. (We still also providing default > --socket-limit in case --socket-mem passed without it and without > --legacy-mem) > > Usage example: > > ovs-vsctl set Open_vSwitch . \ > other_config:dpdk-options="-l 1 -n 4 --socket-mem 1024,1024" > > Signed-off-by: Ilya Maximets > --- > Documentation/intro/install/dpdk.rst | 40 +--- > NEWS | 7 +- > lib/dpdk.c |
[ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.
This patch deprecates the usage of current configuration knobs for DPDK EAL arguments: - dpdk-alloc-mem - dpdk-socket-mem - dpdk-socket-limit - dpdk-lcore-mask - dpdk-hugepage-dir - dpdk-extra All above configuration options replaced with single 'dpdk-options' which has same format as old 'dpdk-extra', i.e. just a string with all the DPDK EAL arguments. All the documentation about old configuration knobs removed. Users could still use an old interface, but the deprecation warning will be printed. In case 'dpdk-options' provided, values of old options will be ignored. New users will start using new 'dpdk-options' as this is the only documented way providing EAL arguments. Rationale: >From release to release DPDK becomes more complex. New EAL arguments appears, old arguments becomes deprecated. Some changes their meaning and behaviour. It's not easy task to manage all this and current code in OVS that tries to manage DPDK options is not flexible/extendable enough even to track all the dependencies between options in DPDK 18.11. For example, '--socket-mem' changed its meaning, '--legacy-mem' and '--socket-limit' appeared. But '--legacy-mem' and '--socket-limit' could not be used at the same time and, also, we want to provide good default value for '--socket-limit' to keep the consistent behaviour of OVS memory usage. And this will be worse in the future. Another point is that even now we have mutually exclusive database configs in OVS and we have to handle them. i.e we're providing 'dpdk-alloc-mem' and 'dpdk-socket-mem' at the same time. Additionally, user allowed to configure same options in 'dpdk-extra'. So, we have similar/equal options in a three places in ovsdb and we need to choose which one to use. It's pretty easy for user to get into inconsistent database configuration, because users could update one field without removing others, and OVS has to resolve all the conflicts somehow constructing the EAL args. But we do not know in practice, if the resulted arguments are the arguments that user wanted to see or just forget to remove the higher priority knob. Next point is that we have 'dpdk-lcore-mask' (-c) which is actually not so user-friendly as '-l', but we have no option for it. Will we create additional 'dpdk-lcore-list' ? I guess, no, because it's not really important thing. But users will stuck with this not so user-friendly masks. Another thing is that OVS is not really need to check the consistency of the EAL arguments because DPDK could check it instead. DPDK will always check the args and it will do it better. There is no real need to duplicate this functionality. Keeping all the options in a single string 'dpdk-options' allows: * To have only one place for all the configs, i.e. it will be harder for user to forget to remove something from the single string while updating it. * Not to check the consistency between different configs, DPDK could check the cmdline itself. * Not to document every DPDK EAL argument in OVS. We can just describe few of them and point to DPDK docs for more information. * OVS still able to provide some minimal default arguments. Thanks to DPDK 18.11 we only need to specify an lcore. All other arguments are not necessary. (We still also providing default --socket-limit in case --socket-mem passed without it and without --legacy-mem) Usage example: ovs-vsctl set Open_vSwitch . \ other_config:dpdk-options="-l 1 -n 4 --socket-mem 1024,1024" Signed-off-by: Ilya Maximets --- Documentation/intro/install/dpdk.rst | 40 +--- NEWS | 7 +- lib/dpdk.c | 49 +- utilities/ovs-dev.py | 2 +- vswitchd/vswitch.xml | 132 ++- 5 files changed, 108 insertions(+), 122 deletions(-) diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index 344d2b3a6..2243fde53 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -235,32 +235,44 @@ listed below. Defaults will be provided for all values not explicitly set. A value of ``try`` will imply that the ovs-vswitchd process should continue running even if the EAL initialization fails. -``dpdk-lcore-mask`` - Specifies the CPU cores on which dpdk lcore threads should be spawned and - expects hex string (eg '0x123'). +``dpdk-options`` + Specifies the string of EAL command line arguments for DPDK. + For example, ``other_config:dpdk-options="-l 0 --socket-mem 1024,1024"``. + Important arguments that could be passed there are: -``dpdk-socket-mem`` - Comma separated list of memory to pre-allocate from hugepages on specific - sockets. If not specified, 1024 MB will be set for each numa node by - default. + - ``-c`` or ``-l`` -``dpdk-hugepage-dir`` - Directory where hugetlbfs is mounted +Specifies the CPU cores on which dpdk lcore threads should be