Re: [openstack-dev] [neutron] [stable] Tool to aid in scalability problems mitigation.
----- Original Message -----
> Hi Miguel,
>
> while we'd need to hear from the stable team, I think it's not such a bad idea to make this tool available to users of pre-Juno OpenStack releases.
>
> As far as upstream repos are concerned, I don't know if this tool violates the criteria for stable branches. Even if it would be a rather large change for stable/icehouse, it is pretty much orthogonal to the existing code, so it could be ok. However, please note that stable/havana has now reached its EOL, so there will be no more stable releases for it.

Sure, I was mentioning Havana as affected, but I understand it's already EOL upstream (U/S); downstream (D/S) distributions would always be free to backport, especially an orthogonal change like this.

About stable/icehouse, I'd like to hear from the stable maintainers.

> The orthogonal nature of this tool, however, also makes the case for making it widely available on PyPI. I think it should be ok to describe the scalability issue in the official OpenStack Icehouse docs and point to this tool for mitigation.

Yes, of course, I consider that a second option. My point is that direct upstream review would result in better quality code, could certainly spot hidden bugs, and would increase testing quality. Making it available via the standard neutron repository also reduces packaging time across distributions.

Thanks for the feedback!

> Salvatore
>
> On 23 October 2014 14:03, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote:
>
>> Recently, we have identified clients with problems due to the bad scalability of security groups in Havana and Icehouse, which was addressed during Juno in [1] and [2].
>>
>> This situation is identified by blinking agents (going UP/DOWN), high AMQP load, high neutron-server load, and timeouts from openvswitch agents when calling security_group_rules_for_devices on neutron-server.
>>
>> Backporting [1] involves many dependent patches related to the general RPC refactor in neutron (which modifies all plugins), and subsequent ones fixing a few bugs. Sounds risky to me. [2] introduces new features and depends on features which aren't available on all systems.
>>
>> To remediate this on production systems, I wrote a quick tool to help with reporting security groups and mitigating the problem by writing almost-equivalent rules [3].
>>
>> We believe this tool would be better off available to the wider community, and under better review and testing, and, since it doesn't modify any behavior or actual code in neutron, I'd like to propose it for inclusion into, at least, the Icehouse stable branch, where it's more relevant.
>>
>> I know the usual way is to go master-Juno-Icehouse, but at this moment the tool is only interesting for Icehouse (and Havana), although I believe it could be extended to clean up orphaned resources, or any other cleanup tasks; in that case it could make sense for it to be available for K-J-I.
>>
>> As a reference, I'm leaving links to outputs from the tool [4][5].
>>
>> Looking forward to getting some feedback,
>> Miguel Ángel.
>>
>> [1] https://review.openstack.org/#/c/111876/ security group rpc refactor
>> [2] https://review.openstack.org/#/c/111877/ ipset support
>> [3] https://github.com/mangelajo/neutrontool
>> [4] http://paste.openstack.org/show/123519/
>> [5] http://paste.openstack.org/show/123525/

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] [stable] Tool to aid in scalability problems mitigation.
On 24/10/14 11:56, Miguel Angel Ajo Pelayo wrote:
> ----- Original Message -----
>> Hi Miguel,
>>
>> while we'd need to hear from the stable team, I think it's not such a bad idea to make this tool available to users of pre-Juno OpenStack releases.

It's a great idea actually. It's great when code that emerged from real-life downstream support cases eventually flows upstream for all operators' benefit (and not just those who pay huge money for commercial service).

>> As far as upstream repos are concerned, I don't know if this tool violates the criteria for stable branches. Even if it would be a rather large change for stable/icehouse, it is pretty much orthogonal to the existing code, so it could be ok. However, please note that stable/havana has now reached its EOL, so there will be no more stable releases for it.
>
> Sure, I was mentioning Havana as affected, but I understand it's already EOL upstream (U/S); downstream (D/S) distributions would always be free to backport, especially an orthogonal change like this.
>
> About stable/icehouse, I'd like to hear from the stable maintainers.

I'm for inclusion of the tool in the main neutron package. Though it's possible to publish it on PyPI as a separate package, I would rather apply a formal review process to it, plus reduce packaging efforts for distributions (and myself).

The tool may later be expanded with other useful operator hooks, so I'm for inclusion of the tool in master and backporting it to all supported branches.

Though the official stable maintainership rules state that 'New features' are a no-go for stable branches [1], I think they should not apply in this case, since the tool does not touch production code in any way and just provides a way to heal security groups on operator demand. Also, rules are made to be broken. ;)

Quoting the same document, "Proposed backports breaking any of above guidelines can be discussed as exception requests on openstack-stable-maint list where stable-maint team will try to reach consensus."

Operators should be happier if we ship such a tool as part of a neutron release and not as another third-party tool from PyPI of potentially unsafe origin.

BTW I wonder whether the tool can be useful for Juno+ setups too. Though we mostly mitigated the problem with the RPC interface rework and ipset, some operators may still hit limitations that could be worked around by optimizing their rules.

Also, I think the idea of having a tool with miscellaneous operator hooks in the master tree is quite interesting. I would still recommend pushing it to master and then backporting to stable branches. That would also help to get more review attention from cores than stable branch requests usually receive. ;)

[1]: https://wiki.openstack.org/wiki/StableBranch#Appropriate_Fixes

>> The orthogonal nature of this tool, however, also makes the case for making it widely available on PyPI. I think it should be ok to describe the scalability issue in the official OpenStack Icehouse docs and point to this tool for mitigation.
>
> Yes, of course, I consider that a second option. My point is that direct upstream review would result in better quality code, could certainly spot hidden bugs, and would increase testing quality. Making it available via the standard neutron repository also reduces packaging time across distributions.
>
> Thanks for the feedback!
>
>> Salvatore
>>
>> On 23 October 2014 14:03, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote:
>>
>>> Recently, we have identified clients with problems due to the bad scalability of security groups in Havana and Icehouse, which was addressed during Juno in [1] and [2].
>>>
>>> This situation is identified by blinking agents (going UP/DOWN), high AMQP load, high neutron-server load, and timeouts from openvswitch agents when calling security_group_rules_for_devices on neutron-server.
>>>
>>> Backporting [1] involves many dependent patches related to the general RPC refactor in neutron (which modifies all plugins), and subsequent ones fixing a few bugs. Sounds risky to me. [2] introduces new features and depends on features which aren't available on all systems.
>>>
>>> To remediate this on production systems, I wrote a quick tool to help with reporting security groups and mitigating the problem by writing almost-equivalent rules [3].
>>>
>>> We believe this tool would be better off available to the wider community, and under better review and testing, and, since it doesn't modify any behavior or actual code in neutron, I'd like to propose it for inclusion into, at least, the Icehouse stable branch, where it's more relevant.
>>>
>>> I know the usual way is to go master-Juno-Icehouse, but at this moment the tool is only interesting for Icehouse (and Havana), although I believe it could be extended to clean up orphaned resources, or any other cleanup tasks; in that case it could make sense for it to be available for K-J-I.
Re: [openstack-dev] [neutron] [stable] Tool to aid in scalability problems mitigation.
Thanks for your feedback too, Ihar; comments inline.

----- Original Message -----
> On 24/10/14 11:56, Miguel Angel Ajo Pelayo wrote:
>> ----- Original Message -----
>>> Hi Miguel,
>>>
>>> while we'd need to hear from the stable team, I think it's not such a bad idea to make this tool available to users of pre-Juno OpenStack releases.
>
> It's a great idea actually. It's great when code that emerged from real-life downstream support cases eventually flows upstream for all operators' benefit (and not just those who pay huge money for commercial service).
>
>>> As far as upstream repos are concerned, I don't know if this tool violates the criteria for stable branches. Even if it would be a rather large change for stable/icehouse, it is pretty much orthogonal to the existing code, so it could be ok. However, please note that stable/havana has now reached its EOL, so there will be no more stable releases for it.
>>
>> Sure, I was mentioning Havana as affected, but I understand it's already EOL upstream (U/S); downstream (D/S) distributions would always be free to backport, especially an orthogonal change like this.
>>
>> About stable/icehouse, I'd like to hear from the stable maintainers.
>
> I'm for inclusion of the tool in the main neutron package. Though it's possible to publish it on PyPI as a separate package, I would rather apply a formal review process to it, plus reduce packaging efforts for distributions (and myself).
>
> The tool may later be expanded with other useful operator hooks, so I'm for inclusion of the tool in master and backporting it to all supported branches.
>
> Though the official stable maintainership rules state that 'New features' are a no-go for stable branches [1], I think they should not apply in this case, since the tool does not touch production code in any way and just provides a way to heal security groups on operator demand. Also, rules are made to be broken. ;)
>
> Quoting the same document, "Proposed backports breaking any of above guidelines can be discussed as exception requests on openstack-stable-maint list where stable-maint team will try to reach consensus."
>
> Operators should be happier if we ship such a tool as part of a neutron release and not as another third-party tool from PyPI of potentially unsafe origin.
>
> BTW I wonder whether the tool can be useful for Juno+ setups too. Though we mostly mitigated the problem with the RPC interface rework and ipset, some operators may still hit limitations that could be worked around by optimizing their rules.
>
> Also, I think the idea of having a tool with miscellaneous operator hooks in the master tree is quite interesting. I would still recommend pushing it to master and then backporting to stable branches. That would also help to get more review attention from cores than stable branch requests usually receive. ;)

I believe the tool could also be expanded to report on and, equally, generate scripts to clean up orphaned resources. Those appear when you remove an instance and the port is not deleted, or you delete a tenant but its resources are kept, etc. I know there are efforts to do proper cleanup when tenants are deleted, but still, I see production databases plagued with orphaned resources.

> [1]: https://wiki.openstack.org/wiki/StableBranch#Appropriate_Fixes
>
>>> The orthogonal nature of this tool, however, also makes the case for making it widely available on PyPI. I think it should be ok to describe the scalability issue in the official OpenStack Icehouse docs and point to this tool for mitigation.
>>
>> Yes, of course, I consider that a second option. My point is that direct upstream review would result in better quality code, could certainly spot hidden bugs, and would increase testing quality. Making it available via the standard neutron repository also reduces packaging time across distributions.
>>
>> Thanks for the feedback!
>>
>>> Salvatore
>>>
>>> On 23 October 2014 14:03, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote:
>>>
>>>> Recently, we have identified clients with problems due to the bad scalability of security groups in Havana and Icehouse, which was addressed during Juno in [1] and [2].
>>>>
>>>> This situation is identified by blinking agents (going UP/DOWN), high AMQP load, high neutron-server load, and timeouts from openvswitch agents when calling security_group_rules_for_devices on neutron-server.
>>>>
>>>> Backporting [1] involves many dependent patches related to the general RPC refactor in neutron (which modifies all plugins), and subsequent ones fixing a few bugs. Sounds risky to me. [2] introduces new features and depends on features which aren't available on all systems.
>>>>
>>>> To remediate this on production systems, I wrote a quick tool to help with reporting security groups and mitigating the problem by writing almost-equivalent rules
Re: [openstack-dev] [neutron] [stable] Tool to aid in scalability problems mitigation.
Hi Miguel,

while we'd need to hear from the stable team, I think it's not such a bad idea to make this tool available to users of pre-Juno OpenStack releases.

As far as upstream repos are concerned, I don't know if this tool violates the criteria for stable branches. Even if it would be a rather large change for stable/icehouse, it is pretty much orthogonal to the existing code, so it could be ok. However, please note that stable/havana has now reached its EOL, so there will be no more stable releases for it.

The orthogonal nature of this tool, however, also makes the case for making it widely available on PyPI. I think it should be ok to describe the scalability issue in the official OpenStack Icehouse docs and point to this tool for mitigation.

Salvatore

On 23 October 2014 14:03, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote:

> Recently, we have identified clients with problems due to the bad scalability of security groups in Havana and Icehouse, which was addressed during Juno in [1] and [2].
>
> This situation is identified by blinking agents (going UP/DOWN), high AMQP load, high neutron-server load, and timeouts from openvswitch agents when calling security_group_rules_for_devices on neutron-server.
>
> Backporting [1] involves many dependent patches related to the general RPC refactor in neutron (which modifies all plugins), and subsequent ones fixing a few bugs. Sounds risky to me. [2] introduces new features and depends on features which aren't available on all systems.
>
> To remediate this on production systems, I wrote a quick tool to help with reporting security groups and mitigating the problem by writing almost-equivalent rules [3].
>
> We believe this tool would be better off available to the wider community, and under better review and testing, and, since it doesn't modify any behavior or actual code in neutron, I'd like to propose it for inclusion into, at least, the Icehouse stable branch, where it's more relevant.
>
> I know the usual way is to go master-Juno-Icehouse, but at this moment the tool is only interesting for Icehouse (and Havana), although I believe it could be extended to clean up orphaned resources, or any other cleanup tasks; in that case it could make sense for it to be available for K-J-I.
>
> As a reference, I'm leaving links to outputs from the tool [4][5].
>
> Looking forward to getting some feedback,
> Miguel Ángel.
>
> [1] https://review.openstack.org/#/c/111876/ security group rpc refactor
> [2] https://review.openstack.org/#/c/111877/ ipset support
> [3] https://github.com/mangelajo/neutrontool
> [4] http://paste.openstack.org/show/123519/
> [5] http://paste.openstack.org/show/123525/
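
The thread describes the tool only as "reporting security groups and mitigating the problem by writing almost-equivalent rules" and does not say how the rewrite is done. To make that phrase concrete, here is a minimal sketch under one assumption: that the costly rules are the ones pointing at a remote security group, and that an operator can name CIDRs covering that group's members. This is not taken from neutrontool; the function names and the group_cidrs mapping are hypothetical. Only the rule fields (security_group_id, remote_group_id, remote_ip_prefix) follow Neutron's security group rule attributes. The result is "almost" equivalent because a CIDR can admit addresses that are not members of the group.

```python
import ipaddress
from collections import Counter


def report_remote_group_rules(rules):
    """Count, per security group, the rules that reference another group."""
    counts = Counter()
    for rule in rules:
        if rule.get("remote_group_id"):
            counts[rule["security_group_id"]] += 1
    return dict(counts)


def to_cidr_rules(rules, group_cidrs):
    """Return a new rule list with remote-group references replaced by CIDRs.

    group_cidrs (hypothetical input, written by the operator) maps a
    security group id to the CIDRs believed to cover that group's members.
    Rules whose remote group has no mapping are copied unchanged.
    """
    result = []
    for rule in rules:
        remote = rule.get("remote_group_id")
        cidrs = group_cidrs.get(remote) if remote else None
        if not cidrs:
            result.append(dict(rule))
            continue
        for cidr in cidrs:
            # ip_network raises ValueError on a malformed CIDR, so a bad
            # mapping fails loudly instead of producing a bad rule.
            network = ipaddress.ip_network(cidr, strict=True)
            new_rule = dict(rule)
            new_rule["remote_group_id"] = None
            new_rule["remote_ip_prefix"] = str(network)
            result.append(new_rule)
    return result
```

The sketch works on plain dictionaries and never talks to neutron, which matches the "doesn't modify any behavior or actual code in neutron" property stressed in the thread. It does not check that a CIDR's IP version matches the rule's ethertype; a real tool would need to.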