Re: [openstack-dev] [neutron] [stable] Tool to aid in scalability problems mitigation.

2014-10-24 Thread Miguel Angel Ajo Pelayo


- Original Message -
 Hi Miguel,
 
 while we'd need to hear from the stable team, I think it's not such a bad
 idea to make this tool available to users of pre-juno openstack releases.
 As far as upstream repos are concerned, I don't know if this tool violates
 the criteria for stable branches. Even if it would be a rather large change
 for stable/icehouse, it is pretty much orthogonal to the existing code, so
 it could be ok. However, please note that stable/havana has now reached its
 EOL, so there will be no more stable release for it.

Sure, I mentioned Havana as affected, but I understand it's already
EOL upstream (U/S). Downstream (D/S) distributions would always be free to
backport, especially for an orthogonal change like this.

About stable/icehouse, I'd like to hear from the stable maintainers.

 
 The orthogonal nature of this tool however also make the case for making it
 widely available on pypi. I think it should be ok to describe the
 scalability issue in the official OpenStack Icehouse docs and point out to
 this tool for mitigation.

Yes, of course, I consider that a second option. My point is that
direct upstream review would result in better-quality code, would be more
likely to spot hidden bugs, and would improve testing.

Making the tool available via the standard neutron repository also reduces
packaging time across distributions.


Thanks for the feedback!

 
 Salvatore
 
 On 23 October 2014 14:03, Miguel Angel Ajo Pelayo  mangel...@redhat.com 
 wrote:
 
 
 
 
 Recently, we have identified clients with problems due to the
 bad scalability of security groups in Havana and Icehouse, that
 was addressed during juno here [1] [2]
 
 This situation is identified by blinking agents (going UP/DOWN),
 high AMQP load, high neutron-server load, and timeout from openvswitch
 agents when trying to contact neutron-server
 security_group_rules_for_devices.
 
 Doing a [1] backport involves many dependent patches related
 to the general RPC refactor in neutron (which modifies all plugins),
 and subsequent ones fixing a few bugs. Sounds risky to me. [2] Introduces
 new features and it's dependent on features which aren't available on
 all systems.
 
 To remediate this on production systems, I wrote a quick tool
 to help on reporting security groups and mitigating the problem
 by writing almost-equivalent rules [3].
 
 We believe this tool would be better available to the wider community,
 and under better review and testing, and, since it doesn't modify any
 behavior or actual code in neutron, I'd like to propose it for inclusion
 into, at least, Icehouse stable branch where it's more relevant.
 
 I know the usual way is to go master-Juno-Icehouse, but at this moment
 the tool is only interesting for Icehouse (and Havana), although I believe
 it could be extended to cleanup orphaned resources, or any other cleanup
 tasks, in that case it could make sense to be available for K-J-I.
 
 As a reference, I'm leaving links to outputs from the tool [4][5]
 
 Looking forward to get some feedback,
 Miguel Ángel.
 
 
 [1] https://review.openstack.org/#/c/111876/ security group rpc refactor
 [2] https://review.openstack.org/#/c/111877/ ipset support
 [3] https://github.com/mangelajo/neutrontool
 [4] http://paste.openstack.org/show/123519/
 [5] http://paste.openstack.org/show/123525/
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 


Re: [openstack-dev] [neutron] [stable] Tool to aid in scalability problems mitigation.

2014-10-24 Thread Ihar Hrachyshka

On 24/10/14 11:56, Miguel Angel Ajo Pelayo wrote:
 
 
 - Original Message -
 Hi Miguel,
 
 while we'd need to hear from the stable team, I think it's not
 such a bad idea to make this tool available to users of pre-juno
 openstack releases.

It's a great idea actually. It's great when code that emerged from
real-life downstream support cases eventually flows upstream for the
benefit of all operators (and not just those who pay huge money for
commercial service).

 As far as upstream repos are concerned, I don't know if this tool
 violates the criteria for stable branches. Even if it would be a
 rather large change for stable/icehouse, it is pretty much
 orthogonal to the existing code, so it could be ok. However,
 please note that stable/havana has now reached its EOL, so there
 will be no more stable release for it.
 
 Sure, I was mentioning havana as affected, but I understand it's
 already under U/S EOL, D/S distributions would always be free to
 backport, specially on an orthogonal change like this.
 
 About stable/icehouse, I'd like to hear from the stable
 maintainers.

I'm for inclusion of the tool in the main neutron package. Though it's
possible to publish it on pypi as a separate package, I would rather
apply the formal review process to it, and reduce packaging effort for
distributions (and myself). The tool may later be expanded with other
useful operator hooks, so I'm for including it in master and
backporting it to all supported branches.

Though the official stable maintainership rules state that 'New features'
are a no-go for stable branches [1], I think they should not apply in this
case, since the tool does not touch production code in any way and just
provides a way to heal security groups on operator demand. Also, rules
are made to be broken. ;) Quoting the same document: "Proposed backports
breaking any of above guidelines can be discussed as exception
requests on openstack-stable-maint list where stable-maint team will
try to reach consensus."

Operators should be happier if we ship such a tool as part of a
neutron release rather than as another third-party tool from pypi of
potentially unsafe origin.

BTW I wonder whether the tool can be useful for Juno+ setups too.
Though we mostly mitigated the problem with the RPC interface rework and
ipset, some operators may still hit limitations that could be
worked around by optimizing their rules. Also, I think the idea of
having a tool with miscellaneous operator hooks in the master tree is
quite interesting. I would still recommend pushing it to
master and then backporting to stable branches. That would also help
to get more review attention from cores than stable branch requests
usually receive. ;)
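
To make "optimizing their rules" a bit more concrete, here is a minimal
sketch of one possible optimization: merging rules that differ only in
their CIDR. This is an illustration only, not what the tool in this thread
does; the rule dictionaries and their keys (direction, protocol, port_min,
port_max, cidr) are hypothetical, and only Python's standard ipaddress
module is used.

```python
import ipaddress
from collections import defaultdict


def collapse_cidr_rules(rules):
    """Merge rules that are identical except for their CIDR.

    Each rule is a plain dict with hypothetical keys:
    direction, protocol, port_min, port_max, cidr.
    """
    grouped = defaultdict(list)
    for rule in rules:
        # strict=False tolerates host bits, e.g. "10.0.0.5/24" -> 10.0.0.0/24.
        net = ipaddress.ip_network(rule["cidr"], strict=False)
        # Keep IPv4 and IPv6 apart: collapse_addresses rejects mixed versions.
        key = (rule["direction"], rule["protocol"],
               rule["port_min"], rule["port_max"], net.version)
        grouped[key].append(net)

    collapsed = []
    for (direction, protocol, port_min, port_max, _ver), nets in grouped.items():
        # collapse_addresses merges adjacent networks and drops subsumed ones.
        for net in ipaddress.collapse_addresses(nets):
            collapsed.append({
                "direction": direction,
                "protocol": protocol,
                "port_min": port_min,
                "port_max": port_max,
                "cidr": str(net),
            })
    return collapsed
```

For example, three ingress tcp/22 rules for 10.0.0.0/25, 10.0.0.128/25 and
10.0.0.5/32 would come out as a single rule for 10.0.0.0/24, while a rule
on a different port stays separate.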

[1]: https://wiki.openstack.org/wiki/StableBranch#Appropriate_Fixes

 
 
 The orthogonal nature of this tool however also make the case for
 making it widely available on pypi. I think it should be ok to
 describe the scalability issue in the official OpenStack Icehouse
 docs and point out to this tool for mitigation.
 
 Yes, of course, I consider that as a second option, my point here
 is that direct upstream review time would result in better quality
 code here, and could certainly spot any hidden bugs, and increase
 testing quality.
 
 It also reduces packaging time all across distributions making it
 available via the standard neutron repository.
 
 
 Thanks for the feedback!,
 
 
 Salvatore
 
 On 23 October 2014 14:03, Miguel Angel Ajo Pelayo 
 mangel...@redhat.com  wrote:
 
 
 
 
 Recently, we have identified clients with problems due to the bad
 scalability of security groups in Havana and Icehouse, that was
 addressed during juno here [1] [2]
 
 This situation is identified by blinking agents (going UP/DOWN), 
 high AMQP load, high neutron-server load, and timeout from
 openvswitch agents when trying to contact neutron-server 
 security_group_rules_for_devices.
 
 Doing a [1] backport involves many dependent patches related to
 the general RPC refactor in neutron (which modifies all
 plugins), and subsequent ones fixing a few bugs. Sounds risky to
 me. [2] Introduces new features and it's dependent on features
 which aren't available on all systems.
 
 To remediate this on production systems, I wrote a quick tool to
 help on reporting security groups and mitigating the problem by
 writing almost-equivalent rules [3].
 
 We believe this tool would be better available to the wider
 community, and under better review and testing, and, since it
 doesn't modify any behavior or actual code in neutron, I'd like
 to propose it for inclusion into, at least, Icehouse stable
 branch where it's more relevant.
 
 I know the usual way is to go master-Juno-Icehouse, but at this
 moment the tool is only interesting for Icehouse (and Havana),
 although I believe it could be extended to cleanup orphaned
 resources, or any other cleanup tasks, in that case it could make
 sense to be available for K-J-I.

Re: [openstack-dev] [neutron] [stable] Tool to aid in scalability problems mitigation.

2014-10-24 Thread Miguel Angel Ajo Pelayo
Thanks for your feedback too Ihar, comments inline.

- Original Message -
 
 On 24/10/14 11:56, Miguel Angel Ajo Pelayo wrote:
  
  
  - Original Message -
  Hi Miguel,
  
  while we'd need to hear from the stable team, I think it's not
  such a bad idea to make this tool available to users of pre-juno
  openstack releases.
 
 It's a great idea actually. It's great when code emerged from real
 life downstream support cases eventually flow up to upstream for all
 operator's benefit (and not just those who pay huge money for
 commercial service).
 
  As far as upstream repos are concerned, I don't know if this tool
  violates the criteria for stable branches. Even if it would be a
  rather large change for stable/icehouse, it is pretty much
  orthogonal to the existing code, so it could be ok. However,
  please note that stable/havana has now reached its EOL, so there
  will be no more stable release for it.
  
  Sure, I was mentioning havana as affected, but I understand it's
  already under U/S EOL, D/S distributions would always be free to
  backport, specially on an orthogonal change like this.
  
  About stable/icehouse, I'd like to hear from the stable
  maintainers.
 
 I'm for inclusion of the tool in the main neutron package. Though it's
 possible to publish it on pypi as a separate package, I would better
 apply formal review process to it, plus reduce packaging efforts for
 distributions (and myself). The tool may be later expanded for other
 useful operator hooks, so I'm for inclusion of the tool in master and
 backporting it back to all supported branches.
 
 Though official stable maintainership rules state that 'New features'
 are no-go for stable branch [1], I think they should not apply in this
 case since the tool does not touch production code in any way and just
 provides a way to heal security groups on operator demand. Also, rules
 are to break them. ;) Quoting the same document, Proposed backports
 breaking any of above guidelines can be discussed as exception
 requests on openstack-stable-maint list where stable-maint team will
 try to reach consensus.
 
 Operators should be more happy if we ship such a tool as part of
 neutron release and not as another third-party tool from pypi of
 potentially unsafe origin.
 
 BTW I wonder whether the tool can be useful for Juno+ setups too.
 Though we mostly mitigated the problem by RPC interface rework and
 ipset, some operators may still hit some limitation that could be
 workarounded by optimizing their rules. Also, I think the idea of
 having a tool with miscellaneous operator hooks in the master tree is
 quite interesting. I would recommend to still go with pushing it to
 master and then backporting to stable branches. That would also help
 to get more review attention from cores than stable branch requests
 usually receive. ;)


I believe the tool could also be expanded to report orphaned resources and,
likewise, generate scripts to clean them up. These appear when
you remove an instance and its port is not deleted, or when you delete a
tenant but its resources are kept, etc.

I know there are efforts to do proper cleanup when tenants are deleted,
but still, I see production databases plagued with orphaned resources.
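
As a rough sketch of what such a report could look like (an illustration
under assumptions, not the tool's code): the functions below work on plain
dictionaries that are assumed to have been fetched elsewhere, with port
fields named after the Neutron port attributes id, tenant_id, device_owner
and device_id. The function names are hypothetical, and the generated
script is only a list of "neutron port-delete" lines meant for an operator
to review before running anything.

```python
def find_orphaned_ports(ports, live_instance_ids, live_tenant_ids):
    """Return (port_id, reasons) for ports whose owner no longer exists.

    ports: list of dicts with id, tenant_id, device_owner, device_id.
    live_instance_ids / live_tenant_ids: sets of IDs that still exist.
    """
    orphans = []
    for port in ports:
        reasons = []
        if port["tenant_id"] not in live_tenant_ids:
            reasons.append("tenant deleted")
        # Only compute ports are checked against the instance list here.
        is_compute = port.get("device_owner", "").startswith("compute:")
        if is_compute and port.get("device_id") not in live_instance_ids:
            reasons.append("instance deleted")
        if reasons:
            orphans.append((port["id"], reasons))
    return orphans


def cleanup_script(orphans):
    """Render one CLI line per orphaned port, for manual review."""
    return ["neutron port-delete %s" % port_id for port_id, _ in orphans]
```

Keeping the report and the script generation separate means the report can
be run safely on a production system, and the deletion commands are only
ever applied after someone has read them.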

 
 [1]: https://wiki.openstack.org/wiki/StableBranch#Appropriate_Fixes
 
  
  
  The orthogonal nature of this tool however also make the case for
  making it widely available on pypi. I think it should be ok to
  describe the scalability issue in the official OpenStack Icehouse
  docs and point out to this tool for mitigation.
  
  Yes, of course, I consider that as a second option, my point here
  is that direct upstream review time would result in better quality
  code here, and could certainly spot any hidden bugs, and increase
  testing quality.
  
  It also reduces packaging time all across distributions making it
  available via the standard neutron repository.
  
  
  Thanks for the feedback!,
  
  
  Salvatore
  
  On 23 October 2014 14:03, Miguel Angel Ajo Pelayo 
  mangel...@redhat.com  wrote:
  
  
  
  
  Recently, we have identified clients with problems due to the bad
  scalability of security groups in Havana and Icehouse, that was
  addressed during juno here [1] [2]
  
  This situation is identified by blinking agents (going UP/DOWN),
  high AMQP load, high neutron-server load, and timeout from
  openvswitch agents when trying to contact neutron-server
  security_group_rules_for_devices.
  
  Doing a [1] backport involves many dependent patches related to
  the general RPC refactor in neutron (which modifies all
  plugins), and subsequent ones fixing a few bugs. Sounds risky to
  me. [2] Introduces new features and it's dependent on features
  which aren't available on all systems.
  
  To remediate this on production systems, I wrote a quick tool to
  help on reporting security groups and mitigating the problem by
  writing almost-equivalent rules 

Re: [openstack-dev] [neutron] [stable] Tool to aid in scalability problems mitigation.

2014-10-23 Thread Salvatore Orlando
Hi Miguel,

while we'd need to hear from the stable team, I think it's not such a bad
idea to make this tool available to users of pre-juno openstack releases.
As far as upstream repos are concerned, I don't know if this tool violates
the criteria for stable branches. Even if it would be a rather large change
for stable/icehouse, it is pretty much orthogonal to the existing code, so
it could be ok. However, please note that stable/havana has now reached its
EOL, so there will be no more stable release for it.

The orthogonal nature of this tool, however, also makes the case for making
it widely available on pypi. I think it should be ok to describe the
scalability issue in the official OpenStack Icehouse docs and point to
this tool for mitigation.

Salvatore

On 23 October 2014 14:03, Miguel Angel Ajo Pelayo mangel...@redhat.com
wrote:



 Recently, we have identified clients with problems due to the
 bad scalability of security groups in Havana and Icehouse, that
 was addressed during juno here [1] [2]

 This situation is identified by blinking agents (going UP/DOWN),
 high AMQP load, high neutron-server load, and timeout from openvswitch
 agents when trying to contact neutron-server
 security_group_rules_for_devices.

 Doing a [1] backport involves many dependent patches related
 to the general RPC refactor in neutron (which modifies all plugins),
 and subsequent ones fixing a few bugs. Sounds risky to me. [2] Introduces
 new features and it's dependent on features which aren't available on
 all systems.

 To remediate this on production systems, I wrote a quick tool
 to help on reporting security groups and mitigating the problem
 by writing almost-equivalent rules [3].

 We believe this tool would be better available to the wider community,
 and under better review and testing, and, since it doesn't modify any
 behavior or actual code in neutron, I'd like to propose it for inclusion
 into, at least, Icehouse stable branch where it's more relevant.

 I know the usual way is to go master-Juno-Icehouse, but at this moment
 the tool is only interesting for Icehouse (and Havana), although I believe
 it could be extended to cleanup orphaned resources, or any other cleanup
 tasks, in that case it could make sense to be available for K-J-I.

 As a reference, I'm leaving links to outputs from the tool [4][5]

 Looking forward to getting some feedback,
 Miguel Ángel.


 [1] https://review.openstack.org/#/c/111876/ security group rpc refactor
 [2] https://review.openstack.org/#/c/111877/ ipset support
 [3] https://github.com/mangelajo/neutrontool
 [4] http://paste.openstack.org/show/123519/
 [5] http://paste.openstack.org/show/123525/
