Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 12/09/14 19:08, Doug Hellmann wrote:
On Sep 12, 2014, at 1:03 PM, Ihar Hrachyshka ihrac...@redhat.com wrote:
On 12/09/14 17:30, Mike Bayer wrote:
On Sep 12, 2014, at 10:40 AM, Ihar Hrachyshka ihrac...@redhat.com wrote:
On 12/09/14 16:33, Mike Bayer wrote:

I agree with this, changing the MySQL driver now is not an option.

That was not the proposal. The proposal was to introduce support for running against something other than MySQLdb, plus a gate job for that alternative. The next cycle was supposed to do thorough regression testing, benchmarking, etc. to decide whether we're OK to recommend that alternative to users.

ah, well that is a great idea. But we can have that throughout Kilo anyway, why not?

Sure, it's not the end of the world. We'll just need to postpone the work till RC1 (= opening of master for new stuff), pass spec bureaucracy (reapplying for Kilo)... That's some burden, but not a tragedy. The only thing that I'm really sad about is that Juno users won't be able to try out that driver on their setup just to see how it works, so it narrows the testing base to the gate, while we could have gotten some valuable deployment feedback in Juno already.

It's all experimental, right? And implemented in libraries? So those users could update oslo.db and sqlalchemy-migrate and test the results under Juno.

oslo.db is already bumped to the version that includes all the fixes needed. As for sqlalchemy-migrate, we may try to work on a fix for the library that silently drops those COMMIT statements in SQL scripts. That would solve the problem without touching any migration code in nova, glance, or cinder. This is the piece that is currently missing to run Juno with the alternative driver. Also, as Angus said, we can already run migrations on mysqldb and then switch the driver for testing, without any of the changes. I'll work on making sure it's available to check out with Juno pieces in addition to Kilo.

/Ihar
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 12/09/14 18:00, Mike Bayer wrote:
On Sep 12, 2014, at 11:56 AM, Johannes Erdfelt johan...@erdfelt.com wrote:
On Fri, Sep 12, 2014, Doug Hellmann d...@doughellmann.com wrote:

I don't think we will want to retroactively change the migration scripts (that's not something we generally like to do),

We don't allow semantic changes to migration scripts since people who have already run them won't get those changes. However, we haven't been shy about fixing bugs that prevent a migration script from running (which this change would probably fall into).

fortunately BEGIN/COMMIT are not semantic directives. The migrations semantically indicated by the script are unaffected in any way by these run-environment settings.

so we should look at changes needed to make sqlalchemy-migrate deal with them (by ignoring them, or working around the errors, or whatever). That said, I agree that sqlalchemy-migrate shouldn't be changing in a non-backwards-compatible way.

on the sqlalchemy-migrate side, the handling of its ill-conceived "sql script" feature can be further mitigated here by parsing for the "COMMIT" line when it breaks out the SQL and ignoring it; I'd favor that it emits a warning also.

I went with ignoring COMMIT specifically in SQL scripts: https://review.openstack.org/#/c/121517/ Though we could also strip other transaction-management statements from those scripts, like ROLLBACK, they are highly unlikely to occur in migration code, so the patch leaves them alone.

/Ihar
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
Some updates/concerns/questions. The status of introducing a new driver to the gate is:

- all the patches for mysql-connector are merged in all projects;
- all devstack patches to support switching the driver are merged;
- the new sqlalchemy-migrate library is released;
- the version bump is *not* yet done;
- the package is still *not* yet published on PyPI;
- the new gate job is *not* yet introduced.

The new sqlalchemy-migrate release introduced unit test failures in three projects: nova, cinder, glance. On the technical side of the failure: my understanding is that the projects that started to fail assume too much about how those SQL scripts are executed. They assume the scripts are executed in one go, and they assume they need to open and commit a transaction on their own. I don't think this is something to be fixed in sqlalchemy-migrate itself. Instead, simple removal of those 'BEGIN TRANSACTION; ... COMMIT;' statements should just work, and looks like a sane thing to do anyway. I've proposed the following patches for all three projects to handle it [1].

That said, those failures were solved by pinning the version of the library in openstack/requirements and those projects. This is in major contrast to how we handled the new testtools release just several weeks ago, when the problem was solved by fixing the three affected projects because of their incorrect usage of tearDown/setUp methods.

Even more so, those failures seem to have triggered the resolution to move the enable-mysql-connector oslo spec to Kilo, while the library version bump is the *only* change missing code-wise (we will also need a gate job description, but that doesn't touch the codebase at all). The resolution looks too prompt and ungrounded to me. Is it really that gate failure for three projects that resulted in it, or are there some other hidden reasons behind it? Was it discussed anywhere? If so, I wasn't given a chance to participate in that discussion; I suspect another supporter of the spec (Angus Lees) was not involved either.

By not allowing those last pieces of the spec in this cycle, we just postpone the start of any realistic testing of the feature for another half a year. Why do we block the new sqlalchemy-migrate and the spec for another cycle instead of fixing the affected projects with *primitive* patches like we did for the new testtools?

[1]: https://review.openstack.org/#/q/I10c58b3af75d3ab9153a8bbd2a539bf1577de328,n,z

/Ihar

On 09/07/14 13:17, Ihar Hrachyshka wrote:

Hi all,

Multiple projects are suffering from DB lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with MySQL servers. In essence, the problem is due to missing eventlet support in the mysqldb module: when a DB lock is encountered, the library does not yield to the next green thread (which would allow other threads to eventually release the lock); instead, it just blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix.

Neutron is one of the projects that suffers from those timeout errors a lot. Partly it's due to a lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno.

We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db folks, we now have a more or less clear view of the cause of the failures and of how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is an official MySQL client for Python and shows some (preliminary) good results in terms of performance.

I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, the switch is just a matter of several fixes to oslo.db that would enable full support for the new driver (already supported by SQLAlchemy), plus the 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service, based on the example for Neutron [2].

While this is a Neutron-specific proposal, there is an obvious wish to switch to the new library globally
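To make the 'connection' string change concrete, here is a minimal sketch of what switching the driver means at the SQLAlchemy level (the dialect names are SQLAlchemy's standard ones; the credentials, host, and database name are placeholders, and the drivers are assumed to be installed):

    from sqlalchemy import create_engine

    # Today's default: a bare "mysql://" URL resolves to the MySQLdb DBAPI.
    engine = create_engine("mysql://neutron:secret@127.0.0.1/neutron")

    # The proposed alternative: the same URL with an explicit driver qualifier.
    engine = create_engine(
        "mysql+mysqlconnector://neutron:secret@127.0.0.1/neutron")

    # PyMySQL, discussed elsewhere in this thread, is selected the same way:
    #   "mysql+pymysql://neutron:secret@127.0.0.1/neutron"

In a service configuration file this is just the [database] connection option, e.g. connection = mysql+mysqlconnector://neutron:secret@127.0.0.1/neutron.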
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 09/12/2014 06:41 AM, Ihar Hrachyshka wrote:

Some updates/concerns/questions. The status of introducing a new driver to the gate is:

- all the patches for mysql-connector are merged in all projects;
- all devstack patches to support switching the driver are merged;
- the new sqlalchemy-migrate library is released;
- the version bump is *not* yet done;
- the package is still *not* yet published on PyPI;
- the new gate job is *not* yet introduced.

[...] By not allowing those last pieces of the spec in this cycle, we just postpone the start of any realistic testing of the feature for another half a year. Why do we block the new sqlalchemy-migrate and the spec for another cycle instead of fixing the affected projects with *primitive* patches like we did for the new testtools?

Because we are in Feature Freeze. Now is the time for critical bug fixes only, as we start to stabilize the tree. Releasing dependent libraries that can cause breaks, for whatever reason, should be soundly avoided. If this was August, fine. But it's feature freeze.

-Sean

--
Sean Dague
http://dague.net
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 12/09/14 13:20, Sean Dague wrote:
On 09/12/2014 06:41 AM, Ihar Hrachyshka wrote:

[...] Why do we block the new sqlalchemy-migrate and the spec for another cycle instead of fixing the affected projects with *primitive* patches like we did for the new testtools?

Because we are in Feature Freeze. Now is the time for critical bug fixes only, as we start to stabilize the tree. Releasing dependent libraries that can cause breaks, for whatever reason, should be soundly avoided. If this was August, fine. But it's feature freeze.

I probably missed the fact that we are so strict now that we don't allow tiny missing bits to go in. In my defence, I was offline for around the last three weeks. I was also a bit misled: an oslo core approached me very recently about which remaining bits we needed to push before claiming the spec complete, and I assumed that meant we were free to complete the work this cycle. Otherwise, I wouldn't have pushed for the new library version in the first place. Anyway, I guess there is no way now to get the remaining bits into Juno, even if small, and we're doomed to postpone them to Kilo.

Thanks for the explanation,
/Ihar
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 2014-09-12 12:41:42 +0200 (+0200), Ihar Hrachyshka wrote:

[...] That said, those failures were solved by pinning the version of the library in openstack/requirements and those projects. This is in major contrast to how we handled the new testtools release just several weeks ago, when the problem was solved by fixing the three affected projects because of their incorrect usage of tearDown/setUp methods. [...]

This was of course different because it came during a period when integrated projects are supposed to be focusing on stabilizing what they have toward release, but our behavior was also somewhat altered because we needed to perform some immediate damage control. One of the side effects of the failure mode this sqlalchemy-migrate release induced was that each nova unit test run was generating ~0.5 GiB of log data, instantly overwhelming our test log analysis systems and flooding our artifact archive (both in terms of bandwidth and disk). The fastest way to stop this was to roll back what changed, for which the options were either to introduce an exclusionary version pin or to convince the library authors to release an even newer version tagged to the old one. We chose the first solution as it was more directly under the control of the infrastructure and nova core teams involved at that moment.

--
Jeremy Stanley
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Sep 12, 2014, at 7:20 AM, Sean Dague s...@dague.net wrote:

Because we are in Feature Freeze. Now is the time for critical bug fixes only, as we start to stabilize the tree. Releasing dependent libraries that can cause breaks, for whatever reason, should be soundly avoided. If this was August, fine. But it's feature freeze.

I agree with this; changing the MySQL driver now is not an option. That train has left the station. I think it's better we all take the whole Kilo cycle to get used to mysql-connector and its quirks before launching it on the world, as there will be many more.

However, for Kilo, I think those "COMMIT" phrases should be removed, and overall we need to make a very hard and fast rule that we *do not put multiple statements in an execute*. I've seen a bunch of these come through so far, and for some of them (more the in-Python ones) it seems like the underlying reason is a lack of understanding of what exactly a SQLAlchemy "Engine" is and what features it supports. So first, let me point folks to the documentation for this, which anyone writing code involving Engine objects should read first: http://docs.sqlalchemy.org/en/rel_0_9/core/connections.html

Key to this is that while Engine supports an ".execute()" method, in order to do anything that intends to work on a single connection and typically a single transaction, you procure a Connection and usually a Transaction from the Engine, most easily like this:

    with engine.begin() as conn:
        conn.execute(statement_1)
        conn.execute(statement_2)
        conn.execute(statement_3)
        # .. etc

Now let me apologize for the reason this misunderstanding exists in the first place: it's because in 2005 I put the ".execute()" convenience method on the Engine itself (well, in fact we didn't have the Engine/Connection dichotomy back then), and I also thought that "implicit execution", e.g. statement.execute(), would be a great idea. Tons of other people still think it's a great idea, and even though I've buried this whole thing in the docs, they still use it like candy... until they have the need to control the scope of connectivity. *Huge* mistake, it's my fault, but not something that can really be changed now.

Also, in 2005, Python didn't have context managers. So we have all kinds of klunky patterns like "trans = conn.begin()", kind of J2EE style, etc., but these days, the above pattern is your best bet when you want to invoke multiple statements. engine.execute() overall should just be avoided as it only leads to misunderstanding. When we all move all of our migrate stuff to Alembic, there won't be an Engine provided to a migration script; it will be a Connection to start with.
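For anyone who wants to try that pattern, here is a self-contained variant using an in-memory SQLite engine and throwaway statements so it runs anywhere (an illustration only, not code from any OpenStack project):

    from sqlalchemy import create_engine, text

    engine = create_engine("sqlite://")

    # All three statements share one Connection and one Transaction; the
    # context manager commits on success and rolls back on an exception.
    with engine.begin() as conn:
        conn.execute(text("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)"))
        conn.execute(text("INSERT INTO t (v) VALUES ('a')"))
        conn.execute(text("INSERT INTO t (v) VALUES ('b')"))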
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 12/09/14 16:33, Mike Bayer wrote:

I agree with this, changing the MySQL driver now is not an option.

That was not the proposal. The proposal was to introduce support for running against something other than MySQLdb, plus a gate job for that alternative. The next cycle was supposed to do thorough regression testing, benchmarking, etc. to decide whether we're OK to recommend that alternative to users.

/Ihar
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Sep 12, 2014, at 9:23 AM, Ihar Hrachyshka ihrac...@redhat.com wrote:
On 12/09/14 13:20, Sean Dague wrote:

[...] Because we are in Feature Freeze. Now is the time for critical bug fixes only, as we start to stabilize the tree. Releasing dependent libraries that can cause breaks, for whatever reason, should be soundly avoided. If this was August, fine. But it's feature freeze.

I probably missed the fact that we are so strict now that we don't allow tiny missing bits to go in. In my defence, I was offline for around the last three weeks. I was also a bit misled: an oslo core approached me very recently about which remaining bits we needed to push before claiming the spec complete, and I assumed that meant we were free to complete the work this cycle. Otherwise, I wouldn't have pushed for the new library version in the first place.

I suspect you're referring to me, there. I believed the work was ready to be wrapped up. I'm sorry my misunderstanding led to the issues.

Anyway, I guess there is no way now to get the remaining bits into Juno, even if small, and we're doomed to postpone them to Kilo.

I think we're only looking at a couple of weeks delay. During that time we can work on fixing the problem. I don't think we will want to retroactively change the migration scripts (that's not something we generally like to do), so we should look at changes needed to make sqlalchemy-migrate deal with them (by ignoring them, or working around the errors, or whatever).

Doug

Thanks for the explanation,
/Ihar
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Sep 12, 2014, at 10:40 AM, Ihar Hrachyshka ihrac...@redhat.com wrote:
On 12/09/14 16:33, Mike Bayer wrote:

I agree with this, changing the MySQL driver now is not an option.

That was not the proposal. The proposal was to introduce support for running against something other than MySQLdb, plus a gate job for that alternative. The next cycle was supposed to do thorough regression testing, benchmarking, etc. to decide whether we're OK to recommend that alternative to users.

ah, well that is a great idea. But we can have that throughout Kilo anyway, why not?
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Fri, Sep 12, 2014, Doug Hellmann d...@doughellmann.com wrote:

I don't think we will want to retroactively change the migration scripts (that's not something we generally like to do),

We don't allow semantic changes to migration scripts since people who have already run them won't get those changes. However, we haven't been shy about fixing bugs that prevent a migration script from running (which this change would probably fall into).

so we should look at changes needed to make sqlalchemy-migrate deal with them (by ignoring them, or working around the errors, or whatever).

That said, I agree that sqlalchemy-migrate shouldn't be changing in a non-backwards-compatible way.

JE
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Sep 12, 2014, at 11:56 AM, Johannes Erdfelt johan...@erdfelt.com wrote:
On Fri, Sep 12, 2014, Doug Hellmann d...@doughellmann.com wrote:

I don't think we will want to retroactively change the migration scripts (that's not something we generally like to do),

We don't allow semantic changes to migration scripts since people who have already run them won't get those changes. However, we haven't been shy about fixing bugs that prevent a migration script from running (which this change would probably fall into).

fortunately BEGIN/COMMIT are not semantic directives. The migrations semantically indicated by the script are unaffected in any way by these run-environment settings.

so we should look at changes needed to make sqlalchemy-migrate deal with them (by ignoring them, or working around the errors, or whatever). That said, I agree that sqlalchemy-migrate shouldn't be changing in a non-backwards-compatible way.

on the sqlalchemy-migrate side, the handling of its ill-conceived "sql script" feature can be further mitigated here by parsing for the "COMMIT" line when it breaks out the SQL and ignoring it; I'd favor that it emits a warning also.
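A rough sketch of the filtering being described (illustrative only; this is not the actual sqlalchemy-migrate change, and the function name is made up):

    import warnings

    def split_sql_script(sql):
        """Split a migration SQL script into statements, dropping bare
        transaction-control statements such as BEGIN/COMMIT."""
        statements = []
        for stmt in sql.split(";"):
            stmt = stmt.strip()
            if not stmt:
                continue
            if stmt.upper() in ("BEGIN", "BEGIN TRANSACTION", "COMMIT"):
                # The migration runner manages its own transaction, so an
                # in-script COMMIT would end it prematurely.
                warnings.warn("ignoring transaction statement: %s" % stmt)
                continue
            statements.append(stmt)
        return statements

    print(split_sql_script("BEGIN TRANSACTION; CREATE TABLE t (id INT); COMMIT;"))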
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 12/09/14 17:30, Mike Bayer wrote:
On Sep 12, 2014, at 10:40 AM, Ihar Hrachyshka ihrac...@redhat.com wrote:
On 12/09/14 16:33, Mike Bayer wrote:

I agree with this, changing the MySQL driver now is not an option.

That was not the proposal. The proposal was to introduce support for running against something other than MySQLdb, plus a gate job for that alternative. The next cycle was supposed to do thorough regression testing, benchmarking, etc. to decide whether we're OK to recommend that alternative to users.

ah, well that is a great idea. But we can have that throughout Kilo anyway, why not?

Sure, it's not the end of the world. We'll just need to postpone the work till RC1 (= opening of master for new stuff), pass spec bureaucracy (reapplying for Kilo)... That's some burden, but not a tragedy. The only thing that I'm really sad about is that Juno users won't be able to try out that driver on their setup just to see how it works, so it narrows the testing base to the gate, while we could have gotten some valuable deployment feedback in Juno already.
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Sep 12, 2014, at 1:03 PM, Ihar Hrachyshka ihrac...@redhat.com wrote:
On 12/09/14 17:30, Mike Bayer wrote:

[...] ah, well that is a great idea. But we can have that throughout Kilo anyway, why not?

Sure, it's not the end of the world. We'll just need to postpone the work till RC1 (= opening of master for new stuff), pass spec bureaucracy (reapplying for Kilo)... That's some burden, but not a tragedy. The only thing that I'm really sad about is that Juno users won't be able to try out that driver on their setup just to see how it works, so it narrows the testing base to the gate, while we could have gotten some valuable deployment feedback in Juno already.

It's all experimental, right? And implemented in libraries? So those users could update oslo.db and sqlalchemy-migrate and test the results under Juno.

Doug
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Fri, 12 Sep 2014 01:08:04 PM Doug Hellmann wrote:

[...] It's all experimental, right? And implemented in libraries? So those users could update oslo.db and sqlalchemy-migrate and test the results under Juno.

Note that it's also theoretically possible to run the migrations with mysqldb and the regular production service (post-migrate) with an alternate MySQL driver that won't deadlock eventlet... (i.e. there's no reason the MySQL driver choice needs to be universal and simultaneous).

I'm sad that we (as a project) still haven't been able to make this technically trivial fix - or even make it an option for testing - after the original problem was identified and the fix proposed 2.5+ months ago. I'm encouraged to see various meta-threads popping up discussing issues with our development model, and hopefully we can do better in future :(

--
- Gus
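As a sketch of that two-phase idea (the URLs and repository path are illustrative, and sqlalchemy-migrate's versioning API is assumed; this is not code from any project):

    from migrate.versioning import api as versioning_api

    # Phase 1: run the schema migrations with the long-established
    # MySQLdb driver, where eventlet-friendliness does not matter.
    versioning_api.upgrade(
        "mysql://nova:secret@127.0.0.1/nova",               # MySQLdb URL
        "/opt/stack/nova/nova/db/sqlalchemy/migrate_repo",  # illustrative path
    )

    # Phase 2: point the long-running eventlet-based service at the same
    # database through a driver that can yield on I/O, via its config file:
    #   connection = mysql+mysqlconnector://nova:secret@127.0.0.1/nova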
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Wed, 20 Aug 2014 05:03:51 PM Clark Boylan wrote:
On Mon, Aug 18, 2014, at 01:59 AM, Ihar Hrachyshka wrote:

[...] At the moment, we install MySQLdb from distro packages for devstack. The same applies to the new driver. It would still be great to see the package published on PyPI so that we can track its version requirements instead of relying on distros to package it properly. But I don't see it as a blocker. Also, we will probably be able to run with other drivers supported by SQLAlchemy once all the work is done.

So I got bored last night and decided to take a stab at making PyMySQL work since I was a proponent of it earlier. Thankfully it did just mostly work like I thought it would. https://review.openstack.org/#/c/115495/ is the WIP devstack change to test this out.

Thanks!

Postgres tests fail because it was applying the pymysql driver to the postgres connection string. We can clean this up later in devstack and/or devstack-gate depending on how we need to apply this stuff. Bashate failed because I had to monkeypatch in a fix for a ceilometer issue loading sqlalchemy drivers. The tempest neutron full job fails on one test occasionally. Not sure yet if that is the normal neutron-full failure mode or something new from PyMySQL. The regular tempest job passes just fine. There are also some DB-related errors in the logs that will need to be cleaned up, but overall it just works.

So I would like to repropose that we stop focusing all of this effort on the hard thing and use the easy thing if it meets our needs. We can continue to make alternatives work, but that is a different problem that we can solve at a different pace. I am not sure how to test the neutron thing that Gus was running into, though, so we should also check that really quickly.

TL;DR: pymysql passes my test case. I'm perfectly happy to move towards using mysql+pymysql in gate tests. (The various changes I've been submitting are to support _any_ non-default driver.) If anyone cares, my test case is in
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
Why pymysql over mysql-python?

Endre Karlson

On 21 Aug 2014 09:05, Angus Lees g...@inodes.org wrote:
On Wed, 20 Aug 2014 05:03:51 PM Clark Boylan wrote:

[...] So I got bored last night and decided to take a stab at making PyMySQL work since I was a proponent of it earlier. Thankfully it did just mostly work like I thought it would. https://review.openstack.org/#/c/115495/ is the WIP devstack change to test this out.

Thanks!

Postgres tests fail because it was applying the pymysql driver to the postgres connection string. We can clean this up later in devstack and/or devstack-gate depending on how we need to apply this stuff. Bashate failed because I had to monkeypatch in a fix for a ceilometer issue loading sqlalchemy drivers. The tempest neutron full job fails on one test occasionally. Not sure yet if that is the normal neutron-full failure mode or something new from PyMySQL. The regular tempest job passes just fine. There are also some DB-related errors in the logs that will need to be cleaned up, but overall it just works.

So I would like to repropose that we stop focusing all of this effort on the hard thing and use the easy thing if it meets our needs. We can continue to make alternatives work, but that is a different problem that we can solve at a different pace. I am not sure how to test the neutron thing that Gus was running into, though, so we should also check that really quickly.

TL;DR: pymysql passes my test case. I'm perfectly happy to move
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 21/08/14 02:03, Clark Boylan wrote:
On Mon, Aug 18, 2014, at 01:59 AM, Ihar Hrachyshka wrote:

[...] Also, we will probably be able to run with other drivers supported by SQLAlchemy once all the work is done.

So I got bored last night and decided to take a stab at making PyMySQL work since I was a proponent of it earlier. Thankfully it did just mostly work like I thought it would. https://review.openstack.org/#/c/115495/ is the WIP devstack change to test this out.

Great!

Postgres tests fail because it was applying the pymysql driver to the postgres connection string. [...] So I would like to repropose that we stop focusing all of this effort on the hard thing and use the easy thing if it meets our needs. We can continue to make alternatives work, but that is a different problem that we can solve at a different pace. I am not sure how to test the neutron thing that Gus was running into, though, so we should also check that really quickly.

In our patches throughout the projects, we're actually not focusing on any specific driver, even though the original spec is focused on MySQL Connector. I still think we should get MySQL Connector working in the gate in the very near future. The current progress can be tracked at: https://review.openstack.org/#/c/114207/

Also, the tests themselves don't seem to run any faster or slower than
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 21/08/14 09:42, Endre Karlson wrote:

Why pymysql over mysql-python?

http://specs.openstack.org/openstack/oslo-specs/specs/juno/enable-mysql-connector.html#problem-description
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Mon, Aug 18, 2014, at 01:59 AM, Ihar Hrachyshka wrote:

[...] At the moment, we install MySQLdb from distro packages for devstack. The same applies to the new driver. It would still be great to see the package published on PyPI so that we can track its version requirements instead of relying on distros to package it properly. But I don't see it as a blocker. Also, we will probably be able to run with other drivers supported by SQLAlchemy once all the work is done.

So I got bored last night and decided to take a stab at making PyMySQL work since I was a proponent of it earlier. Thankfully it did just mostly work like I thought it would. https://review.openstack.org/#/c/115495/ is the WIP devstack change to test this out. Postgres tests fail because it was applying the pymysql driver to the postgres connection string. We can clean this up later in devstack and/or devstack-gate depending on how we need to apply this stuff. Bashate failed because I had to monkeypatch in a fix for a ceilometer issue loading sqlalchemy drivers. The tempest neutron full job fails on one test occasionally. Not sure yet if that is the normal neutron-full failure mode or something new from PyMySQL. The regular tempest job passes just fine. There are also some DB-related errors in the logs that will need to be cleaned up, but overall it just works.

So I would like to repropose that we stop focusing all of this effort on the hard thing and use the easy thing if it meets our needs. We can continue to make alternatives work, but that is a different problem that we can solve at a different pace. I am not sure how to test the neutron thing that Gus was running into, though, so we should also check that really quickly.

Also, the tests themselves don't seem to run any faster or slower than when using the default mysql driver. Hard to complain about that :)

Clark
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 17/08/14 02:09, Angus Lees wrote: On 16 Aug 2014 06:09, Doug Hellmann d...@doughellmann.com mailto:d...@doughellmann.com wrote: On Aug 15, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com mailto:ihrac...@redhat.com wrote: Signed PGP part Some updates on the matter: - oslo-spec was approved with narrowed scope which is now 'enabled mysqlconnector as an alternative in gate' instead of 'switch the default db driver to mysqlconnector'. We'll revisit the switch part the next cycle once we have the new driver running in gate and real benchmarking is heavy-lifted. - there are several patches that are needed to make devstack and tempest passing deployment and testing. Those are collected under the hood of: https://review.openstack.org/#/c/114207/ Not much of them. - we'll need a new oslo.db release to bump versions (this is needed to set raise_on_warnings=False for the new driver, which was incorrectly set to True in sqlalchemy till very recently). This is expected to be released this month (as per Roman Podoliaka). This release is currently blocked on landing some changes in projects using the library so they don?t break when the new version starts using different exception classes. We?re tracking that work in https://etherpad.openstack.org/p/sqla_exceptions_caught It looks like we?re down to 2 patches, one for cinder (https://review.openstack.org/#/c/111760/) and one for glance (https://review.openstack.org/#/c/109655). Roman, can you verify that those are the only two projects that need changes for the exception issue? - once the corresponding patch for sqlalchemy-migrate is merged, we'll also need a new version released for this. So we're going for a new version of sqlalchemy? (We have a separate workaround for raise_on_warnings that doesn't require the new sqlalchemy release if this brings too many other issues) Wrong. We're going for a new version of *sqlalchemy-migrate*. Which is the code that we inherited from Mike and currently track in stackforge. - on PyPI side, no news for now. The last time I've heard from Geert (the maintainer of MySQL Connector for Python), he was working on this. I suspect there are some legal considerations running inside Oracle. I'll update once I know more about that. If we don?t have the new package on PyPI, how do we plan to include it in the gate? Are there options to allow an exception, or to make the mirroring software download it anyway? We can test via devstack without waiting for pypi, since devstack will install via rpms/debs. I expect that it will be settled. I have no indication that the issue is unsolvable, it will just take a bit more time than we're accustomed to. :) At the moment, we install MySQLdb from distro packages for devstack. Same applies to new driver. It will be still great to see the package published on PyPI so that we can track its version requirements instead of relying on distros to package it properly. But I don't see it as a blocker. Also, we will probably be able to run with other drivers supported by SQLAlchemy once all the work is done. 
- Gus
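As a side note on the raise_on_warnings problem discussed above: until the bumped oslo.db release is available, the flag can also be handed straight to the DBAPI from application code. A sketch assuming plain SQLAlchemy (oslo.db ultimately funnels into the same create_engine() call; the URL is a placeholder):

    import sqlalchemy

    # MySQL Connector/Python accepts raise_on_warnings as a connect()
    # keyword, so passing it via connect_args overrides the old
    # SQLAlchemy default of True that the thread complains about.
    engine = sqlalchemy.create_engine(
        "mysql+mysqlconnector://user:password@127.0.0.1/neutron",
        connect_args={"raise_on_warnings": False})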
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
Hello Doug, All. This release is currently blocked on landing some changes in projects using the library so they don’t break when the new version starts using different exception classes. We’re tracking that work in https://etherpad.openstack.org/p/sqla_exceptions_caught It looks like we’re down to 2 patches, one for cinder (https://review.openstack.org/#/c/111760/) and one for glance (https://review.openstack.org/#/c/109655). These patches are now merged, so the exception issue is fixed in all core OS projects. But unfortunately, there is another blocker for the oslo.db release: Heat uses the BaseMigrationTestCase class, which was removed from oslo.db in patch https://review.openstack.org/#/c/93424/, so the new oslo.db release would break unit tests in Heat. Here is the patch which should fix this issue: https://review.openstack.org/#/c/109658/ I really hope that this patch is the last release blocker :) Roman, folks - please correct me if I missed something. On Fri, Aug 15, 2014 at 11:07 PM, Doug Hellmann d...@doughellmann.com wrote: On Aug 15, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part Some updates on the matter: - oslo-spec was approved with narrowed scope which is now 'enabled mysqlconnector as an alternative in gate' instead of 'switch the default db driver to mysqlconnector'. We'll revisit the switch part the next cycle once we have the new driver running in gate and real benchmarking is heavy-lifted. - there are several patches that are needed to make devstack and tempest passing deployment and testing. Those are collected under the hood of: https://review.openstack.org/#/c/114207/ Not much of them. - we'll need a new oslo.db release to bump versions (this is needed to set raise_on_warnings=False for the new driver, which was incorrectly set to True in sqlalchemy till very recently). This is expected to be released this month (as per Roman Podoliaka). This release is currently blocked on landing some changes in projects using the library so they don’t break when the new version starts using different exception classes. We’re tracking that work in https://etherpad.openstack.org/p/sqla_exceptions_caught It looks like we’re down to 2 patches, one for cinder (https://review.openstack.org/#/c/111760/) and one for glance (https://review.openstack.org/#/c/109655). Roman, can you verify that those are the only two projects that need changes for the exception issue? - once the corresponding patch for sqlalchemy-migrate is merged, we'll also need a new version released for this. - on PyPI side, no news for now. The last time I've heard from Geert (the maintainer of MySQL Connector for Python), he was working on this. I suspect there are some legal considerations running inside Oracle. I'll update once I know more about that. If we don’t have the new package on PyPI, how do we plan to include it in the gate? Are there options to allow an exception, or to make the mirroring software download it anyway? Doug - once all the relevant patches land in affected projects and devstack, I'm going to introduce a separate gate job to run against mysqlconnector. Cheers, /Ihar On 22/07/14 15:03, Ihar Hrachyshka wrote: FYI: I've moved the spec to oslo space since the switch is not really limited to neutron, and most of coding is to be done in oslo.db (though not much anyway). 
New spec: https://review.openstack.org/#/c/108355/ On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 16 Aug 2014 06:09, Doug Hellmann d...@doughellmann.com wrote: On Aug 15, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part Some updates on the matter: - oslo-spec was approved with narrowed scope which is now 'enabled mysqlconnector as an alternative in gate' instead of 'switch the default db driver to mysqlconnector'. We'll revisit the switch part the next cycle once we have the new driver running in gate and real benchmarking is heavy-lifted. - there are several patches that are needed to make devstack and tempest passing deployment and testing. Those are collected under the hood of: https://review.openstack.org/#/c/114207/ Not much of them. - we'll need a new oslo.db release to bump versions (this is needed to set raise_on_warnings=False for the new driver, which was incorrectly set to True in sqlalchemy till very recently). This is expected to be released this month (as per Roman Podoliaka). This release is currently blocked on landing some changes in projects using the library so they don’t break when the new version starts using different exception classes. We’re tracking that work in https://etherpad.openstack.org/p/sqla_exceptions_caught It looks like we’re down to 2 patches, one for cinder (https://review.openstack.org/#/c/111760/) and one for glance (https://review.openstack.org/#/c/109655). Roman, can you verify that those are the only two projects that need changes for the exception issue? - once the corresponding patch for sqlalchemy-migrate is merged, we'll also need a new version released for this. So we're going for a new version of sqlalchemy? (We have a separate workaround for raise_on_warnings that doesn't require the new sqlalchemy release if this brings too many other issues) - on PyPI side, no news for now. The last time I've heard from Geert (the maintainer of MySQL Connector for Python), he was working on this. I suspect there are some legal considerations running inside Oracle. I'll update once I know more about that. If we don’t have the new package on PyPI, how do we plan to include it in the gate? Are there options to allow an exception, or to make the mirroring software download it anyway? We can test via devstack without waiting for pypi, since devstack will install via rpms/debs. - Gus
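It is worth spelling out how small the operator-facing part of the whole switch is: only the driver prefix of the 'connection' option changes, while everything after the scheme stays the same. A sketch (placeholder credentials):

    from sqlalchemy.engine import url

    for prefix in ("mysql",                 # MySQLdb, the current default
                   "mysql+mysqlconnector",  # MySQL Connector/Python
                   "mysql+pymysql"):        # PyMySQL
        u = url.make_url(prefix + "://user:password@127.0.0.1/neutron")
        # get_dialect() resolves the SQLAlchemy dialect class without
        # connecting, so this runs even with no MySQL server around.
        print("%s -> %s" % (u.drivername, u.get_dialect().driver))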
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 Some updates on the matter: - oslo-spec was approved with narrowed scope which is now 'enabled mysqlconnector as an alternative in gate' instead of 'switch the default db driver to mysqlconnector'. We'll revisit the switch part the next cycle once we have the new driver running in gate and real benchmarking is heavy-lifted. - there are several patches that are needed to make devstack and tempest passing deployment and testing. Those are collected under the hood of: https://review.openstack.org/#/c/114207/ Not much of them. - we'll need a new oslo.db release to bump versions (this is needed to set raise_on_warnings=False for the new driver, which was incorrectly set to True in sqlalchemy till very recently). This is expected to be released this month (as per Roman Podoliaka). - once the corresponding patch for sqlalchemy-migrate is merged, we'll also need a new version released for this. - on PyPI side, no news for now. The last time I've heard from Geert (the maintainer of MySQL Connector for Python), he was working on this. I suspect there are some legal considerations running inside Oracle. I'll update once I know more about that. - once all the relevant patches land in affected projects and devstack, I'm going to introduce a separate gate job to run against mysqlconnector. Cheers, /Ihar On 22/07/14 15:03, Ihar Hrachyshka wrote: FYI: I've moved the spec to oslo space since the switch is not really limited to neutron, and most of coding is to be done in oslo.db (though not much anyway). New spec: https://review.openstack.org/#/c/108355/ On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1]. 
Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easy Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around possible switch in context of Nova that revealed some concerns, though they do not seem to be documented anywhere. So if you know anything about it, please comment. So, we'd like to hear from other projects what's your take on that move,
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Aug 15, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part Some updates on the matter: - oslo-spec was approved with narrowed scope which is now 'enabled mysqlconnector as an alternative in gate' instead of 'switch the default db driver to mysqlconnector'. We'll revisit the switch part the next cycle once we have the new driver running in gate and real benchmarking is heavy-lifted. - there are several patches that are needed to make devstack and tempest passing deployment and testing. Those are collected under the hood of: https://review.openstack.org/#/c/114207/ Not much of them. - we'll need a new oslo.db release to bump versions (this is needed to set raise_on_warnings=False for the new driver, which was incorrectly set to True in sqlalchemy till very recently). This is expected to be released this month (as per Roman Podoliaka). This release is currently blocked on landing some changes in projects using the library so they don’t break when the new version starts using different exception classes. We’re tracking that work in https://etherpad.openstack.org/p/sqla_exceptions_caught It looks like we’re down to 2 patches, one for cinder (https://review.openstack.org/#/c/111760/) and one for glance (https://review.openstack.org/#/c/109655). Roman, can you verify that those are the only two projects that need changes for the exception issue? - once the corresponding patch for sqlalchemy-migrate is merged, we'll also need a new version released for this. - on PyPI side, no news for now. The last time I've heard from Geert (the maintainer of MySQL Connector for Python), he was working on this. I suspect there are some legal considerations running inside Oracle. I'll update once I know more about that. If we don’t have the new package on PyPI, how do we plan to include it in the gate? Are there options to allow an exception, or to make the mirroring software download it anyway? Doug - once all the relevant patches land in affected projects and devstack, I'm going to introduce a separate gate job to run against mysqlconnector. Cheers, /Ihar On 22/07/14 15:03, Ihar Hrachyshka wrote: FYI: I've moved the spec to oslo space since the switch is not really limited to neutron, and most of coding is to be done in oslo.db (though not much anyway). New spec: https://review.openstack.org/#/c/108355/ On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. 
We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 FYI: I've moved the spec to oslo space since the switch is not really limited to neutron, and most of coding is to be done in oslo.db (though not much anyway). New spec: https://review.openstack.org/#/c/108355/ On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easy Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around possible switch in context of Nova that revealed some concerns, though they do not seem to be documented anywhere. So if you know anything about it, please comment. 
So, we'd like to hear from other projects what's your take on that move, whether you see any issues or have concerns about it. Thanks for your comments, /Ihar [1]: https://review.openstack.org/#/c/104905/ [2]: https://review.openstack.org/#/c/105209/
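For readers unfamiliar with the Nova workaround dismissed above as a hack, it is roughly a decorator of the following shape. This is an illustrative sketch, not Nova's actual helper; the matched error text and retry parameters are assumptions:

    import functools
    import time

    from sqlalchemy import exc


    def retry_on_deadlock(func):
        # Retry a DB API method when MySQL reports a deadlock, instead
        # of fixing the blocking driver underneath; it retries the
        # symptom, which is why the thread calls it a hack.
        max_retries, delay = 5, 0.5

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except exc.OperationalError as e:
                    if ("Deadlock found" not in str(e)
                            or attempt == max_retries - 1):
                        raise
                    time.sleep(delay)
        return wrapper

A DB API method would then simply be wrapped with @retry_on_deadlock.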
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 21/07/14 04:53, Angus Lees wrote: Status, as I understand it: * oslo.db changes to support other mysql drivers: https://review.openstack.org/#/c/104425/ (merged) https://review.openstack.org/#/c/106928/ (awaiting oslo.db review) https://review.openstack.org/#/c/107221/ (awaiting oslo.db review) For that last one, the idea is correct, but the implementation is wrong; see my comments in the review. (These are sufficient to allow operators to switch connection strings and use mysqlconnector. The rest is all for our testing environment) * oslo.db change to allow testing with other mysql drivers: https://review.openstack.org/#/c/104428/ (awaiting oslo.db review) https://review.openstack.org/#/c/104447/ (awaiting oslo.db review. Ongoing discussion towards a larger rewrite of oslo.db testing instead) * Integration into jenkins environment: Blocked on getting Oracle to distribute mysql-connector via pypi. Ihar and others are having conversations with the upstream author. * Devstack change to switch to mysqlconnector for neutron: https://review.openstack.org/#/c/105209/ (marked wip) Ihar: do you want me to pick this up, or are you going to continue it once some of the above has settled? This is in WIP because it's not clear now whether the switch is expected to be global or local to neutron. I'll make sure it's covered if/when the spec is approved. * oslo.db gate test that reproduces the deadlock with eventlet: https://review.openstack.org/#/c/104436/ (In review. Can't be submitted until gate environment is switched to mysqlconnector) + performance is yet to be benchmarked for different projects. Overall I'm not happy with the rate of change - but we're getting there. That's OpenStack! Changes take time here. I look forward to getting this fixed :/ Thanks for tracking the oslo.db part of that, I really appreciate that. On 18 July 2014 21:45, Ihar Hrachyshka ihrac...@redhat.com wrote: On 14/07/14 17:03, Ihar Hrachyshka wrote: On 14/07/14 15:54, Clark Boylan wrote: On Sun, Jul 13, 2014 at 9:20 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 11/07/14 19:20, Clark Boylan wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. There is misunderstanding in our community on how we deploy db client modules. No project actually depends on any of them. We assume deployers will install the proper one and configure 'connection' string to use it. In case of devstack, we install the appropriate package from distribution packages, not pip. Correct, but for all of the other test suites (unittests) we install the db clients via pip because tox runs them and virtualenvs allowing site packages cause too many problems. See https://git.openstack.org/cgit/openstack/nova/tree/test-requirements.txt#n8. So we do actually depend on these things being pip installable. Basically this allows devs to run `tox` and it works. Roger that, and thanks for clarification. I'm trying to reach the author and the maintainer of mysqlconnector-python to see whether I'll be able to convince him to publish the packages on pypi.python.org. 
I've reached the maintainer of the module, he told me he is currently working on uploading releases directly to pypi.python.org. I would argue that we should have devstack install via pip too for consistency, but that is a different issue (it is already installing all of the other python dependencies this way so why special case?). What we do is recommending a module for our users in our documentation. That said, I assume the gate is a non-issue. Correct? That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. MySQL Connector supports py3k too (not sure about pypy though). Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote:
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
Status, as I understand it: * oslo.db changes to support other mysql drivers: https://review.openstack.org/#/c/104425/ (merged) https://review.openstack.org/#/c/106928/ (awaiting oslo.db review) https://review.openstack.org/#/c/107221/ (awaiting oslo.db review) (These are sufficient to allow operators to switch connection strings and use mysqlconnector. The rest is all for our testing environment) * oslo.db change to allow testing with other mysql drivers: https://review.openstack.org/#/c/104428/ (awaiting oslo.db review) https://review.openstack.org/#/c/104447/ (awaiting oslo.db review. Ongoing discussion towards a larger rewrite of oslo.db testing instead) * Integration into jenkins environment: Blocked on getting Oracle to distribute mysql-connector via pypi. Ihar and others are having conversations with the upstream author. * Devstack change to switch to mysqlconnector for neutron: https://review.openstack.org/#/c/105209/ (marked wip) Ihar: do you want me to pick this up, or are you going to continue it once some of the above has settled? * oslo.db gate test that reproduces the deadlock with eventlet: https://review.openstack.org/#/c/104436/ (In review. Can't be submitted until gate environment is switched to mysqlconnector) Overall I'm not happy with the rate of change - but we're getting there. I look forward to getting this fixed :/ On 18 July 2014 21:45, Ihar Hrachyshka ihrac...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 14/07/14 17:03, Ihar Hrachyshka wrote: On 14/07/14 15:54, Clark Boylan wrote: On Sun, Jul 13, 2014 at 9:20 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 11/07/14 19:20, Clark Boylan wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. There is misunderstanding in our community on how we deploy db client modules. No project actually depends on any of them. We assume deployers will install the proper one and configure 'connection' string to use it. In case of devstack, we install the appropriate package from distribution packages, not pip. Correct, but for all of the other test suites (unittests) we install the db clients via pip because tox runs them and virtualenvs allowing site packages cause too many problems. See https://git.openstack.org/cgit/openstack/nova/tree/test-requirements.txt#n8 . So we do actually depend on these things being pip installable. Basically this allows devs to run `tox` and it works. Roger that, and thanks for clarification. I'm trying to reach the author and the maintainer of mysqlconnector-python to see whether I'll be able to convince him to publish the packages on pypi.python.org. I've reached the maintainer of the module, he told me he is currently working on uploading releases directly to pypi.python.org. I would argue that we should have devstack install via pip too for consistency, but that is a different issue (it is already installing all of the other python dependencies this way so why special case?). What we do is recommending a module for our users in our documentation. That said, I assume the gate is a non-issue. Correct? That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. 
We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. MySQL Connector supports py3k too (not sure about pypy though). Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 14/07/14 17:03, Ihar Hrachyshka wrote: On 14/07/14 15:54, Clark Boylan wrote: On Sun, Jul 13, 2014 at 9:20 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 11/07/14 19:20, Clark Boylan wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. There is misunderstanding in our community on how we deploy db client modules. No project actually depends on any of them. We assume deployers will install the proper one and configure 'connection' string to use it. In case of devstack, we install the appropriate package from distribution packages, not pip. Correct, but for all of the other test suites (unittests) we install the db clients via pip because tox runs them and virtualenvs allowing site packages cause too many problems. See https://git.openstack.org/cgit/openstack/nova/tree/test-requirements.txt#n8. So we do actually depend on these things being pip installable. Basically this allows devs to run `tox` and it works. Roger that, and thanks for clarification. I'm trying to reach the author and the maintainer of mysqlconnector-python to see whether I'll be able to convince him to publish the packages on pypi.python.org. I've reached the maintainer of the module, he told me he is currently working on uploading releases directly to pypi.python.org. I would argue that we should have devstack install via pip too for consistency, but that is a different issue (it is already installing all of the other python dependencies this way so why special case?). What we do is recommending a module for our users in our documentation. That said, I assume the gate is a non-issue. Correct? That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. MySQL Connector supports py3k too (not sure about pypy though). Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. 
Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 16/07/14 01:50, Vishvananda Ishaya wrote: On Jul 15, 2014, at 3:30 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part On 14/07/14 22:48, Vishvananda Ishaya wrote: On Jul 13, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part On 12/07/14 03:17, Mike Bayer wrote: On 7/11/14, 7:26 PM, Carl Baldwin wrote: On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of Do you have some numbers? Seems to be slightly slower doesn't really stand up as an argument against the numbers that have been posted in this thread. Numbers are highly dependent on a number of other factors, but I was seeing 100 concurrent list commands against cinder going from an average of 400 ms to an average of around 600 ms with both mysql-connector and pymysql. I've made my tests on neutron only, so there is a possibility that cinder works somehow differently. But, those numbers don't tell a lot in terms of considering the switch. Do you have numbers for mysqldb case? Sorry if my commentary above was unclear. The 400ms is mysqldb. The 600ms average was the same for both the other options. It is also worth mentioning that my test of 100 concurrent creates from the same project in cinder leads to average response times over 3 seconds. Note that creates return before the request is sent to the node for processing, so this is just the api creating the db record and sticking a message on the queue. A huge part of the slowdown is in quota reservation processing which does a row lock on the project id. Again, are those 3 seconds better or worse than what we have for mysqldb? The 3 seconds is from mysqldb. I don’t have average response times for mysql-connector due to the timeouts I mention below. Before we are sure that an eventlet friendly backend “gets rid of all deadlocks”, I will mention that trying this test against connector leads to some requests timing out at our load balancer (5 minute timeout), so we may actually be introducing deadlocks where the retry_on_deadlock operator is used. Deadlocks != timeouts. I attempt to fix eventlet-triggered db deadlocks, not all possible deadlocks that you may envision, or timeouts. That may be true, but if switching the default is trading one problem for another it isn’t necessarily the right fix. The timeout means that one or more greenthreads are never actually generating a response. I suspect an endless retry_on_deadlock between a couple of competing greenthreads which we don’t hit with mysqldb, but it could be any number of things. Consider the above anecdotal for the moment, since I can’t verify for sure that switching the sql driver didn’t introduce some other race or unrelated problem. Let me just caution that we can’t recommend replacing our mysql backend without real performance and load testing. I agree. Not saying that the tests are somehow complete, but here is what I was into the last two days. There is a nice openstack project called Rally that is designed to allow easy benchmarks for openstack projects. They have four scenarios for neutron implemented: for networks, ports, routers, and subnets. Each scenario combines create and list commands. 
I've run each test with the following runner settings: times = 100, concurrency = 10, meaning each scenario is run 100 times in parallel, and there were not more than 10 parallel scenarios running. Then I've repeated the same for times = 100, concurrency = 20 (also set max_pool_size to 20 to allow sqlalchemy utilize that level of parallelism), and times = 1000, concurrency = 100 (same note on sqlalchemy parallelism). You can find detailed html files with nice graphs here [1]. Brief description of results is below: 1. create_and_list_networks scenario: for 10 parallel workers performance boost is -12.5% from original time, for 20 workers -6.3%, for 100 workers there is a slight reduction of average time spent for scenario +9.4% (this is the only scenario that showed slight reduction in performance, I'll try to rerun the test tomorrow to see whether it was some discrepancy when I executed it that influenced the result). 2. create_and_list_ports scenario: for 10 parallel workers boost is -25.8%, for 20 workers it's -9.4%, and for 100 workers it's -12.6%. 3. create_and_list_routers scenario: for 10 parallel workers boost is -46.6% (almost half of original time), for 20 workers it's -51.7% (more than a half), for 100 workers it's -41.5%. 4.
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 14/07/14 22:48, Vishvananda Ishaya wrote: On Jul 13, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part On 12/07/14 03:17, Mike Bayer wrote: On 7/11/14, 7:26 PM, Carl Baldwin wrote: On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of Do you have some numbers? Seems to be slightly slower doesn't really stand up as an argument against the numbers that have been posted in this thread. Numbers are highly dependent on a number of other factors, but I was seeing 100 concurrent list commands against cinder going from an average of 400 ms to an average of around 600 ms with both mysql-connector and pymysql. I've made my tests on neutron only, so there is a possibility that cinder works somehow differently. But, those numbers don't tell a lot in terms of considering the switch. Do you have numbers for mysqldb case? It is also worth mentioning that my test of 100 concurrent creates from the same project in cinder leads to average response times over 3 seconds. Note that creates return before the request is sent to the node for processing, so this is just the api creating the db record and sticking a message on the queue. A huge part of the slowdown is in quota reservation processing which does a row lock on the project id. Again, are those 3 seconds better or worse than what we have for mysqldb? Before we are sure that an eventlet friendly backend “gets rid of all deadlocks”, I will mention that trying this test against connector leads to some requests timing out at our load balancer (5 minute timeout), so we may actually be introducing deadlocks where the retry_on_deadlock operator is used. Deadlocks != timeouts. I attempt to fix eventlet-triggered db deadlocks, not all possible deadlocks that you may envision, or timeouts. Consider the above anecdotal for the moment, since I can’t verify for sure that switching the sql driver didn’t introduce some other race or unrelated problem. Let me just caution that we can’t recommend replacing our mysql backend without real performance and load testing. I agree. Not saying that the tests are somehow complete, but here is what I was into the last two days. There is a nice openstack project called Rally that is designed to allow easy benchmarks for openstack projects. They have four scenarios for neutron implemented: for networks, ports, routers, and subnets. Each scenario combines create and list commands. I've run each test with the following runner settings: times = 100, concurrency = 10, meaning each scenario is run 100 times in parallel, and there were not more than 10 parallel scenarios running. Then I've repeated the same for times = 100, concurrency = 20 (also set max_pool_size to 20 to allow sqlalchemy utilize that level of parallelism), and times = 1000, concurrency = 100 (same note on sqlalchemy parallelism). You can find detailed html files with nice graphs here [1]. Brief description of results is below: 1. 
create_and_list_networks scenario: for 10 parallel workers performance boost is -12.5% from original time, for 20 workers -6.3%, for 100 workers there is a slight reduction of average time spent for scenario +9.4% (this is the only scenario that showed slight reduction in performance, I'll try to rerun the test tomorrow to see whether it was some discrepancy when I executed it that influenced the result). 2. create_and_list_ports scenario: for 10 parallel workers boost is -25.8%, for 20 workers it's -9.4%, and for 100 workers it's -12.6%. 3. create_and_list_routers scenario: for 10 parallel workers boost is -46.6% (almost half of original time), for 20 workers it's -51.7% (more than a half), for 100 workers it's -41.5%. 4. create_and_list_subnets scenario: for 10 parallel workers boost is -26.4%, for 20 workers it's -51.1% (more than half reduction in time spent for average scenario), and for 100 workers it's -31.7%. I've tried to check how it scales till 200 parallel workers, but was hit by local file opened limits and mysql max_connection settings. I will retry my tests with limits raised tomorrow to see how it handles that huge load. Tomorrow I will also try to test new library with multiple API workers. Other than that, what are your suggestions on what to check/test? FYI: [1] contains the following directories: mysqlconnector/ mysqldb/ Each of them contains the following directories: 10-10/ - 10 parallel workers, max_pool_size = 10 (default) 20-100/ - 20 parallel workers, max_pool_size = 100 100-100/ - 100 parallel workers,
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Jul 15, 2014, at 3:30 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part On 14/07/14 22:48, Vishvananda Ishaya wrote: On Jul 13, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part On 12/07/14 03:17, Mike Bayer wrote: On 7/11/14, 7:26 PM, Carl Baldwin wrote: On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of Do you have some numbers? Seems to be slightly slower doesn't really stand up as an argument against the numbers that have been posted in this thread. Numbers are highly dependent on a number of other factors, but I was seeing 100 concurrent list commands against cinder going from an average of 400 ms to an average of around 600 ms with both mysql-connector and pymysql. I've made my tests on neutron only, so there is a possibility that cinder works somehow differently. But, those numbers don't tell a lot in terms of considering the switch. Do you have numbers for mysqldb case? Sorry if my commentary above was unclear. The 400ms is mysqldb. The 600ms average was the same for both the other options. It is also worth mentioning that my test of 100 concurrent creates from the same project in cinder leads to average response times over 3 seconds. Note that creates return before the request is sent to the node for processing, so this is just the api creating the db record and sticking a message on the queue. A huge part of the slowdown is in quota reservation processing which does a row lock on the project id. Again, are those 3 seconds better or worse than what we have for mysqldb? The 3 seconds is from mysqldb. I don’t have average response times for mysql-connector due to the timeouts I mention below. Before we are sure that an eventlet friendly backend “gets rid of all deadlocks”, I will mention that trying this test against connector leads to some requests timing out at our load balancer (5 minute timeout), so we may actually be introducing deadlocks where the retry_on_deadlock operator is used. Deadlocks != timeouts. I attempt to fix eventlet-triggered db deadlocks, not all possible deadlocks that you may envision, or timeouts. That may be true, but if switching the default is trading one problem for another it isn’t necessarily the right fix. The timeout means that one or more greenthreads are never actually generating a response. I suspect an endless retry_on_deadlock between a couple of competing greenthreads which we don’t hit with mysqldb, but it could be any number of things. Consider the above anecdotal for the moment, since I can’t verify for sure that switching the sql driver didn’t introduce some other race or unrelated problem. Let me just caution that we can’t recommend replacing our mysql backend without real performance and load testing. I agree. Not saying that the tests are somehow complete, but here is what I was into the last two days. There is a nice openstack project called Rally that is designed to allow easy benchmarks for openstack projects. They have four scenarios for neutron implemented: for networks, ports, routers, and subnets. Each scenario combines create and list commands. 
I've run each test with the following runner settings: times = 100, concurrency = 10, meaning each scenario is run 100 times in parallel, and there were not more than 10 parallel scenarios running. Then I've repeated the same for times = 100, concurrency = 20 (also set max_pool_size to 20 to allow sqlalchemy utilize that level of parallelism), and times = 1000, concurrency = 100 (same note on sqlalchemy parallelism). You can find detailed html files with nice graphs here [1]. Brief description of results is below: 1. create_and_list_networks scenario: for 10 parallel workers performance boost is -12.5% from original time, for 20 workers -6.3%, for 100 workers there is a slight reduction of average time spent for scenario +9.4% (this is the only scenario that showed slight reduction in performance, I'll try to rerun the test tomorrow to see whether it was some discrepancy when I executed it that influenced the result). 2. create_and_list_ports scenario: for 10 parallel workers boost is -25.8%, for 20 workers it's -9.4%, and for 100 workers it's -12.6%. 3. create_and_list_routers scenario: for 10 parallel workers boost is -46.6% (almost half of original time), for 20 workers it's -51.7% (more than a half), for 100 workers it's -41.5%. 4. create_and_list_subnets scenario: for 10 parallel workers boost is -26.4%, for 20 workers
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 14/07/14 07:45, Thomas Goirand wrote: On 07/14/2014 12:20 AM, Ihar Hrachyshka wrote: On 11/07/14 19:20, Clark Boylan wrote: That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. MySQL Connector supports py3k too (not sure about pypy though). Yes, and it's also what Django people recommend: https://docs.djangoproject.com/en/1.7/ref/databases/#mysql-db-api-drivers As for mysqldb and Python3, the only way is to use a Python 3 fork such as this one: https://github.com/clelland/MySQL-for-Python-3 I wouldn't like using different versions of Python modules depending on the Python version, and therefore, python-mysql.connector / python3-mysql.connector would be preferred. However, it'd be nice if *all* projects could switch to that, and not just Neutron, otherwise, we'd be just adding a new dependency, which isn't great. Yes, we envision a global switch, though some projects may choose to wait for another cycle to see how it works for pioneers. Also, about eventlet, there's been long threads about switching to something else like asyncio. Wouldn't it be time to also do that (at the same time)? Eventlet has lots of flaws, though I don't see it replaced by asyncio or any other mechanism this or even the next cycle. There is lots of work to do to replace it. Switching the mysql library is a 100-line patch + performance benchmarking to avoid regression. Switching the async library is thousands of LOC + refactoring + very significant work on the oslo side. Cheers, Thomas Goirand (zigo)
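Since driver availability across interpreters keeps coming up here, a trivial probe answers the question for any given Python. Module names are as the respective projects publish them; nothing below is OpenStack-specific:

    # Check which MySQL DBAPI drivers this interpreter can import.
    for name in ("MySQLdb", "mysql.connector", "pymysql"):
        try:
            __import__(name)
            print("%s: importable" % name)
        except ImportError:
            print("%s: not installed" % name)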
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Sun, Jul 13, 2014 at 9:20 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 11/07/14 19:20, Clark Boylan wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. There is misunderstanding in our community on how we deploy db client modules. No project actually depends on any of them. We assume deployers will install the proper one and configure 'connection' string to use it. In case of devstack, we install the appropriate package from distribution packages, not pip. Correct, but for all of the other test suites (unittests) we install the db clients via pip because tox runs them and virtualenvs allowing site packages cause too many problems. See https://git.openstack.org/cgit/openstack/nova/tree/test-requirements.txt#n8. So we do actually depend on these things being pip installable. Basically this allows devs to run `tox` and it works. I would argue that we should have devstack install via pip too for consistency, but that is a different issue (it is already installing all of the other python dependencies this way so why special case?). What we do is recommending a module for our users in our documentation. That said, I assume the gate is a non-issue. Correct? That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. MySQL Connector supports py3k too (not sure about pypy though). Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. 
Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 sec ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications.
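To illustrate how small the driver switch is at the code level: with SQLAlchemy, the DBAPI is selected by the scheme of the connection URI, so moving a service off MySQLdb is, in the simplest case, a one-line configuration change. A minimal sketch (hostnames and credentials below are placeholders, not values from this thread):

    from sqlalchemy import create_engine

    # Default MySQL dialect: the C-extension MySQLdb driver, which blocks
    # the whole process (and every green thread in it) on database IO.
    engine = create_engine("mysql://neutron:secret@127.0.0.1/neutron")

    # Oracle's official, pure-Python MySQL Connector driver:
    engine = create_engine(
        "mysql+mysqlconnector://neutron:secret@127.0.0.1/neutron")

    # PyMySQL, the pure-Python alternative discussed in this thread:
    engine = create_engine(
        "mysql+pymysql://neutron:secret@127.0.0.1/neutron")

In an OpenStack service the same string lives in the 'connection' option of the configuration file, which is why the thread keeps describing this as mostly a deployment-level change.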
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 14/07/14 15:54, Clark Boylan wrote: On Sun, Jul 13, 2014 at 9:20 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 11/07/14 19:20, Clark Boylan wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. There is misunderstanding in our community on how we deploy db client modules. No project actually depends on any of them. We assume deployers will install the proper one and configure 'connection' string to use it. In case of devstack, we install the appropriate package from distribution packages, not pip. Correct, but for all of the other test suites (unittests) we install the db clients via pip because tox runs them and virtualenvs allowing site packages cause too many problems. See https://git.openstack.org/cgit/openstack/nova/tree/test-requirements.txt#n8. So we do actually depend on these things being pip installable. Basically this allows devs to run `tox` and it works. Roger that, and thanks for clarification. I'm trying to reach the author and the maintainer of mysqlconnector-python to see whether I'll be able to convince him to publish the packages on pypi.python.org. I would argue that we should have devstack install via pip too for consistency, but that is a different issue (it is already installing all of the other python dependencies this way so why special case?). What we do is recommending a module for our users in our documentation. That said, I assume the gate is a non-issue. Correct? That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. MySQL Connector supports py3k too (not sure about pypy though). Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. 
Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 sec ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed.
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Jul 13, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part On 12/07/14 03:17, Mike Bayer wrote: On 7/11/14, 7:26 PM, Carl Baldwin wrote: On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com mailto:vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of Do you have some numbers? Seems to be slightly slower doesn't really stand up as an argument against the numbers that have been posted in this thread. Numbers are highly dependent on a number of other factors, but I was seeing 100 concurrent list commands against cinder going from an average of 400 ms to an average of around 600 ms with both msql-connector and pymsql. It is also worth mentioning that my test of 100 concurrent creates from the same project in cinder leads to average response times over 3 seconds. Note that creates return before the request is sent to the node for processing, so this is just the api creating the db record and sticking a message on the queue. A huge part of the slowdown is in quota reservation processing which does a row lock on the project id. Before we are sure that an eventlet friendly backend “gets rid of all deadlocks”, I will mention that trying this test against connector leads to some requests timing out at our load balancer (5 minute timeout), so we may actually be introducing deadlocks where the retry_on_deadlock operator is used. Consider the above anecdotal for the moment, since I can’t verify for sure that switching the sql driver didn’t introduce some other race or unrelated problem. Let me just caution that we can’t recommend replacing our mysql backend without real performance and load testing. Vish sqlalchemy is not the main bottleneck across projects. Vish P.S. The performanace in all cases was abysmal, so performance work definitely needs to be done, but just the guess that replacing our mysql library is going to solve all of our performance problems appears to be incorrect at first blush. The motivation is still mostly deadlock relief but more performance work should be done. I agree with you there. I'm still hopeful for some improvement from this. To identify performance that's alleviated by async you have to establish up front that IO blocking is the issue, which would entail having code that's blazing fast until you start running it against concurrent connections, at which point you can identify via profiling that IO operations are being serialized. This is a very specific issue. In contrast, to identify why some arbitrary openstack app is slow, my bet is that async is often not the big issue. Every day I look at openstack code and talk to people working on things, I see many performance issues that have nothing to do with concurrency, and as I detailed in my wiki page at https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy there is a long road to cleaning up all the excessive queries, hundreds of unnecessary rows and columns being pulled over the network, unindexed lookups, subquery joins, hammering of Python-intensive operations (often due to the nature of OS apps as lots and lots of tiny API calls) that can be cached. 
There's a clear path to tons better performance documented there and most of it is not about async - which means that successful async isn't going to solve all those issues. Of course there is a long road to decent performance, and switching a library won't magically fix all our issues. But if it will fix deadlocks, and give a 30% to 150% performance boost for different operations, and since the switch is almost smooth, this is something worth doing.
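For readers unfamiliar with the "retry on deadlock" mechanism referenced above: the pattern is a decorator that re-runs a transactional function when the database reports it was chosen as a deadlock victim. The following is a minimal sketch of the pattern only, not Nova's actual implementation; the exception class is a stand-in for whatever the driver or oslo.db surfaces:

    import functools
    import random
    import time

    class DBDeadlock(Exception):
        """Stand-in for the deadlock error raised by the driver/oslo.db."""

    def retry_on_deadlock(max_retries=5, base_delay=0.05):
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                for attempt in range(max_retries):
                    try:
                        return fn(*args, **kwargs)
                    except DBDeadlock:
                        if attempt == max_retries - 1:
                            raise
                        # Jittered backoff so both deadlock victims do not
                        # immediately collide on the same rows again.
                        time.sleep(base_delay * (2 ** attempt)
                                   * random.random())
            return wrapper
        return decorator

    @retry_on_deadlock()
    def reserve_quota(project_id):
        """Hypothetical row-locking transaction that may be rolled back."""

As noted earlier in the thread, this is more a hack than a proper fix: the work is repeated, and the time lost waiting on the lock before the retry fires is still paid.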
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 11/07/14 19:20, Clark Boylan wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. There is misunderstanding in our community on how we deploy db client modules. No project actually depends on any of them. We assume deployers will install the proper one and configure 'connection' string to use it. In case of devstack, we install the appropriate package from distribution packages, not pip. What we do is recommending a module for our users in our documentation. That said, I assume the gate is a non-issue. Correct? That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. MySQL Connector supports py3k too (not sure about pypy though). Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. 
I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easy Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K.
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 12/07/14 03:17, Mike Bayer wrote: On 7/11/14, 7:26 PM, Carl Baldwin wrote: On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com mailto:vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of Do you have some numbers? Seems to be slightly slower doesn't really stand up as an argument against the numbers that have been posted in this thread. sqlalchemy is not the main bottleneck across projects. Vish P.S. The performance in all cases was abysmal, so performance work definitely needs to be done, but just the guess that replacing our mysql library is going to solve all of our performance problems appears to be incorrect at first blush. The motivation is still mostly deadlock relief but more performance work should be done. I agree with you there. I'm still hopeful for some improvement from this. To identify performance that's alleviated by async you have to establish up front that IO blocking is the issue, which would entail having code that's blazing fast until you start running it against concurrent connections, at which point you can identify via profiling that IO operations are being serialized. This is a very specific issue. In contrast, to identify why some arbitrary openstack app is slow, my bet is that async is often not the big issue. Every day I look at openstack code and talk to people working on things, I see many performance issues that have nothing to do with concurrency, and as I detailed in my wiki page at https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy there is a long road to cleaning up all the excessive queries, hundreds of unnecessary rows and columns being pulled over the network, unindexed lookups, subquery joins, hammering of Python-intensive operations (often due to the nature of OS apps as lots and lots of tiny API calls) that can be cached. There's a clear path to tons better performance documented there and most of it is not about async - which means that successful async isn't going to solve all those issues. Of course there is a long road to decent performance, and switching a library won't magically fix all our issues. But if it will fix deadlocks, and give a 30% to 150% performance boost for different operations, and since the switch is almost smooth, this is something worth doing.
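The "establish that IO is being serialized" test Mike describes can be made concrete: issue N server-side sleeps from N green threads and compare wall time. A minimal sketch, assuming a local MySQL reachable with the placeholder credentials below (PyMySQL stands in for any pure-Python driver):

    import time
    import eventlet
    eventlet.monkey_patch()  # pure-Python drivers now use green sockets

    import pymysql

    def one_query(_):
        conn = pymysql.connect(host="127.0.0.1", user="root",
                               password="", database="test")
        cur = conn.cursor()
        cur.execute("SELECT SLEEP(1)")  # one second of server-side wait
        conn.close()

    pool = eventlet.GreenPool(10)
    start = time.time()
    list(pool.imap(one_query, range(10)))
    # ~1s  -> database IO overlaps; the driver yields to the event hub
    # ~10s -> IO is serialized; the driver blocks the whole process
    print("10 concurrent SELECT SLEEP(1): %.1fs" % (time.time() - start))

Only if the probe shows serialization can a speedup from an eventlet-aware driver be expected; the other bottlenecks catalogued on the wiki page remain regardless.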
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 12/07/14 00:30, Vishvananda Ishaya wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of sqlalchemy is not the main bottleneck across projects. I wonder what's your setup and library versions, and your script that you use for testing would also be great to see. In my tests, mysql-connector showed similar performance to what mysqldb provides in serial testing. Once you get to parallel requests execution, that's where the real benefit of parallelism shows up. Have you run your testing with parallel requests in mind? I now realise that I should have posted the benchmark I've used myself in the first place. So here it is, as a gist: https://gist.github.com/booxter/c4f3e743a2573ba7809f Vish P.S. The performanace in all cases was abysmal, so performance work definitely needs to be done, but just the guess that replacing our mysql library is going to solve all of our performance problems appears to be incorrect at first blush. All? Not at all. Some of them? Probably. That said, the primary reason to switch the library is avoiding database deadlocks. Additional performance boost is just a nice thing to have with little effort. On Jul 11, 2014, at 10:20 AM, Clark Boylan clark.boy...@gmail.com wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. 
Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 sec ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL.
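The actual benchmark is in the gist linked above; in spirit it is a pool of workers driving network creation through the REST API and timing the batch. A rough sketch of that shape, not the gist's code; the endpoint, token, and payload are placeholders:

    import json
    import time

    import eventlet
    eventlet.monkey_patch()
    import requests

    URL = "http://127.0.0.1:9696/v2.0/networks"   # local Neutron API
    HEADERS = {"X-Auth-Token": "ADMIN_TOKEN",
               "Content-Type": "application/json"}

    def create_network(i):
        body = json.dumps({"network": {"name": "perf-%d" % i}})
        requests.post(URL, data=body, headers=HEADERS).raise_for_status()

    pool = eventlet.GreenPool(10)   # 10 workers, as in the quoted numbers
    start = time.time()
    list(pool.imap(create_network, range(2000)))   # 2000 networks
    print("created 2000 networks in %.2f sec" % (time.time() - start))

Because the requests go through the full API and RPC path, the timing exercises the server's driver under real concurrency rather than benchmarking the DBAPI in isolation.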
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 12/07/14 02:49, Jay Pipes wrote: On 07/11/2014 08:04 AM, Ihar Hrachyshka wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 sec ~2.4 times performance boost, ok? ;) That really doesn't tell me much. Please remember that performance != scalability. I actually agree. I need to spend more time preparing benchmarks (I'll spend a part of my next week on this). If you showed the test/benchmark code, that would be great. Here is a gist: https://gist.github.com/booxter/c4f3e743a2573ba7809f You need to run your benchmarks at varying levels of concurrency and varying levels of read/write ratios for the workers. Otherwise it's like looking at a single dot of paint on a painting. Without looking at the patterns of throughput (performance) and concurrency/locking (scalability) with various levels of workers and read/write ratios, you miss the whole picture. Good point. I surely need to check read operations, and play with multiple API/RPC workers. Before I go with implementing my own tests, do you have more formal requirements or suggestions on which data points we can be interested in? Another thing to ensure is that you are using real *processes*, not threads, so that you actually simulate a real OpenStack service like Nova or Neutron, which are multi-plexed, not multi-threaded, and have a greenlet pool within each worker process. My test is an actual neutron client that issues full blown REST requests to a local neutron server, in multiple thread workers. I think this is ok, right? Best -jay I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1].
Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easy Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around possible switch in context of Nova that revealed some concerns, though they do not seem to be documented anywhere.
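Jay's criteria translate into a harness of roughly the following shape: real worker *processes* rather than threads, with the concurrency level and the read/write mix swept rather than fixed at one point. A sketch under assumed connection details and an assumed pre-created table t(payload VARCHAR):

    import random
    import time
    from multiprocessing import Pool

    from sqlalchemy import create_engine, text

    URI = "mysql+mysqlconnector://bench:secret@127.0.0.1/bench"

    def worker(args):
        n_ops, write_ratio = args
        # One engine per worker, created after fork, so connections are
        # never shared across processes.
        engine = create_engine(URI)
        for _ in range(n_ops):
            with engine.begin() as conn:   # one transaction per operation
                if random.random() < write_ratio:
                    conn.execute(text("INSERT INTO t (payload) VALUES ('x')"))
                else:
                    conn.execute(text("SELECT COUNT(*) FROM t"))
        engine.dispose()

    def run(n_workers, n_ops, write_ratio):
        pool = Pool(n_workers)
        start = time.time()
        pool.map(worker, [(n_ops, write_ratio)] * n_workers)
        pool.close()
        return time.time() - start

    if __name__ == "__main__":
        # Sweep concurrency and read/write mix instead of one data point:
        for n_workers in (1, 4, 16, 64):
            for ratio in (0.0, 0.2, 0.8):   # read-only to write-heavy
                print(n_workers, ratio, run(n_workers, 100, ratio))

Plotting throughput across the sweep, rather than quoting a single number, is exactly the "patterns" view the message asks for.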
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 07/14/2014 12:20 AM, Ihar Hrachyshka wrote: On 11/07/14 19:20, Clark Boylan wrote: That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. MySQL Connector supports py3k too (not sure about pypy though). Yes, and it's also what Django people recommend: https://docs.djangoproject.com/en/1.7/ref/databases/#mysql-db-api-drivers As for mysqldb and Python3, the only way is to use a Python 3 fork such as this one: https://github.com/clelland/MySQL-for-Python-3 I wouldn't like using different versions of Python modules depending on the Python version, and therefore, python-mysql.connector / python3-mysql.connector would be preferred. However, it'd be nice if *all* projects could switch to that, and not just Neutron, otherwise, we'd be just adding a new dependency, which isn't great. Also, about eventlet, there have been long threads about switching to something else like asyncio. Wouldn't it be time to also do that (at the same time)? Cheers, Thomas Goirand (zigo)
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
+1. Well put. No one is arguing against this other approach. The two efforts can be taken independently. Carl On Jul 11, 2014 10:48 PM, Mike Bayer mba...@redhat.com wrote: On 7/11/14, 11:26 PM, Jay Pipes wrote: Yep, couldn't agree more. Frankly, the steps you outline in the wiki above are excellent examples of where we can make significant gains in both performance and scalability. In addition to those you listed, the underlying database schemas themselves, with the excessive use of large VARCHAR fields, BLOB fields for JSONified values, and the general bad strategy of bunching heavily-read fields with infrequently-read fields in the same tables, are also a source of poor overall database performance. Well the topic of schema modifications I actually left out of that document entirely for starters - I made a conscious choice to focus entirely on things that don't involve any apps changing any of their fundamental approaches or schemas... at least just yet! :) I'm hoping that as oslo.db improves and the patterns start to roll out, we can start working on schema design too. Because yeah I've seen the giant lists of VARCHAR everything and just said, OK well we're going to have to get to that... just not right now :).
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easy Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around possible switch in context of Nova that revealed some concerns, though they do not seem to be documented anywhere. 
So if you know anything about it, please comment. So, we'd like to hear from other projects what's your take on that move, whether you see any issues or have concerns about it. Thanks for your comments, /Ihar [1]: https://review.openstack.org/#/c/104905/ [2]: https://review.openstack.org/#/c/105209/
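The failure mode described above can be made concrete with two green threads contending for one row lock. A minimal sketch, assuming a local MySQL with a table t containing a row with id=1 and the placeholder credentials below; PyMySQL is used only because it is pure Python and therefore yields under eventlet:

    import eventlet
    eventlet.monkey_patch()  # green sockets for pure-Python drivers

    import pymysql

    def connect():
        return pymysql.connect(host="127.0.0.1", user="root",
                               password="", database="test")

    def holder():
        conn = connect()
        cur = conn.cursor()
        cur.execute("SELECT id FROM t WHERE id = 1 FOR UPDATE")  # take lock
        eventlet.sleep(2)    # cooperative yield; other green threads run
        conn.commit()        # releases the row lock

    def waiter():
        conn = connect()
        cur = conn.cursor()
        # With a yielding driver, only *this* green thread waits here: the
        # holder still gets scheduled, commits, and the lock is acquired.
        # With MySQLdb, this call would block the entire process, the
        # holder could never commit, and the server would eventually
        # raise OperationalError: lock wait timeout exceeded.
        cur.execute("SELECT id FROM t WHERE id = 1 FOR UPDATE")
        conn.commit()

    h = eventlet.spawn(holder)
    eventlet.sleep(0.5)      # let the holder grab the lock first
    w = eventlet.spawn(waiter)
    h.wait(); w.wait()
    print("both transactions completed; no lock wait timeout")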
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 7/9/14, 10:59 AM, Roman Podoliaka wrote: Hi all, Not sure what issues you are talking about, but I just replaced mysql with mysql+mysqlconnector in my db connection string in neutron.conf and neutron-db-manage upgrade head worked like a charm for an empty schema. Ihar, could you please elaborate on what changes to oslo.db are needed? (as an oslo.db developer I'm very interested in this part :) ) Thanks, Roman On Wed, Jul 9, 2014 at 5:43 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 15:40, Sean Dague wrote: On 07/09/2014 09:00 AM, Roman Podoliaka wrote: Hi Ihar, AFAIU, the switch is a matter of pip install + specifying the correct db URI in the config files. I'm not sure why you are filing a spec in Neutron project. IMHO, this has nothing to do with projects, but rather a purely deployment question. E.g. don't we have PostgreSQL+psycopg2 or MySQL+pymysql deployments of OpenStack right now? I think what you really want is to change the defaults we test in the gate, which is a different problem. Because this is really a *new* driver. As you can see by the attempted run, it doesn't work with alembic given the definitions that neutron has. So it's not like this is currently compatible with OpenStack code. Well, to fix that, you just need to specify raise_on_warnings=False for connection (it's default for mysqldb but not mysql-connector). I've done it in a devstack patch for now, but probably it belongs to oslo.db. This is also semi-my fault, as mysqlconnector apparently defaults this to False now, but for some reason the SQLAlchemy mysqlconnector dialect is flipping it to True (this dialect was contributed by MySQL-connector's folks, so not sure why the inconsistency; perhaps they changed their minds). Thanks, Roman On Wed, Jul 9, 2014 at 2:17 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1].
Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easy Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around possible switch in context of Nova that revealed some concerns, though they do not seem to be documented anywhere.
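For reference, the raise_on_warnings workaround discussed above maps onto SQLAlchemy's connect_args, which passes extra keyword arguments straight through to the DBAPI connect() call. A sketch with placeholder credentials:

    from sqlalchemy import create_engine

    # The mysql-connector SQLAlchemy dialect flips raise_on_warnings to
    # True; forcing it back off restores the MySQLdb-era behaviour.
    engine = create_engine(
        "mysql+mysqlconnector://neutron:secret@127.0.0.1/neutron",
        connect_args={"raise_on_warnings": False},
    )

In a deployed service the same effect would have to come from wherever the engine is built, which is why the thread argues the fix belongs in oslo.db rather than in each project.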
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. 
That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easy Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around possible switch in context of Nova that revealed some concerns, though they do not seem to be documented anywhere. So if you know anything about it, please comment. So, we'd like to hear from other projects what's your take on that move, whether you see any issues or have concerns about it. Thanks for your comments, /Ihar [1]: https://review.openstack.org/#/c/104905/ [2]: https://review.openstack.org/#/c/105209/
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of sqlalchemy is not the main bottleneck across projects. Vish P.S. The performance in all cases was abysmal, so performance work definitely needs to be done, but just the guess that replacing our mysql library is going to solve all of our performance problems appears to be incorrect at first blush. On Jul 11, 2014, at 10:20 AM, Clark Boylan clark.boy...@gmail.com wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results.
With mysqldb: 215.81 sec With mysql-connector: 88.66 sec ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one.
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of Do you have some numbers? Seems to be slightly slower doesn't really stand up as an argument against the numbers that have been posted in this thread. sqlalchemy is not the main bottleneck across projects. Vish P.S. The performanace in all cases was abysmal, so performance work definitely needs to be done, but just the guess that replacing our mysql library is going to solve all of our performance problems appears to be incorrect at first blush. The motivation is still mostly deadlock relief but more performance work should be done. I agree with you there. I'm still hopeful for some improvement from this. On Jul 11, 2014, at 10:20 AM, Clark Boylan clark.boy...@gmail.com wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. 
The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 sec ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2].
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
Clark, You make a good point. Is there some resistance to this, or is it just a matter of asking? Carl On Jul 11, 2014 12:23 PM, Clark Boylan clark.boy...@gmail.com wrote: Before we get too far ahead of ourselves mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate oracle will need to publish mysql-connector on pypi properly. That said there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in mysqldb module, meaning when a db lock is encountered, the library does not yield to the next green thread, allowing other threads to eventually unlock the grabbed lock, and instead it just blocks the main thread, that eventually raises timeout exception (OperationalError). The failed operation is not retried, leaving failing request not served. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to oslo.db guys, we now have more or less clear view on the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably MySQL Connector module that is an official MySQL client for Python and that shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec With mysql-connector: 88.66 sec ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL.
The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service in the example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we may still leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easily Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around a possible switch in the context of Nova that revealed some concerns, though they do not seem to be documented anywhere.
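A minimal sketch of the blocking behavior described above (not from the thread itself): under eventlet, a pure-Python driver such as PyMySQL goes through the monkey-patched socket module and yields on I/O, while a C extension like MySQLdb does not, so its queries serialize the whole process. The host and credentials are placeholders, and a reachable MySQL server is assumed.

import time

import eventlet
eventlet.monkey_patch()  # patch sockets before the driver is imported

import pymysql  # pure Python, so its socket I/O is green


def slow_query():
    # SELECT SLEEP(2) keeps the connection busy server-side for 2 seconds.
    conn = pymysql.connect(host='127.0.0.1', user='root', password='secret')
    with conn.cursor() as cur:
        cur.execute('SELECT SLEEP(2)')
    conn.close()


start = time.time()
pool = eventlet.GreenPool()
for _ in range(2):
    pool.spawn(slow_query)
pool.waitall()
# Roughly 2s here because the two queries overlap; with a non-yielding C
# driver the same pattern takes roughly 4s because they run one at a time.
print('elapsed: %.1fs' % (time.time() - start))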
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 07/11/2014 08:04 AM, Ihar Hrachyshka wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in the mysqldb module: when a db lock is encountered, the library does not yield to the next green thread (which would allow other threads to eventually release the grabbed lock) and instead blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffers from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable to Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db guys, we now have a more or less clear view of the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is the official MySQL client for Python and shows some (preliminary) good results in terms of performance. I've done additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec. With mysql-connector: 88.66 sec. A ~2.4x performance boost, ok? ;) That really doesn't tell me much. Please remember that performance != scalability. If you showed the test/benchmark code, that would be great. You need to run your benchmarks at varying levels of concurrency and varying levels of read/write ratios for the workers. Otherwise it's like looking at a single dot of paint on a painting. Without looking at the patterns of throughput (performance) and concurrency/locking (scalability) with various levels of workers and read/write ratios, you miss the whole picture. Another thing to ensure is that you are using real *processes*, not threads, so that you actually simulate a real OpenStack service like Nova or Neutron, which are multiplexed, not multi-threaded, and have a greenlet pool within each worker process. Best, -jay I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, the switch is just a matter of several fixes to oslo.db that would enable full support for the new driver (already supported by SQLAlchemy), plus a modified 'connection' string in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service in the example for Neutron [2].
While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we may still leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easily Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around a possible switch in the context of Nova that revealed some concerns, though they do not seem to be documented anywhere. So if you know anything about it, please comment. So, we'd like to hear from other projects: what's your take on that move, and do you see any issues or have concerns about it? Thanks for your comments, /Ihar [1]: https://review.openstack.org/#/c/104905/ [2]: https://review.openstack.org/#/c/105209/
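A sketch of the benchmark shape Jay is describing, with hypothetical do_read()/do_write() placeholders standing in for calls through the driver under test: real worker processes, swept across concurrency levels and read/write ratios, reporting throughput at each point rather than a single number.

import random
import time
from multiprocessing import Pool


def do_read():
    pass  # placeholder: a SELECT through the driver under test


def do_write():
    pass  # placeholder: an INSERT or UPDATE through the driver under test


def worker(args):
    n_ops, write_ratio = args
    start = time.time()
    for _ in range(n_ops):
        if random.random() < write_ratio:
            do_write()
        else:
            do_read()
    return n_ops / (time.time() - start)


def benchmark(n_ops=1000):
    # Sweep worker counts and read/write mixes instead of one data point.
    for workers in (1, 2, 4, 8, 16):
        for write_ratio in (0.0, 0.1, 0.5, 0.9):
            pool = Pool(processes=workers)  # real processes, not threads
            rates = pool.map(worker, [(n_ops, write_ratio)] * workers)
            pool.close()
            pool.join()
            print('workers=%-2d writes=%2d%% total ops/sec=%.0f'
                  % (workers, int(write_ratio * 100), sum(rates)))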
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 7/11/14, 7:26 PM, Carl Baldwin wrote: On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of sqlalchemy is not the main bottleneck across projects. Vish Do you have some numbers? "Seems to be slightly slower" doesn't really stand up as an argument against the numbers that have been posted in this thread. P.S. The performance in all cases was abysmal, so performance work definitely needs to be done, but just the guess that replacing our mysql library is going to solve all of our performance problems appears to be incorrect at first blush. The motivation is still mostly deadlock relief but more performance work should be done. I agree with you there. I'm still hopeful for some improvement from this. To identify performance that's alleviated by async you have to establish up front that IO blocking is the issue, which would entail having code that's blazing fast until you start running it against concurrent connections, at which point you can identify via profiling that IO operations are being serialized. This is a very specific issue. In contrast, to identify why some arbitrary openstack app is slow, my bet is that async is often not the big issue. Every day I look at openstack code and talk to people working on things, I see many performance issues that have nothing to do with concurrency, and as I detailed in my wiki page at https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy there is a long road to cleaning up all the excessive queries, hundreds of unnecessary rows and columns being pulled over the network, unindexed lookups, subquery joins, hammering of Python-intensive operations (often due to the nature of OS apps as lots and lots of tiny API calls) that can be cached. There's a clear path to tons better performance documented there and most of it is not about async - which means that successful async isn't going to solve all those issues.
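One way to establish the specific condition Mike describes, assuming run_query() is wired to a single I/O-bound call through the driver being examined: time the same workload at increasing green-thread counts. If elapsed time grows roughly linearly with the thread count, the I/O is being serialized by the driver; if it stays flat, the driver yields and the real bottleneck is elsewhere.

import time

import eventlet


def run_query():
    pass  # placeholder: one I/O-bound DB call through the driver under test


for concurrency in (1, 2, 4, 8, 16):
    pool = eventlet.GreenPool()
    start = time.time()
    for _ in range(concurrency):
        pool.spawn(run_query)
    pool.waitall()
    print('greenthreads=%-2d elapsed=%.2fs'
          % (concurrency, time.time() - start))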
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 07/11/2014 09:17 PM, Mike Bayer wrote: ... To identify performance that's alleviated by async you have to establish up front that IO blocking is the issue, which would entail having code that's blazing fast until you start running it against concurrent connections, at which point you can identify via profiling that IO operations are being serialized. This is a very specific issue. In contrast, to identify why some arbitrary openstack app is slow, my bet is that async is often not the big issue. Every day I look at openstack code and talk to people working on things, I see many performance issues that have nothing to do with concurrency, and as I detailed in my wiki page at https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy there is a long road to cleaning up all the excessive queries, hundreds of unnecessary rows and columns being pulled over the network, unindexed lookups, subquery joins, hammering of Python-intensive operations (often due to the nature of OS apps as lots and lots of tiny API calls) that can be cached. There's a clear path to tons better performance documented there and most of it is not about async - which means that successful async isn't going to solve all those issues. Yep, couldn't agree more. Frankly, the steps you outline in the wiki above are excellent examples of where we can make significant gains in both performance and scalability. In addition to those you listed, the underlying database schemas themselves, with the excessive use of large VARCHAR fields, BLOB fields for JSONified values, and the general bad strategy of bunching heavily-read fields with infrequently-read fields in the same tables, are also a source of poor overall database performance. Best, -jay
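As an illustration of Jay's last point, a sketch with made-up table and column names (not an actual OpenStack schema): split the small, heavily-read fields into a hot table and move the wide, rarely-read blobs into a companion table, so list/show queries touch only narrow rows.

from sqlalchemy import Column, ForeignKey, Integer, String, Text
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Instance(Base):
    # Hot table: narrow, heavily-read fields only.
    __tablename__ = 'instances'
    id = Column(Integer, primary_key=True)
    uuid = Column(String(36), nullable=False, index=True)
    state = Column(String(16), nullable=False)


class InstanceExtra(Base):
    # Cold table: JSONified values read only on a detailed show.
    __tablename__ = 'instance_extras'
    instance_id = Column(Integer, ForeignKey('instances.id'),
                         primary_key=True)
    metadata_json = Column(Text)  # instead of a BLOB on the hot table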
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 7/11/14, 11:26 PM, Jay Pipes wrote: Yep, couldn't agree more. Frankly, the steps you outline in the wiki above are excellent examples of where we can make significant gains in both performance and scalability. In addition to those you listed, the underlying database schemas themselves, with the excessive use of large VARCHAR fields, BLOB fields for JSONified values, and the general bad strategy of bunching heavily-read fields with infrequently-read fields in the same tables, are also a source of poor overall database performance. Well, the topic of schema modifications I actually left out of that document entirely for starters - I made a conscious choice to focus entirely on things that don't involve any apps changing any of their fundamental approaches or schemas... at least just yet! :) I'm hoping that as oslo.db improves and the patterns start to roll out, we can start working on schema design too. Because yeah, I've seen the giant lists of VARCHAR everything and just said, OK, well, we're going to have to get to that... just not right now :)
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 10 July 2014 00:59, Roman Podoliaka rpodoly...@mirantis.com wrote: Not sure what issues you are talking about, but I just replaced mysql with mysql+mysqlconnector in my db connection string in neutron.conf and neutron-db-manage upgrade head worked like a charm for an empty schema. Yep, I don't think we're far away from it being that simple. Most of the changes/work I've seen discussed is in shifting the test environments and suitably addressing everyone's performance/uncertainty concerns. Ihar, could you please elaborate on what changes to oslo.db are needed? (as an oslo.db developer I'm very interested in this part :) ) The changes I've been working on are (and most of these need oslo.db reviews):
https://review.openstack.org/#/c/104436/ Test that concurrent sqlalchemy transactions don't block (this test reproduces the core issue; there's an open question of where it belongs)
https://review.openstack.org/#/c/104425/ Add DBDuplicateEntry detection for the mysqlconnector driver
https://review.openstack.org/#/c/104428/ Allow tox tests with complex OS_TEST_DBAPI_CONNECTION URLs
https://review.openstack.org/#/c/104447/ Support OS_TEST_DBAPI_ADMIN_CONNECTION override
https://review.openstack.org/#/c/104430/ Don't drop pre-existing database before tests
https://github.com/zzzeek/sqlalchemy/pull/102 sqlalchemy mysql unicode improvement, which has various workarounds available to us in the meantime
- Gus
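For context on the DBDuplicateEntry review above, a hedged sketch of the general technique (not the actual patch): each DBAPI stringifies MySQL's duplicate-key error (errno 1062) differently, so the error-translation layer needs one pattern per driver to turn a raw IntegrityError into a uniform exception. The regexes below are illustrative, not the ones the patch uses.

import re

from sqlalchemy import exc as sqla_exc

# MySQLdb renders the error like: (1062, "Duplicate entry 'x' for key 'k'")
# mysqlconnector renders it like: 1062 (23000): Duplicate entry 'x' for key 'k'
DUP_PATTERNS = [
    re.compile(r"\(1062,.*Duplicate entry '(?P<value>.*)' for key '(?P<key>.*)'"),
    re.compile(r"1062.*Duplicate entry '(?P<value>.*)' for key '(?P<key>.*)'"),
]


class DBDuplicateEntry(Exception):
    """Stand-in for oslo.db's exception of the same name."""


def filter_integrity_error(error):
    # Translate a driver-specific duplicate-key failure into one exception
    # type that callers can catch regardless of which driver is configured.
    if isinstance(error, sqla_exc.IntegrityError):
        for pattern in DUP_PATTERNS:
            match = pattern.search(str(error))
            if match:
                raise DBDuplicateEntry(match.group('key'))
    raise error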
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 10/07/14 08:09, Angus Lees wrote: On 10 July 2014 00:59, Roman Podoliaka rpodoly...@mirantis.com wrote: Not sure what issues you are talking about, but I just replaced mysql with mysql+mysqlconnector in my db connection string in neutron.conf and neutron-db-manage upgrade head worked like a charm for an empty schema. Yep, I don't think we're far away from it being that simple. Most of the changes/work I've seen discussed is in shifting the test environments and suitably addressing everyone's performance/uncertainty concerns. Ihar, could you please elaborate on what changes to oslo.db are needed? AFAIK we should:
- set raise_on_warnings to False for mysqlconnector (the current default for SQLAlchemy is True, which is wrong and to be fixed, but that will need to wait for the next release, which is months away from us);
- set encoding=utf8 until SQLAlchemy 1.0 is released and we require it.
Plus the changes you already track (duplicate error detection and testing parity - the latter is probably not a requirement). (as an oslo.db developer I'm very interested in this part :) ) The changes I've been working on are (and most of these need oslo.db reviews):
https://review.openstack.org/#/c/104436/ Test that concurrent sqlalchemy transactions don't block (this test reproduces the core issue; there's an open question of where it belongs)
https://review.openstack.org/#/c/104425/ Add DBDuplicateEntry detection for the mysqlconnector driver
https://review.openstack.org/#/c/104428/ Allow tox tests with complex OS_TEST_DBAPI_CONNECTION URLs
https://review.openstack.org/#/c/104447/ Support OS_TEST_DBAPI_ADMIN_CONNECTION override
https://review.openstack.org/#/c/104430/ Don't drop pre-existing database before tests
https://github.com/zzzeek/sqlalchemy/pull/102 sqlalchemy mysql unicode improvement, which has various workarounds available to us in the meantime
- Gus
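Expressed directly against SQLAlchemy, the two settings above might look like the sketch below; credentials are placeholders, and the exact spelling oslo.db settles on may differ. raise_on_warnings=False restores MySQLdb-like behavior for the mysqlconnector dialect, and charset=utf8 pins the connection encoding until the SQLAlchemy-side unicode fix ships.

from sqlalchemy import create_engine

engine = create_engine(
    'mysql+mysqlconnector://user:secret@127.0.0.1/neutron?charset=utf8',
    # passed straight through to mysql.connector.connect()
    connect_args={'raise_on_warnings': False},
)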
[openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in the mysqldb module: when a db lock is encountered, the library does not yield to the next green thread (which would allow other threads to eventually release the grabbed lock) and instead blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffers from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable to Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db guys, we now have a more or less clear view of the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is the official MySQL client for Python and shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, the switch is just a matter of several fixes to oslo.db that would enable full support for the new driver (already supported by SQLAlchemy), plus a modified 'connection' string in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service in the example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we may still leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easily Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around a possible switch in the context of Nova that revealed some concerns, though they do not seem to be documented anywhere. So if you know anything about it, please comment. So, we'd like to hear from other projects: what's your take on that move, and do you see any issues or have concerns about it?
Thanks for your comments, /Ihar [1]: https://review.openstack.org/#/c/104905/ [2]: https://review.openstack.org/#/c/105209/
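For reference, the retry mechanism dismissed as a hack above has roughly this shape - a sketch, not Nova's actual decorator: catch the driver's deadlock error and re-run the whole DB API call.

import functools
import time

from sqlalchemy import exc as sqla_exc


def retry_on_deadlock(func, max_retries=5, delay=0.5):
    # Re-run func when MySQL reports errno 1213, 'Deadlock found when
    # trying to get lock; try restarting transaction'.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for attempt in range(max_retries):
            try:
                return func(*args, **kwargs)
            except sqla_exc.OperationalError as e:
                if 'Deadlock found' not in str(e) or attempt == max_retries - 1:
                    raise
                time.sleep(delay)  # back off, then retry the whole call
    return wrapper

Used as a plain @retry_on_deadlock decorator on a DB API method, this papers over the lost request but does not address the underlying blocking.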
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
Hi Ihar, AFAIU, the switch is a matter of pip install + specifying the correct db URI in the config files. I'm not sure why you are filing a spec in the Neutron project. IMHO, this has nothing to do with projects, but is rather a purely deployment question. E.g., don't we have PostgreSQL+psycopg2 or MySQL+pymysql deployments of OpenStack right now? I think what you really want is to change the defaults we test in the gate, which is a different problem. Thanks, Roman On Wed, Jul 9, 2014 at 2:17 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in the mysqldb module: when a db lock is encountered, the library does not yield to the next green thread (which would allow other threads to eventually release the grabbed lock) and instead blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffers from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable to Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db guys, we now have a more or less clear view of the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is the official MySQL client for Python and shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, the switch is just a matter of several fixes to oslo.db that would enable full support for the new driver (already supported by SQLAlchemy), plus a modified 'connection' string in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service in the example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we may still leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easily Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K.
It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around a possible switch in the context of Nova that revealed some concerns, though they do not seem to be documented anywhere. So if you know anything about it, please comment. So, we'd like to hear from other projects: what's your take on that move, and do you see any issues or have concerns about it? Thanks for your comments, /Ihar [1]: https://review.openstack.org/#/c/104905/ [2]: https://review.openstack.org/#/c/105209/
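In SQLAlchemy URL terms, the switch Roman describes amounts to changing the driver part of the URL; the same string goes into the 'connection' option of a service's configuration file. Credentials and host below are placeholders.

from sqlalchemy import create_engine

# Today's default: a bare 'mysql://' URL resolves to the MySQLdb C extension.
create_engine('mysql+mysqldb://neutron:secret@127.0.0.1/neutron')

# The eventlet-friendly pure-Python drivers discussed in this thread:
create_engine('mysql+mysqlconnector://neutron:secret@127.0.0.1/neutron')
create_engine('mysql+pymysql://neutron:secret@127.0.0.1/neutron')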
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 09/07/14 15:00, Roman Podoliaka wrote: Hi Ihar, AFAIU, the switch is a matter of pip install + specifying the correct db URI in the config files. I'm not sure why you are filing a spec in the Neutron project. IMHO, this has nothing to do with projects, but is rather a purely deployment question. E.g., don't we have PostgreSQL+psycopg2 or MySQL+pymysql deployments of OpenStack right now? The issue was raised in Neutron because it suffers a lot from those deadlocks, and because initially I saw the switch as local to this specific project, one that would lead other projects by example rather than as an enforced rule. I would be glad to put the spec in some other, better place. Do you know any? I don't know whether we have other MySQL deployments not using MySQLdb; if so, they are not configured as per the official documentation. See:
- http://docs.openstack.org/icehouse/install-guide/install/yum/content/basics-database-controller.html
- http://docs.openstack.org/icehouse/install-guide/install/yum/content/basics-database-node.html
- http://docs.openstack.org/icehouse/install-guide/install/yum/content/neutron-ml2-controller-node.html
If we want people to use a new module, we need to update the docs not to refer to MySQLdb, or at least inform them that deadlocks may occur when using the library [and I think this is enough to just remove any references to the library from the documentation]. I think what you really want is to change the defaults we test in the gate, which is a different problem. It's lots of different things to do:
- some work to be tracked in oslo.db (and the old db code from incubator?);
- update neutron code if needed (though preliminary testing didn't reveal any specific changes for this project, while others may need trivial updates);
- switch defaults in devstack;
- update documentation to refer to the new library.
Thanks, Roman On Wed, Jul 9, 2014 at 2:17 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in the mysqldb module: when a db lock is encountered, the library does not yield to the next green thread (which would allow other threads to eventually release the grabbed lock) and instead blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffers from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable to Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db guys, we now have a more or less clear view of the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is the official MySQL client for Python and shows some (preliminary) good results in terms of performance.
I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, the switch is just a matter of several fixes to oslo.db that would enable full support for the new driver (already supported by SQLAlchemy), plus a modified 'connection' string in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service in the example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we may still leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easily Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around a possible switch in the context of Nova that revealed some concerns, though they do not seem to be documented anywhere.
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 07/09/2014 09:00 AM, Roman Podoliaka wrote: Hi Ihar, AFAIU, the switch is a matter of pip install + specifying the correct db URI in the config files. I'm not sure why you are filing a spec in the Neutron project. IMHO, this has nothing to do with projects, but is rather a purely deployment question. E.g., don't we have PostgreSQL+psycopg2 or MySQL+pymysql deployments of OpenStack right now? I think what you really want is to change the defaults we test in the gate, which is a different problem. Because this is really a *new* driver. As you can see from the attempted run, it doesn't work with alembic given the definitions that neutron has. So it's not like this is currently compatible with OpenStack code. Thanks, Roman On Wed, Jul 9, 2014 at 2:17 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in the mysqldb module: when a db lock is encountered, the library does not yield to the next green thread (which would allow other threads to eventually release the grabbed lock) and instead blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffers from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable to Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db guys, we now have a more or less clear view of the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is the official MySQL client for Python and shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, the switch is just a matter of several fixes to oslo.db that would enable full support for the new driver (already supported by SQLAlchemy), plus a modified 'connection' string in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service in the example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we may still leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one.
Though looking at how easily Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around a possible switch in the context of Nova that revealed some concerns, though they do not seem to be documented anywhere. So if you know anything about it, please comment. So, we'd like to hear from other projects: what's your take on that move, and do you see any issues or have concerns about it? Thanks for your comments, /Ihar [1]: https://review.openstack.org/#/c/104905/ [2]: https://review.openstack.org/#/c/105209/ -- Sean Dague http://dague.net
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 09/07/14 15:40, Sean Dague wrote: On 07/09/2014 09:00 AM, Roman Podoliaka wrote: Hi Ihar, AFAIU, the switch is a matter of pip install + specifying the correct db URI in the config files. I'm not sure why you are filing a spec in the Neutron project. IMHO, this has nothing to do with projects, but is rather a purely deployment question. E.g., don't we have PostgreSQL+psycopg2 or MySQL+pymysql deployments of OpenStack right now? I think what you really want is to change the defaults we test in the gate, which is a different problem. Because this is really a *new* driver. As you can see from the attempted run, it doesn't work with alembic given the definitions that neutron has. So it's not like this is currently compatible with OpenStack code. Well, to fix that, you just need to specify raise_on_warnings=False for the connection (it's the default for mysqldb but not for mysql-connector). I've done it in a devstack patch for now, but it probably belongs in oslo.db. Thanks, Roman On Wed, Jul 9, 2014 at 2:17 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in the mysqldb module: when a db lock is encountered, the library does not yield to the next green thread (which would allow other threads to eventually release the grabbed lock) and instead blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffers from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable to Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db guys, we now have a more or less clear view of the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is the official MySQL client for Python and shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, the switch is just a matter of several fixes to oslo.db that would enable full support for the new driver (already supported by SQLAlchemy), plus a modified 'connection' string in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service in the example for Neutron [2].
While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we may still leave several projects for K in case any issues arise, similar to the way projects switched to oslo.messaging during two cycles instead of one. Though looking at how easily Neutron can be switched to the new library, I wouldn't expect any issues that would postpone the switch till K. It was mentioned in comments to the spec proposal that there were some discussions at the latest summit around a possible switch in the context of Nova that revealed some concerns, though they do not seem to be documented anywhere. So if you know anything about it, please comment. So, we'd like to hear from other projects: what's your take on that move, and do you see any issues or have concerns about it? Thanks for your comments, /Ihar [1]: https://review.openstack.org/#/c/104905/ [2]: https://review.openstack.org/#/c/105209/
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On 09/07/14 16:59, Roman Podoliaka wrote: Hi all, Not sure what issues you are talking about, but I just replaced mysql with mysql+mysqlconnector in my db connection string in neutron.conf and neutron-db-manage upgrade head worked like a charm for an empty schema. Have you enabled the metering plugin, as devstack in the gate does? That's what fails. If it's not enabled, everything succeeds, as you've mentioned. If it's enabled, the mysqlconnector driver raises an error on a warning because a 'CREATE TABLE IF NOT EXISTS' statement is used. The statement raises a warning because some backends do not support it. Though I don't think there's a way to rewrite the migration rule without using the statement: alembic.op.create_table does not generate IF NOT EXISTS, and we need it for offline migration, where we can't check table presence at runtime. Ihar, could you please elaborate on what changes to oslo.db are needed? (as an oslo.db developer I'm very interested in this part :) ) You can find some patches on Angus' dashboard [1]. As for the change I refer to, we need to set raise_on_warnings for the mysqlconnector dialect in create_engine(). [1]: https://review.openstack.org/#/q/owner:%22Angus+Lees+%253Cgus%2540inodes.org%253E%22+status:open,n,z Thanks, Roman On Wed, Jul 9, 2014 at 5:43 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 15:40, Sean Dague wrote: On 07/09/2014 09:00 AM, Roman Podoliaka wrote: Hi Ihar, AFAIU, the switch is a matter of pip install + specifying the correct db URI in the config files. I'm not sure why you are filing a spec in the Neutron project. IMHO, this has nothing to do with projects, but is rather a purely deployment question. E.g., don't we have PostgreSQL+psycopg2 or MySQL+pymysql deployments of OpenStack right now? I think what you really want is to change the defaults we test in the gate, which is a different problem. Because this is really a *new* driver. As you can see from the attempted run, it doesn't work with alembic given the definitions that neutron has. So it's not like this is currently compatible with OpenStack code. Well, to fix that, you just need to specify raise_on_warnings=False for the connection (it's the default for mysqldb but not for mysql-connector). I've done it in a devstack patch for now, but it probably belongs in oslo.db. Thanks, Roman On Wed, Jul 9, 2014 at 2:17 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with mysql servers. In essence, the problem is due to missing eventlet support in the mysqldb module: when a db lock is encountered, the library does not yield to the next green thread (which would allow other threads to eventually release the grabbed lock) and instead blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffers from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable to Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno.
We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db guys, we now have a more or less clear view of the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is the official MySQL client for Python and shows some (preliminary) good results in terms of performance. I've posted a Neutron spec for the switch to the new client in Juno at [1]. Ideally, the switch is just a matter of several fixes to oslo.db that would enable full support for the new driver (already supported by SQLAlchemy), plus a modified 'connection' string in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivially the switch can be achieved for a service in the example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things.
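To make the migration constraint above concrete, a hedged sketch of the pattern in question (the table and columns are illustrative, not the actual metering migration): op.create_table cannot emit IF NOT EXISTS, so the rule falls back to raw SQL, and that raw statement is exactly what mysqlconnector turns into an error while raise_on_warnings is enabled.

from alembic import op


def upgrade():
    # op.create_table('meteringlabels', ...) would emit a plain CREATE
    # TABLE, which fails when the table already exists. The raw form also
    # works in offline (--sql) mode, where table presence cannot be
    # checked at runtime:
    op.execute('CREATE TABLE IF NOT EXISTS meteringlabels '
               '(id VARCHAR(36) NOT NULL, PRIMARY KEY (id))')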