Hi everyone,
A new change (https://review.openstack.org/101982) has been proposed to improve vxlan pool initiation; it deletes obsolete unallocated vnis with a single DELETE SQL statement.

I've tested performance with the following (delete-only) scenario: the vxlan range is changed from 0:100000 to 50000:100000. The scenario code is available here: http://paste.openstack.org/show/84882

50k vnis to delete (time in seconds)   PostgreSQL   MySQL   SQLite
current code                                  6.0     5.5      5.1
proposed code                                 3.2     3.3      3.2

The gain is from 40% to 50%. Raw results: http://paste.openstack.org/show/84890

On Mon, Jun 9, 2014 at 3:38 PM, Eugene Nikanorov <enikano...@mirantis.com> wrote:
> Mike,
>
> Thanks a lot for your response! Some comments:
>
>> There's some in-Python filtering following it which does not seem
>> necessary; the "alloc.vxlan_vni not in vxlan_vnis" phrase could just
>> as well be a SQL "NOT IN" expression.
>
> There we have to do a set intersection between the configured ranges
> and the existing allocations. That could be done in SQL, but it would
> certainly lead to a huge SQL query text, as a full vxlan range can
> consist of 16 million ids.
>
>> The synchronize_session="fetch" is certainly a huge part of the time
>> spent here
>
> You've made a good point about synchronize_session="fetch", which was
> obviously misused by me. Removing it seems to save up to 40% of the
> plain deleting time.
>
> I've fixed that and got some speedup on deletes for both MySQL and
> PostgreSQL, which reduced the difference between the chunked and
> non-chunked versions:
>
> 50k vnis to add/delete   Pg add   Pg delete   Pg total   MySQL add   MySQL delete   MySQL total
> non-chunked sql              22          15         37          15             15            30
> chunked in 100               20          13         33          14             14            28
>
> Results of the chunked and non-chunked versions look closer now, but
> the gap increases with vni range size (based on a few tests with a
> 150k vni range).
>
> So I'm going to fix the chunked version that is on review now. If you
> think the benefit isn't worth the complexity, please let me know.
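The single-DELETE approach described above can be sketched with SQLAlchemy roughly as follows. This is a minimal illustration, not the actual Neutron code: the `VxlanAllocation` model and the function name are hypothetical stand-ins.

```python
# Sketch of a single-statement delete of unallocated vnis that fall
# outside the configured ranges. Model and names are illustrative only.
from sqlalchemy import Boolean, Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class VxlanAllocation(Base):
    __tablename__ = "vxlanallocations"
    vxlan_vni = Column(Integer, primary_key=True)
    allocated = Column(Boolean, nullable=False, default=False)

def delete_unallocated_outside_ranges(session, vni_ranges):
    """Issue one DELETE for all unallocated vnis outside every
    configured (lo, hi) range; the database does the filtering."""
    query = session.query(VxlanAllocation).filter_by(allocated=False)
    for lo, hi in vni_ranges:
        query = query.filter(~VxlanAllocation.vxlan_vni.between(lo, hi))
    # no in-session objects depend on these rows, so skip synchronization
    return query.delete(synchronize_session=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    # vni 10 is marked allocated, so it must survive the cleanup
    session.add_all(VxlanAllocation(vxlan_vni=v, allocated=(v == 10))
                    for v in range(100))
    session.commit()
    deleted = delete_unallocated_outside_ranges(session, [(50, 99)])
    session.commit()
    print(deleted)  # 49: vnis 0..49 except the allocated vni 10
```

The point of the single statement is that no id list ever round-trips through Python; the trade-off Eugene mentions is that expressing the configured ranges in SQL must stay compact (ranges, not 16 million literal ids).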
>
> Thanks,
> Eugene.
>
> On Mon, Jun 9, 2014 at 1:33 AM, Mike Bayer <mba...@redhat.com> wrote:
>>
>> On Jun 7, 2014, at 4:38 PM, Eugene Nikanorov <enikano...@mirantis.com> wrote:
>>
>>> Hi folks,
>>>
>>> There was a small discussion about the best way of doing SQL
>>> operations for vni synchronization with the config. The initial
>>> proposal was to handle those in chunks; Carl also suggested issuing
>>> a single SQL query.
>>> I did some testing with MySQL and PostgreSQL.
>>> I tested the following scenario: the vxlan range is changed from
>>> 50000:150000 to 0:100000 and vice versa. That involves adding and
>>> deleting 50000 vnis in each test.
>>>
>>> Here are the numbers (time in seconds):
>>>
>>> 50k vnis to add/delete   Pg add   Pg delete   Pg total   MySQL add   MySQL delete   MySQL total
>>> non-chunked sql              23          22         45          14             20            34
>>> chunked in 100               20          17         37          14             17            31
>>>
>>> I did about 5 tries for each number to minimize random variation
>>> (due to swap, disk or CPU activity, or other factors).
>>> It might be surprising that issuing multiple SQL statements instead
>>> of one big one is a little more efficient, so I would appreciate it
>>> if someone could reproduce those numbers.
>>> I'd also like to note that the part of the code that iterates over
>>> vnis fetched from the db takes 10 seconds on both MySQL and
>>> PostgreSQL and is included in the "deleting vnis" numbers. In other
>>> words, the difference between multiple DELETE statements and a
>>> single one is even bigger (in percent) than these numbers show.
>>>
>>> The code which I used to test is here:
>>> http://paste.openstack.org/show/83298/
>>> Right now the chunked version is commented out, so to switch between
>>> versions some lines should be commented out and others uncommented.
>>
>> I've taken a look at this, though I'm not at the point where I have
>> things set up to run things like this within full context, and I
>> don't know that I have any definitive statements to make, but I do
>> have some suggestions:
>>
>> 1.
>> I do tend to chunk things a lot (selects, deletes, inserts), though
>> the chunk size I work with is typically more like 1000, rather than
>> 100. When chunking, we're looking to select a size that doesn't tend
>> to overload the things that are receiving the data (query buffers,
>> structures internal to SQLAlchemy as well as to the DBAPI and the
>> relational database), but at the same time doesn't lead to too much
>> repetition on the Python side (where of course there's a lot of
>> slowness).
>>
>> 2. Specifically regarding "WHERE x IN (...)", I always chunk those.
>> When we use IN with a list of values, we're building an actual SQL
>> string that becomes enormous. This puts strain on the database's
>> query engine, which is not optimized for SQL strings that are
>> hundreds of thousands of characters long, and on some backends this
>> size is limited; on Oracle, there's a limit of 1000 items in an IN
>> list. So I'd always chunk this kind of thing.
>>
>> 3. I'm not sure of the broader context of this code, but in fact
>> placing a literal list of items in the IN in this case seems
>> unnecessary; the "vmis_to_remove" list itself was just SELECTed two
>> lines above. There's some in-Python filtering following it which does
>> not seem necessary; the "alloc.vxlan_vni not in vxlan_vnis" phrase
>> could just as well be a SQL "NOT IN" expression. I'm not sure if
>> determination of the ".allocated" flag can be done in SQL; if that's
>> a plain column, then certainly. Again, not sure if this is just an
>> artifact of how the test is done here, but if the goal is to optimize
>> this code for speed, doing a DELETE ... WHERE ... IN (SELECT ...) is
>> probably better. I see that the SELECT is using a lockmode, but it
>> would seem that if just the rows we care to DELETE are inlined within
>> the DELETE itself, this wouldn't be needed either.
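The chunked "WHERE x IN (...)" deletion that points 1 and 2 recommend could look roughly like this; `chunked_delete`, the `Alloc` model, and the batch size are made-up names for illustration, not code from the patch under review.

```python
# Illustrative chunked delete: split a large id list into batches so no
# single statement carries an enormous literal IN (...) list.
from itertools import islice

from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Alloc(Base):
    __tablename__ = "allocs"
    vni = Column(Integer, primary_key=True)

def chunks(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

def chunked_delete(session, values, size=1000):
    """DELETE rows whose vni is IN `values`, `size` ids per statement."""
    total = 0
    for batch in chunks(values, size):
        total += (session.query(Alloc)
                  .filter(Alloc.vni.in_(batch))
                  .delete(synchronize_session=False))
    return total

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all(Alloc(vni=v) for v in range(5000))
    session.commit()
    # 3000 ids, batch size 1000: three moderate DELETE statements
    deleted = chunked_delete(session, list(range(3000)), size=1000)
    session.commit()
    print(deleted)  # 3000
```

Each batch keeps the rendered SQL string bounded (and stays under limits like Oracle's 1000-item IN list), while a batch size in the hundreds or thousands amortizes the per-statement Python overhead.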
>>
>> It's likely that everything in #3 is pretty obvious already and there
>> are reasons it's the way it is, but I'm just learning all of these
>> codebases, so feel free to point out more of the background for me.
>>
>> 4. The synchronize_session="fetch" is certainly a huge part of the
>> time spent here, and it seems unclear why this synchronize is
>> necessary. When I use query.delete() I never use "fetch"; I either
>> have synchronization turned off, as the operation is not dealing with
>> any set of objects already in play, or I use "evaluate", which here
>> is not possible with the IN (though there has been a SQLAlchemy
>> ticket for many years to implement "evaluate" with "IN (values)",
>> which is pretty easy to implement; but if the query became an
>> "IN (SELECT ...)", that again would not be feasible).
>>
>> 5. I don't have a great theory on why chunking does better here on
>> the INSERT. My vague notion is that, as with the DELETE, the systems
>> in play do better when they aren't tasked with building up very large
>> internal buffers for operations, but that's not something I have the
>> background to prove.
>>
>> These are all just some impressions, and as I'm totally new to this
>> code base I may be way off, so please feel free to help me get up to
>> speed!
>>
>> - mike
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
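Points 3 and 4 taken together suggest inlining the selection into the DELETE as an "IN (SELECT ...)" subquery and turning session synchronization off entirely. A minimal sketch under those assumptions (the `Alloc` model is hypothetical, not the Neutron code being discussed):

```python
# Sketch: DELETE ... WHERE vni IN (SELECT ...) with no session
# synchronization; no id list round-trips through Python.
from sqlalchemy import Boolean, Column, Integer, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Alloc(Base):
    __tablename__ = "allocs"
    vni = Column(Integer, primary_key=True)
    allocated = Column(Boolean, nullable=False, default=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # even vnis allocated, odd vnis unallocated
    session.add_all(Alloc(vni=v, allocated=(v % 2 == 0))
                    for v in range(10))
    session.commit()

    # the subquery picks the unallocated vnis; the DELETE consumes it
    # directly instead of a SELECT followed by a literal IN list
    subq = select(Alloc.vni).where(Alloc.allocated.is_(False))
    deleted = (session.query(Alloc)
               .filter(Alloc.vni.in_(subq))
               .delete(synchronize_session=False))
    session.commit()
    print(deleted)  # 5: the odd, unallocated vnis
```

With synchronize_session=False the session's identity map is simply not updated, which is safe when, as Mike says, the operation isn't dealing with any objects already in play. Note that "evaluate" cannot handle the subquery form, which is exactly the limitation point 4 describes.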