Hi everyone,
A new change (https://review.openstack.org/101982) has been proposed to improve vxlan pool initiation; it deletes obsolete unallocated vnis with a single DELETE SQL statement.

I've tested performance with the following (delete-only) scenario: the vxlan range is changed from 0:100000 to 50000:100000. The scenario code is available here: http://paste.openstack.org/show/84882

50k vnis to delete (time in seconds)   PostgreSQL   MySQL   SQLite
current code                                  6.0     5.5      5.1
proposed code                                 3.2     3.3      3.2

The gain is from 40% to 50%. Raw results: http://paste.openstack.org/show/84890

On Mon, Jun 9, 2014 at 3:38 PM, Eugene Nikanorov <enikano...@mirantis.com> wrote:
> Mike,
>
> Thanks a lot for your response! Some comments:
>
>> There's some in-Python filtering following it which does not seem
>> necessary; the "alloc.vxlan_vni not in vxlan_vnis" phrase could just
>> as well be a SQL "NOT IN" expression.
>
> There we have to do a set intersection between the configured ranges
> and the existing allocations. That could be done in SQL, but it would
> certainly lead to a huge SQL query text, as a full vxlan range can
> consist of 16 million ids.
>
>> The synchronize_session="fetch" is certainly a huge part of the time
>> spent here
>
> You've made a good point about synchronize_session="fetch", which was
> obviously misused by me. Removing it seems to save up to 40% of the
> plain deleting time.
>
> I've fixed that and got some speedup on deletes for both MySQL and
> PostgreSQL, which reduced the difference between the chunked and
> non-chunked versions:
>
> 50k vnis to add/delete   Pg add   Pg delete   Pg total   MySQL add   MySQL delete   MySQL total
> non-chunked sql              22          15         37          15             15            30
> chunked in 100               20          13         33          14             14            28
>
> Results of the chunked and non-chunked versions look closer now, but
> the gap increases with vni range size (based on a few tests with a
> 150k vni range).
>
> So I'm going to fix the chunked version that is on review now. If you
> think the benefit isn't worth the complexity, please let me know.
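The single-DELETE approach described above can be sketched with SQLAlchemy roughly as follows. This is a minimal illustration, not the actual Neutron code: the `VxlanAllocation` model and the function name are hypothetical stand-ins.

```python
# Sketch of a single-statement delete of unallocated vnis that fall
# outside the configured ranges. Model and names are illustrative only.
from sqlalchemy import Boolean, Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class VxlanAllocation(Base):
    __tablename__ = "vxlanallocations"
    vxlan_vni = Column(Integer, primary_key=True)
    allocated = Column(Boolean, nullable=False, default=False)

def delete_unallocated_outside_ranges(session, vni_ranges):
    """Issue one DELETE for all unallocated vnis outside every
    configured (lo, hi) range; the database does the filtering."""
    query = session.query(VxlanAllocation).filter_by(allocated=False)
    for lo, hi in vni_ranges:
        query = query.filter(~VxlanAllocation.vxlan_vni.between(lo, hi))
    # no in-session objects depend on these rows, so skip synchronization
    return query.delete(synchronize_session=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    # vni 10 is marked allocated, so it must survive the cleanup
    session.add_all(VxlanAllocation(vxlan_vni=v, allocated=(v == 10))
                    for v in range(100))
    session.commit()
    deleted = delete_unallocated_outside_ranges(session, [(50, 99)])
    session.commit()
    print(deleted)  # 49: vnis 0..49 except the allocated vni 10
```

The point of the single statement is that no id list ever round-trips through Python; the trade-off Eugene mentions is that expressing the configured ranges in SQL must stay compact (ranges, not 16 million literal ids).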
>
> Thanks,
> Eugene.
>
> On Mon, Jun 9, 2014 at 1:33 AM, Mike Bayer <mba...@redhat.com> wrote:
>>
>> On Jun 7, 2014, at 4:38 PM, Eugene Nikanorov <enikano...@mirantis.com> wrote:
>>
>>> Hi folks,
>>>
>>> There was a small discussion about the best way of doing SQL
>>> operations for vni synchronization with the config. The initial
>>> proposal was to handle those in chunks; Carl also suggested issuing
>>> a single SQL query.
>>> I did some testing with MySQL and PostgreSQL.
>>> I tested the following scenario: the vxlan range is changed from
>>> 50000:150000 to 0:100000 and vice versa. That involves adding and
>>> deleting 50000 vnis in each test.
>>>
>>> Here are the numbers (time in seconds):
>>>
>>> 50k vnis to add/delete   Pg add   Pg delete   Pg total   MySQL add   MySQL delete   MySQL total
>>> non-chunked sql              23          22         45          14             20            34
>>> chunked in 100               20          17         37          14             17            31
>>>
>>> I did about 5 tries for each number to minimize random variation
>>> (due to swap, disk or CPU activity, or other factors).
>>> It might be surprising that issuing multiple SQL statements instead
>>> of one big one is a little more efficient, so I would appreciate it
>>> if someone could reproduce those numbers.
>>> I'd also like to note that the part of the code that iterates over
>>> vnis fetched from the db takes 10 seconds on both MySQL and
>>> PostgreSQL and is included in the "deleting vnis" numbers. In other
>>> words, the difference between multiple DELETE statements and a
>>> single one is even bigger (in percent) than these numbers show.
>>>
>>> The code which I used to test is here:
>>> http://paste.openstack.org/show/83298/
>>> Right now the chunked version is commented out, so to switch between
>>> versions some lines should be commented out and others uncommented.
>>
>> I've taken a look at this, though I'm not at the point where I have
>> things set up to run things like this within full context, and I
>> don't know that I have any definitive statements to make, but I do
>> have some suggestions:
>>
>> 1.
>> I do tend to chunk things a lot (selects, deletes, inserts), though
>> the chunk size I work with is typically more like 1000, rather than
>> 100. When chunking, we're looking to select a size that doesn't tend
>> to overload the things that are receiving the data (query buffers,
>> structures internal to SQLAlchemy as well as to the DBAPI and the
>> relational database), but at the same time doesn't lead to too much
>> repetition on the Python side (where of course there's a lot of
>> slowness).
>>
>> 2. Specifically regarding "WHERE x IN (...)", I always chunk those.
>> When we use IN with a list of values, we're building an actual SQL
>> string that becomes enormous. This puts strain on the database's
>> query engine, which is not optimized for SQL strings that are
>> hundreds of thousands of characters long, and on some backends this
>> size is limited; on Oracle, there's a limit of 1000 items in an IN
>> list. So I'd always chunk this kind of thing.
>>
>> 3. I'm not sure of the broader context of this code, but in fact
>> placing a literal list of items in the IN in this case seems
>> unnecessary; the "vmis_to_remove" list itself was just SELECTed two
>> lines above. There's some in-Python filtering following it which does
>> not seem necessary; the "alloc.vxlan_vni not in vxlan_vnis" phrase
>> could just as well be a SQL "NOT IN" expression. I'm not sure if
>> determination of the ".allocated" flag can be done in SQL; if that's
>> a plain column, then certainly. Again, not sure if this is just an
>> artifact of how the test is done here, but if the goal is to optimize
>> this code for speed, doing a DELETE ... WHERE ... IN (SELECT ...) is
>> probably better. I see that the SELECT is using a lockmode, but it
>> would seem that if just the rows we care to DELETE are inlined within
>> the DELETE itself, this wouldn't be needed either.
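The chunked "WHERE x IN (...)" deletion that points 1 and 2 recommend could look roughly like this; `chunked_delete`, the `Alloc` model, and the batch size are made-up names for illustration, not code from the patch under review.

```python
# Illustrative chunked delete: split a large id list into batches so no
# single statement carries an enormous literal IN (...) list.
from itertools import islice

from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Alloc(Base):
    __tablename__ = "allocs"
    vni = Column(Integer, primary_key=True)

def chunks(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

def chunked_delete(session, values, size=1000):
    """DELETE rows whose vni is IN `values`, `size` ids per statement."""
    total = 0
    for batch in chunks(values, size):
        total += (session.query(Alloc)
                  .filter(Alloc.vni.in_(batch))
                  .delete(synchronize_session=False))
    return total

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all(Alloc(vni=v) for v in range(5000))
    session.commit()
    # 3000 ids, batch size 1000: three moderate DELETE statements
    deleted = chunked_delete(session, list(range(3000)), size=1000)
    session.commit()
    print(deleted)  # 3000
```

Each batch keeps the rendered SQL string bounded (and stays under limits like Oracle's 1000-item IN list), while a batch size in the hundreds or thousands amortizes the per-statement Python overhead.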
>>
>> It's likely that everything in #3 is pretty obvious already and there
>> are reasons it's the way it is, but I'm just learning all of these
>> codebases, so feel free to point out more of the background for me.
>>
>> 4. The synchronize_session="fetch" is certainly a huge part of the
>> time spent here, and it seems unclear why this synchronize is
>> necessary. When I use query.delete() I never use "fetch"; I either
>> have synchronization turned off, as the operation is not dealing with
>> any set of objects already in play, or I use "evaluate", which here
>> is not possible with the IN (though there has been a SQLAlchemy
>> ticket for many years to implement "evaluate" with "IN (values)",
>> which is pretty easy to implement; but if the query became an
>> "IN (SELECT ...)", that again would not be feasible).
>>
>> 5. I don't have a great theory on why chunking does better here on
>> the INSERT. My vague notion is that, as with the DELETE, the systems
>> in play do better when they aren't tasked with building up very large
>> internal buffers for operations, but that's not something I have the
>> background to prove.
>>
>> These are all just some impressions, and as I'm totally new to this
>> code base I may be way off, so please feel free to help me get up to
>> speed!
>>
>> - mike
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
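Points 3 and 4 taken together suggest inlining the selection into the DELETE as an "IN (SELECT ...)" subquery and turning session synchronization off entirely. A minimal sketch under those assumptions (the `Alloc` model is hypothetical, not the Neutron code being discussed):

```python
# Sketch: DELETE ... WHERE vni IN (SELECT ...) with no session
# synchronization; no id list round-trips through Python.
from sqlalchemy import Boolean, Column, Integer, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Alloc(Base):
    __tablename__ = "allocs"
    vni = Column(Integer, primary_key=True)
    allocated = Column(Boolean, nullable=False, default=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # even vnis allocated, odd vnis unallocated
    session.add_all(Alloc(vni=v, allocated=(v % 2 == 0))
                    for v in range(10))
    session.commit()

    # the subquery picks the unallocated vnis; the DELETE consumes it
    # directly instead of a SELECT followed by a literal IN list
    subq = select(Alloc.vni).where(Alloc.allocated.is_(False))
    deleted = (session.query(Alloc)
               .filter(Alloc.vni.in_(subq))
               .delete(synchronize_session=False))
    session.commit()
    print(deleted)  # 5: the odd, unallocated vnis
```

With synchronize_session=False the session's identity map is simply not updated, which is safe when, as Mike says, the operation isn't dealing with any objects already in play. Note that "evaluate" cannot handle the subquery form, which is exactly the limitation point 4 describes.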