On 07/22/2015 03:41 PM, Morgan Fainberg wrote:
This is an indicator that the bottleneck is not the db strictly speaking, but also related to the way we match. This means we need to spend some serious cycles on improving both the stored record(s) for revocation events and the matching algorithm.

The simplest approach to revocation checking is to do a linear search through the events. I think the old version of the code that did that is in a code review, and I will pull it out.

If we remove the tree, then the matching will have to run through each of the records and see if there is a match; the test will be linear in the number of records (slightly shorter when a token is actually revoked, since the search can stop at the first match).
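
To make the cost concrete, here is a minimal sketch of what a linear check could look like. This is not the keystone code from the review: RevocationEvent, Token, event_matches and is_revoked are hypothetical names, and the event fields are a simplified assumption about what a revocation record holds.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class RevocationEvent:
    # Hypothetical, simplified record. A field left as None acts as a
    # wildcard and matches any token value.
    issued_before: datetime
    user_id: Optional[str] = None
    project_id: Optional[str] = None


@dataclass(frozen=True)
class Token:
    # Hypothetical, simplified view of the token attributes being matched.
    user_id: str
    project_id: Optional[str]
    issued_at: datetime


def event_matches(event: RevocationEvent, token: Token) -> bool:
    """Return True if this single event revokes the token."""
    if event.user_id is not None and event.user_id != token.user_id:
        return False
    if event.project_id is not None and event.project_id != token.project_id:
        return False
    # Only tokens issued at or before the cutoff are affected.
    return token.issued_at <= event.issued_before


def is_revoked(events: Iterable[RevocationEvent], token: Token) -> bool:
    """Linear scan: any() stops at the first matching event."""
    return any(event_matches(event, token) for event in events)
```

A token that is not revoked pays for a comparison against every event, which is the common case on each validation, so the size of the event table matters directly.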

On Jul 22, 2015, at 11:51, Matt Fischer <[email protected]> wrote:

Dolph,

Per our IRC discussion, I was unable to see any performance improvement here, although not calling DELETE so often will reduce the number of deadlocks when we're under heavy load, especially given the globally replicated DB we use.



On Tue, Jul 21, 2015 at 5:26 PM, Dolph Mathews <[email protected]> wrote:

    Well, you might be in luck! Morgan Fainberg actually implemented
    an improvement that was apparently documented by Adam Young way
    back in March:

    https://bugs.launchpad.net/keystone/+bug/1287757

    There's a link to the stable/kilo backport in comment #2 - I'd be
    eager to hear how it performs for you!

    On Tue, Jul 21, 2015 at 5:58 PM, Matt Fischer
    <[email protected]> wrote:

        Dolph,

        Excuse the delayed reply; I was waiting for a brilliant
        solution from someone. Without one, personally I'd prefer the
        cronjob, as it seems to be the type of thing cron was designed
        for. That will be a painful change, as people now rely on this
        behavior, so I don't know if it's feasible. I will be setting
        up monitoring for the revocation count, with an alert if it
        crosses probably 500 or so. If the problem gets worse, then I
        think a custom no-op or sql driver is the next step.

        Thanks.


        On Wed, Jul 15, 2015 at 4:00 PM, Dolph Mathews
        <[email protected]> wrote:



            On Wed, Jul 15, 2015 at 4:51 PM, Matt Fischer
            <[email protected]> wrote:

                I'm having some issues with keystone revocation
                events. The bottom line is that due to the way
                keystone handles the clean-up of these events[1],
                having more than a few leads to:

                 - bad performance, up to 2x slower token validation
                with about 600 events based on my perf measurements.
                 - database deadlocks, which cause API calls to fail,
                more likely with more events it seems

                I am seeing this behavior in code from trunk on June
                11 using Fernet tokens, but the token backend does
                not seem to make a difference.

                Here's what happens to the db in terms of deadlock:
                2015-07-15 21:25:41.082 31800 TRACE
                keystone.common.wsgi DBDeadlock: (OperationalError)
                (1213, 'Deadlock found when trying to get lock; try
                restarting transaction') 'DELETE FROM
                revocation_event WHERE revocation_event.revoked_at <
                %s' (datetime.datetime(2015, 7, 15, 18, 55, 41, 55186),)

                When this starts happening, I just go truncate the
                table, but this is not ideal. If [1] is really true,
                then the design is not great: it sounds like keystone
                is doing a revocation event clean-up on every token
                validation call. Reading and deleting/locking from my
                db cluster is not something I want to do on every
                validate call.


            Unfortunately, that's *exactly* what keystone is doing.
            Adam and I had a conversation about this problem in
            Vancouver which directly resulted in opening the bug
            referenced on the operator list:

            https://bugs.launchpad.net/keystone/+bug/1456797

            Neither of us remembered the actual implemented behavior,
            which is what you've run into and Deepti verified in the
            bug's comments.


                So, can I turn off token revocation for now? I didn't
                see an obvious no-op driver.


            Not sure how, other than writing your own no-op driver,
            or perhaps an extended driver that doesn't try to clean
            the table on every read?

                And in the long run, can this be fixed? I'd rather do
                almost anything else, including writing a cronjob,
                than what happens now.


            If anyone has a better solution than the current one,
            that's also better than requiring a cron job on something
            like keystone-manage revocation_flush, I'd love to hear it.


                [1] - http://lists.openstack.org/pipermail/openstack-operators/2015-June/007210.html

                
                __________________________________________________________________________
                OpenStack Development Mailing List (not for usage questions)
                Unsubscribe: [email protected]?subject:unsubscribe
                http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



            



        


