On 04/14/2011 01:13 PM, Jason L Connor wrote:
> On Thu, 2011-04-14 at 11:50 -0400, Jay Dobies wrote:
>> Why do we need to query the database for tasks? Can we keep the task
>> stuff in memory and snap out its state when it changes? Or are you
>> trying to solve the cross-process task question at the same time?
>
> There are a few reasons:
> 1. I want to get the task persistence stuff working; we can look at
>    what optimizations we need once it does

I'd argue this, but the other points make it not worth it.

> 2. I'm currently trying to keep the multi-process deployment option
>    open, and volatile memory storage is not conducive to that

Fair enough. You know that if you didn't keep that door open, we'd run
into a case where we need it.

> 3. I'm trying not to introduce any task state consistency bugs, at
>    least not initially
> 4. To be honest, dequeueing tasks, running tasks, timing out tasks,
>    and canceling tasks (i.e. what the dispatcher does) all represent
>    state changes, and most would have to hit the db anyway

That makes sense. One possibility would be to queue those state writes
up in memory and flush them to the database on a separate schedule,
along the lines of the sketch below. But that's probably overkill;
we're not in the business of designing uber Python tasking systems.
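Something like this is what I'm picturing, purely as an untested
sketch (the class name, the task_collection handle, and the flush
interval are all made up, not anything in the actual code):

    import threading

    class BufferedTaskStateWriter:
        """Collect task state changes in memory and flush them to the
        database on a fixed interval instead of one write per change."""

        def __init__(self, task_collection, flush_interval=5.0):
            self.task_collection = task_collection  # e.g. a pymongo collection
            self.flush_interval = flush_interval
            self._pending = {}  # task_id -> latest state document
            self._lock = threading.Lock()
            self._timer = None

        def record(self, task_id, state_doc):
            # Keep only the most recent state per task; intermediate
            # states get coalesced away, which is the whole point.
            with self._lock:
                self._pending[task_id] = state_doc
                if self._timer is None:
                    self._timer = threading.Timer(self.flush_interval,
                                                  self.flush)
                    self._timer.daemon = True
                    self._timer.start()

        def flush(self):
            with self._lock:
                pending, self._pending = self._pending, {}
                self._timer = None
            for task_id, doc in pending.items():
                self.task_collection.update_one(
                    {'_id': task_id}, {'$set': doc}, upsert=True)

The obvious downside is that a crash loses up to flush_interval worth
of state changes, which works against your point 3. So, like I said,
probably overkill.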
> I think that once the persistence stuff actually works I can revisit
> it looking for optimizations and features needed to support
> multi-process access (if we decide to go that route).
>
> In the meantime, I was thinking about a 30 second delay between task
> queue checks, with an on-demand dispatcher wake-up whenever a new
> task is enqueued. This should keep our async sub-system fairly
> responsive in terms of repo syncs and the like while keeping DB I/O
> down to something reasonable.

You lost me after the 30 seconds part. How will the on-demand
dispatcher handle situations where there are other tasks ahead of the
new one in line?

Let's take the case of a single thread doing stuff. It just popped a
task off the queue, beginning the 30 second timer. Left on the queue
are tasks B, C, and D. During that 30 second window, a request to sync
a repo comes in. I'd expect it to be put on the queue to execute after
D, but the way I read your comment, it'll jump to the front of the line
by being run by the on-demand dispatcher. I'm probably missing
something though, so maybe some more explanation will help clear it up.
My mental model is the dispatcher sketch below.

Also (again, I don't know how you're implementing it), based on the
discussions in this thread it sounds like the queue of tasks will only
exist in the database and not in memory. How is that going to affect
the uniqueness check? Does that become a series of database retrievals,
or can we do that check atomically in the DB? The Mongo sketch below is
what I'd hope for.
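For the wake-up question, here's roughly the behavior I'd expect, as a
hand-wavy sketch rather than anything from your branch (dequeue_next
and task.run are stand-ins I made up): the wake-up just cuts the 30
second wait short and triggers a normal FIFO dequeue, so a newly
enqueued task still waits behind B, C, and D.

    import threading

    class Dispatcher:
        """Poll the task queue on an interval, but let enqueue() wake
        the loop early. An early wake-up still dequeues from the head
        of the line, so new tasks can't jump the queue."""

        def __init__(self, queue, poll_interval=30.0):
            self.queue = queue  # assumed to expose dequeue_next()
            self.poll_interval = poll_interval
            self._wakeup = threading.Event()

        def notify_enqueued(self):
            # Called whenever a task is enqueued; just ends the
            # current wait early.
            self._wakeup.set()

        def run(self):
            while True:
                task = self.queue.dequeue_next()  # head of the line, FIFO
                if task is not None:
                    task.run()
                    continue
                # Nothing runnable: sleep up to poll_interval, or until
                # an enqueue wakes us early.
                self._wakeup.wait(timeout=self.poll_interval)
                self._wakeup.clear()

If that's what you meant, then I was just misreading "on demand" and
FIFO order is preserved.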
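On the uniqueness check, what I'd hope for is a single atomic operation
in the DB rather than a query followed by an insert that can race.
Assuming Mongo, and with completely made-up field names, a unique index
makes the insert itself the check:

    from pymongo import MongoClient, errors

    db = MongoClient().pulp_database  # hypothetical database name

    # unique_key would be a hash of whatever defines "the same task"
    # (e.g. class name + method name + args); the DB then enforces
    # uniqueness atomically.
    db.tasks.create_index('unique_key', unique=True)

    def enqueue_unique(task_doc):
        """Return True if the task was enqueued, False if an
        equivalent task is already in the collection."""
        try:
            db.tasks.insert_one(task_doc)
            return True
        except errors.DuplicateKeyError:
            return False

(In practice you'd have to scope that to waiting/running tasks somehow,
or finished tasks would block re-enqueues forever. Just illustrating
the atomicity.)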
> It fits in with the time granularity of 1 minute that I've been
> advertising as well.
>
> Any other thoughts?

17 minutes. We'll let the Star Wars ASCII movie keep them entertained
in the meantime.

--
Jay Dobies
RHCE# 805008743336126
Freenode: jdob
http://pulpproject.org

_______________________________________________
Pulp-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/pulp-list