Hey Joshua,
Some thoughts / responses / comments inline.
On Wed, Sep 7, 2011 at 10:18, Joshua Smith <[email protected]> wrote:
>
> I'm trying to port my existing M/S app to HR because I have a gun to my head
> with "Threaded Python Only for HR Apps" written on the bullets.
>
> My system will schedule meetings automatically. Scheduling a meeting can
> take some time, because a bunch of records are created, and a bunch of emails
> need to go out. So the code to schedule one looked like this:
>
> class MeetingAutoHandler(webapp.RequestHandler):
>     def get(self):
>         schedule = ScheduleModel.gql("WHERE next != :1 AND next < :2", None,
>                                      datetime.datetime.now()).get()
>         if schedule:
>             schedule.cronAuto()
>             taskqueue.add(url='/admin/meetingAuto', method='GET', countdown=1)
>
> The query looks for a schedule object that needs a meeting to be scheduled
> now. There might be a few of these when the cron runs. So it does the hard
> work for one of them (in cronAuto()), and schedules another call to itself to
> get the next one using the task queue.
If you use the filter methods (Model.all().filter(...)) instead of GQL,
you can do a keys-only query. I personally prefer that style
cosmetically as well. You might want to replace the not-equal filter
with an equality filter, either by adding a has_next boolean property
or by putting a lower bound on the range filter too. It will probably
perform better for you. You could also use cursors when inserting the
next job or continuing your work; that way you're not stuck in a loop
on the same entity. Let the next cron run start at the first record
and pick up any leftovers from the previous run.
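To make the cursor idea concrete, here's a rough, untested sketch. It
assumes a keys-only query built with the filter methods, something like
ScheduleModel.all(keys_only=True).filter('has_next =', True).filter('next <', now),
where has_next is the boolean property suggested above (it doesn't
exist in your model yet). As far as I know, cursors aren't supported on
queries with != filters, which is another reason to switch to an
equality. The helper name next_due_key is made up; it only relies on
the with_cursor(), fetch() and cursor() methods of db.Query.

```python
def next_due_key(query, cursor=None):
    """Return (key, resume_cursor) for the next matching entity, or (None, None).

    `query` is assumed to behave like a keys-only db.Query; only its
    with_cursor(), fetch() and cursor() methods are used.
    """
    if cursor:
        # Resume right after the last result returned by the previous task.
        query.with_cursor(cursor)
    keys = query.fetch(1)
    if not keys:
        # Nothing left; the next cron run starts again from the first record.
        return None, None
    return keys[0], query.cursor()
```

In the handler you'd read the cursor with self.request.get('cursor'),
process the returned key, and hand the new cursor to the next task via
taskqueue.add(url='/admin/meetingAuto', method='GET',
params={'cursor': resume_cursor}, countdown=1).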
>
> This isn't going to work in HR because that query is going to keep finding
> the same meeting. I could trivially tweak this by setting the countdown=60,
> but I've yet to hear any of our google overlords commit to a maximum value of
> when "eventually" happens in "eventually consistent". I presume there might
> be cases, like during data center transitions, when "eventually" could be a
> very long time indeed. It is essentially unbounded. Right?
What you actually care about here is the index lag. I see it go up to
a second or so fairly often. I've not observed it going over a few
seconds though. You could just use a periodic clean-up job of some
sort to catch missed items. That would likely be the safest solution.
>
> But I like the pattern I'm using here, and I'm trying to change as little
> code as possible, so I want to put together a HR-resilient version. Here's
> what I came up with:
>
> class MeetingAutoHandler(webapp.RequestHandler):
>     def get(self):
>         now = datetime.datetime.now()
>         for s in db.GqlQuery("SELECT __key__ FROM ScheduleModel "
>                              "WHERE next != :1 AND next < :2", None, now):
>             schedule = db.get(s)
>             if schedule.next and schedule.next < now:
>                 schedule.cronAuto()
>                 taskqueue.add(url='/admin/meetingAuto', method='GET', countdown=5)
>                 return
>
> So I'm doing a keys-only query and then doing a get() on the key. (I've
> never done a keys-only GQL query before, but I think I got it right. Note to
> google: There should be an option to Model.gql() to do keys-only queries!)
>
> The way I understand HR, that get is going to get the real Model, which might
> not meet the criteria in the gql, because the index might be out of date.
> Right?
As I recall, the entities are fetched in transactions in the
background anyway, so doing a keys-only query and then fetching the
results doesn't buy you anything here. You'll still need to check
each entity to ensure it meets your criteria, and you still might miss
items whose index entries haven't caught up yet. Also, as I noted
above, using Model.all will let you do a keys-only query.
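For the re-check, one way to avoid writing the criteria twice is to
keep the test in a single helper and call it on whatever db.get()
returns. A minimal sketch; is_due is a name I made up, and it only
assumes the entity has the next property from your query:

```python
import datetime


def is_due(schedule, now=None):
    """Mirror the query criteria: next is set and is earlier than now."""
    if schedule is None:
        # db.get() returns None if the entity was deleted in the meantime.
        return False
    if now is None:
        now = datetime.datetime.now()
    return schedule.next is not None and schedule.next < now
```

This doesn't help with entities the index hasn't caught up on yet; it
only guards against acting on a stale match.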
Robert
>
> So I check that the model meets the criteria that I just specified. (Note to
> google: It'd be cool if there was a way to test an object against a query, so
> I don't have to write the same code twice!)
>
> Finally, I pushed the next task out a bit, to make it less likely that I'll
> have to look at the same objects over and over.
>
> So what do you think? Any suggestions? (I have a couple things that work
> this way, so I want to choose a good design pattern to apply to each of them.)
>
> The complexity would be lessened if I could do this:
>
> class MeetingAutoHandler(webapp.RequestHandler):
>     def get(self):
>         q = ScheduleModel.gql_keys_only("WHERE next != :1 AND next < :2", None,
>                                         datetime.datetime.now())
>         for s in q:
>             schedule = db.get(s)
>             if q.matches(schedule):
>                 schedule.cronAuto()
>                 taskqueue.add(url='/admin/meetingAuto', method='GET', countdown=5)
>                 return
>
> This would require two changes: the db.Model would need to support
> gql_keys_only (that's probably trivial); GqlQuery would need a matches()
> method (that's probably not trivial).
>
> It's still a few more lines, but the complexity is about the same as the old
> one.
>
> Worth the trouble of a couple feature request issues?
>
> -Joshua
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>
>