I just want to reiterate that I still agree with myself here. One issue I can see with using asynchronous urlfetch, though: it returns results to your callback handler, so I am guessing you'll be stuck with one result per urlfetch and, if you are putting this information to the datastore, a single db.put per urlfetch.
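To put a rough shape on that concern, here is a minimal sketch in plain Python (no App Engine APIs, so it runs anywhere). `FakeDatastore`, `BatchingCallback`, and `count_puts` are names I made up for illustration; the fake `put` just stands in for something like `db.put` called with a list of entities. It only counts how many put calls you end up making for a given batch size, and how many results are still sitting in memory when the callbacks stop arriving.

```python
class FakeDatastore(object):
    """Hypothetical stand-in for the datastore; records the size of each put."""

    def __init__(self):
        self.put_sizes = []

    def put(self, items):
        # A real app would write the entities here (e.g. db.put(items)).
        self.put_sizes.append(len(items))


class BatchingCallback(object):
    """Hypothetical fetch callback that writes results batch_size at a time."""

    def __init__(self, datastore, batch_size):
        self.datastore = datastore
        self.batch_size = batch_size
        self.pending = []  # results held in memory, not yet written

    def __call__(self, result):
        self.pending.append(result)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Write whatever is pending in a single put call.
        if self.pending:
            self.datastore.put(self.pending)
            self.pending = []


def count_puts(num_results, batch_size):
    """Return (total put calls, results still unwritten before the final flush)."""
    store = FakeDatastore()
    callback = BatchingCallback(store, batch_size)
    for i in range(num_results):
        callback("result-%d" % i)
    leftover = len(callback.pending)
    callback.flush()  # explicit final flush so nothing is lost in this sketch
    return len(store.put_sizes), leftover


print(count_puts(1000, 1))    # one put per result
print(count_puts(1000, 100))  # batches of 100
print(count_puts(1050, 100))  # batch size does not divide evenly
```

With a batch size of 1 this is exactly the one-result-per-callback case: every fetch result costs its own put, so 100,000 fetches means 100,000 separate datastore calls.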
Which would be bad, and very costly, for 100,000+ tasks every five minutes. So you're stuck balancing the cost of fetching and putting all of these items as fast as possible against doing it a little more slowly (in batches of 100 or 1000) and putting items to the datastore in batches.

Now, I can imagine a number of fanciful approaches to trying to weasel out of this issue, but it's best for you to just start testing and see what happens.

With that said, one fun approach would be to have the callback handler stick the async fetch result into the process cache, using some global variable, and once that variable fills up with enough items, put them to the datastore in a batch. Sadly, you then run into the problem of an item being orphaned in one of your GAE instances (the callback sticks it in memory, but there aren't enough items for a put yet, and that particular instance never handles another callback).

Still, this could be very useful for getting maybe 75% - 95% of the work done very fast. Then you could have a follow-up task that does the remaining work in a more meticulous manner (but it's hard to imagine an easy or efficient way to determine which urlfetches didn't get put to the datastore).

On Sun, Jan 16, 2011 at 12:56 AM, Robert Kluin <[email protected]> wrote:
> I think Eli has a good suggestion (again), use task-chaining with
> countdowns + async urlfetches in small batches. Just beware,
> countdowns are only an estimate, and if the queue is backing up the
> task may not run when you want.
>
> Just thinking about this, I would probably try to batch similarly
> performing websites into small batches to monitor together. So if you
> got sites that typically respond very fast group them, likewise for
> slow sites. I suspect that will help you optimize your queue layouts,
> maybe you could use some queues for 'fast' and others for 'slow'
> groups. Just some thoughts.
>
> I also agree with some of the other commenters, you should set up some
> tests and see if you still feel like this is the right platform for
> your app.
>
>
> Robert
>
> On Sat, Jan 15, 2011 at 11:03, supercobra <[email protected]> wrote:
> > The countdown parameter of TaskQueue is indeed a big help here. Thanks
> > for pointing that out.
> >
> > -- [email protected]
> > http://supercobrablogger.blogspot.com/
> >
> > On Fri, Jan 14, 2011 at 3:41 PM, Uros Trebec <[email protected]> wrote:
> >> re
> >>
> >> On Jan 14, 7:24 pm, supercobra <[email protected]> wrote:
> >>> One of the challenges is to wait for 5 minutes. E.g. Fetch a URL, store
> >>> results, wait 5 min, do it again. Since a queue will execute the task
> >>> almost immediately (if it is empty) this would not work unless the
> >>> queue is filled w/ a known number of tasks.
> >>>
> >>> Any suggestion welcome.
> >>
> >> You can use the 'countdown' parameter in Task constructor (
> >> http://code.google.com/appengine/docs/python/taskqueue/tasks.html#Task
> >> ) to set the number of seconds for the Task to wait in the queue
> >> before executing. I use this for scheduling a task a few minutes in
> >> the future when UrlFetch returns the data I already have.
> >>
> >> lp,
> >> Uros

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
