On Mon, Jan 17, 2011 at 3:08 AM, Eli Jones <[email protected]> wrote:
> I just want to re-iterate that I still agree with myself here.
>
> Though, one issue that I can see with using asynchronous urlfetch... it
> will be returning results to your callback handler.. thus, I am guessing,
> you'll be stuck with one result per urlfetch.. and if you are putting this
> information to the datastore... a single db.put per urlfetch.
>
> Which.. would be bad, and very costly for 100,000+ tasks every five
> minutes.  So.. you're stuck balancing the cost of fetching and putting all
> of these items as fast as possible.. with maybe doing it a little bit slower
> (in batches of 100 or 1000).. and putting items to the datastore in batches.
>

An easy way to avoid this is to kick off a batch of asynchronous URLFetch
requests, wait for all of them to complete, then put the results to the
datastore in a single batch put.

-Nick Johnson

>
> Now, I can imagine a number of fanciful approaches to trying to weasel out
> of this issue.. but it's best for you to just start testing and see what
> happens.
>
> With that said.. one fun approach would be to have the callback handler
> stick the async fetch result into the process cache.. using some global
> variable.. and once that var fills up with enough items.. put them to the
> datastore in a batch.
>
> Sadly.. you then run into the problem of an item being orphaned in one of
> your GAE instances.. (the callback sticks it in memory.. but there aren't
> enough items for a put yet.. and that particular instance never handles
> another callback).
>
> Though, this could be very useful for getting maybe 75% - 95% of the work
> done very fast.. then you could have a follow up task that did the remaining
> work in a more meticulous manner (but, it's hard to imagine an easy or
> efficient way to determine which urlfetches didn't get put to the
> datastore.)
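
A minimal sketch of the batch approach described in the reply above: start a
batch of fetches concurrently, wait for all of them, then write the results
with one batch put. To keep it runnable outside App Engine it uses Python's
standard concurrent.futures rather than the URLFetch RPC API, and `fetch` and
`batch_put` are hypothetical stand-ins supplied by the caller. On App Engine
they would presumably map to asynchronous urlfetch calls and a single db.put()
of a list of entities, but that mapping is an assumption, not tested code.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_batch_then_put(urls, fetch, batch_put, max_workers=10):
    """Fetch all urls concurrently, then store the successes in one put.

    fetch(url) and batch_put(items) are hypothetical stand-ins supplied by
    the caller; they are not App Engine APIs.
    Returns (results, failures) as lists of (url, value) pairs.
    """
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Kick off every fetch before waiting on any of them.
        pending = [(url, pool.submit(fetch, url)) for url in urls]
        # Wait for all fetches to complete, keeping errors separate.
        for url, future in pending:
            try:
                results.append((url, future.result()))
            except Exception as exc:
                failures.append((url, exc))
    # One batch write for the whole batch instead of one write per fetch.
    if results:
        batch_put(results)
    return results, failures
```

Each batch then costs one put no matter how many URLs it contains, and the
URLs that failed come back in the second list, so a follow-up task can retry
them without guessing which fetches never reached the datastore.
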
>
>
> On Sun, Jan 16, 2011 at 12:56 AM, Robert Kluin <[email protected]> wrote:
>
>> I think Eli has a good suggestion (again), use task-chaining with
>> countdowns + async urlfetches in small batches.  Just beware,
>> countdowns are only an estimate, and if the queue is backing up the
>> task may not run when you want.
>>
>> Just thinking about this, I would probably try to batch similarly
>> performing websites into small batches to monitor together.  So if you
>> got sites that typically respond very fast group them, likewise for
>> slow sites.  I suspect that will help you optimize your queue layouts,
>> maybe you could use some queues for 'fast' and others for 'slow'
>> groups.  Just some thoughts.
>>
>> I also agree with some of the other commenters, you should set up some
>> tests and see if you still feel like this is the right platform for
>> your app.
>>
>>
>> Robert
>>
>>
>> On Sat, Jan 15, 2011 at 11:03, supercobra <[email protected]> wrote:
>> > The countdown parameter of TaskQueue is indeed a big help here. Thanks
>> > for pointing that out.
>> >
>> > -- [email protected]
>> > http://supercobrablogger.blogspot.com/
>> >
>> >
>> > On Fri, Jan 14, 2011 at 3:41 PM, Uros Trebec <[email protected]> wrote:
>> >> re
>> >>
>> >> On Jan 14, 7:24 pm, supercobra <[email protected]> wrote:
>> >>> One of the challenges is to wait for 5 minutes. E.g. Fetch a URL, store
>> >>> results, wait 5 min, do it again. Since a queue will execute the task
>> >>> almost immediately (if it is empty) this would not work unless the
>> >>> queue is filled w/ a known number of tasks.
>> >>>
>> >>> Any suggestion welcome.
>> >>
>> >> You can use the 'countdown' parameter in Task constructor (
>> >> http://code.google.com/appengine/docs/python/taskqueue/tasks.html#Task
>> >> ) to set the number of seconds for the Task to wait in the queue
>> >> before executing.
>> >> I use this for scheduling a task a few minutes in
>> >> the future when UrlFetch returns the data I already have.
>> >>
>> >> lp,
>> >> Uros

--
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number: 368047

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
