On Thursday, August 4, 2016 at 1:17:04 PM UTC-7, Andre wrote:
>
> Thank you for the thoughtful response - some great points I need to mull
> over some more.
>
> I think there needs to be a Niphlod tip jar.
>
I think the way we can best make him happy is by adding test coverage.
(I have that on my list, but I'm not yet over-achieving.)

/dps

On Wednesday, August 3, 2016 at 4:29:20 PM UTC-4, Niphlod wrote:
>>
>> 15 minutes is a LOT to wait for a reply from an external source: are you
>> sure you can't reduce that timespan (maybe using "cascading" steps)?
>> Assume this when working with ANY task-queue solution: if your 15-minute
>> task fails at the 14th minute, you wasted 14 minutes for nothing. That
>> holds especially true for anything calling external resources (which may
>> very well be unavailable for some time, e.g. network hiccups). If instead
>> you can break that up into 5 steps of 3 minutes each, you "waste" 3
>> minutes at most. When you face a possibly-parallelizable scenario (which
>> is quite often the case in task-queue solutions), you get the additional
>> benefit of being able to balance each and every step among the available
>> "processors".
>>
>> That being said, a few points on "sizing" the scheduler processes: the
>> "standard" scheduler can't really support more than 20-30 workers (no
>> SQLite, please! :P). Yep, with 50 they'll all be running, but they won't
>> churn through more tasks than 20 would, and they'll hammer your backend
>> pretty heavily. The "redis-backed" one always works better, but even that
>> won't get you past 50. Now that you have the MAX limit, let's talk about
>> what really matters, which is how many concurrent tasks you'll need.
>> A single worker processes one task at a time. But it will happily process
>> 5 tasks per second (assuming the task ACTUALLY processes something and
>> doesn't wait around): this translates to a single worker processing 300
>> tasks per minute, if they are already queued and fast.
>> The "sweet spot" you want to reach (assuming everything you queue needs
>> to be processed as soon as you queue it) is where you have at least one
>> worker available to do the actual job (i.e. you have one "slot" available
>> at the moment you queue tasks).
>> Let's say you are at the lower end of the "sweet spot", and assume every
>> task takes 5 minutes, with only one worker available (the others are
>> already churning through a task)... you queue a task, and the result will
>> be available in 5 minutes. During that period, no other queued task will
>> be processed, and if you queue 2 tasks at the same time, the result of
>> the second queued task will be available in 10 minutes, unless some
>> worker frees itself because another task has completed.
>> With 4 available workers, you can basically queue 4 tasks at the same
>> time and get each result back within 5 minutes. The fifth queued task's
>> result will be available in 10 minutes (again, unless some other worker
>> frees itself).
>>
>> Going up the ladder one more step, from personal experience... I feel
>> inclined to say that if your users are willing to wait anywhere from 30
>> seconds to 15 minutes, I'd hardly spin up lots of workers and leave them
>> with no work to do: IMHO anything that takes more than about 2 minutes
>> doesn't need to be reported to the user within 2 minutes, for the simple
>> fact that they won't be around to read it 2 minutes later (they probably
>> went somewhere else in the meantime and will be back maybe in 10 minutes,
>> maybe the next day). A simple mail at the conclusion of the whole process
>> with "hey, the thing you wanted is ready" seals the deal.
>>
>> tl;dr: staying on the "lower" side won't consume unneeded resources,
>> even if a task that took only 5 minutes to process for some users came
>> back after 10 because the server was busy processing some other user's
>> tasks.
>>
>> On Sunday, July 31, 2016 at 6:42:13 PM UTC+2, Andre wrote:
>>>
>>> Hi,
>>>
>>> I've created a website that utilizes the facebook api and I'd like to
>>> move my facebook requests out of the webserver request-handling loop.
>>> I've played around with the scheduler and I have a working prototype in
>>> place; however, I'm not sure how many workers I should spawn for the
>>> scheduler. Between waiting for a response from facebook and processing
>>> the results, these "processes" can take as little as 30 seconds to
>>> upwards of 15 minutes. Has anyone else run into a similar problem?
>>> Would the built-in scheduler be appropriate to use? I'm thinking of
>>> just spawning a bunch of workers (25-50 or so?)... and using trial and
>>> error to home in on the right number.
>>>
>>> -Andre
>>>

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
For more options, visit https://groups.google.com/d/optout.
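[Editor's note] Niphlod's "cascading steps" idea above — splitting one 15-minute task into five 3-minute sub-tasks, each queuing the next on completion — can be sketched in plain Python. This is a hypothetical stand-in, not web2py code: the `deque` plays the role of the scheduler's task queue, and `fetch_chunk` is an invented sub-task name (in a real web2py app each step would typically end by calling `scheduler.queue_task` for the follow-up step).

```python
from collections import deque

def fetch_chunk(step, total_steps):
    """One short sub-task: does ~3 minutes of real work in practice.
    Returns the index of the next step to queue, or None when done."""
    # ... call the external resource for this chunk here ...
    next_step = step + 1
    return next_step if next_step < total_steps else None

def run_cascade(total_steps):
    """Minimal stand-in for a task queue: each completed step queues the
    next, so a failure at step k wastes only one step's work, not the
    whole 15-minute job."""
    queue = deque([0])          # only the first step is queued up front
    completed = []
    while queue:
        step = queue.popleft()
        follow_up = fetch_chunk(step, total_steps)
        completed.append(step)
        if follow_up is not None:
            queue.append(follow_up)   # "cascade": queue the next step
    return completed

# Five 3-minute steps instead of one 15-minute task:
print(run_cascade(5))           # [0, 1, 2, 3, 4]
```

The design point is that the queue only ever holds the *next* short step, so a crash mid-cascade can be retried from the last completed step rather than from scratch.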
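[Editor's note] The waiting-time arithmetic in the thread (4 workers, 5-minute tasks, the fifth simultaneous task finishing at 10 minutes) can be checked with a small simulation. This is a sketch under the thread's simplifying assumptions: identical task durations and all tasks queued at t=0.

```python
import heapq

def completion_times(n_workers, task_minutes, n_tasks):
    """Finish time (minutes) for each of n_tasks queued at t=0,
    served in order by n_workers identical workers."""
    free_at = [0.0] * n_workers      # when each worker next becomes free
    heapq.heapify(free_at)
    finishes = []
    for _ in range(n_tasks):
        start = heapq.heappop(free_at)   # earliest-available worker
        done = start + task_minutes
        finishes.append(done)
        heapq.heappush(free_at, done)
    return finishes

# 4 workers, 5-minute tasks, 5 tasks queued at once:
print(completion_times(4, 5, 5))     # [5.0, 5.0, 5.0, 5.0, 10.0]
```

With one worker and two 5-minute tasks, the same function gives `[5.0, 10.0]`, matching the two-task example earlier in the thread.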

