Thank you for the thoughtful response - some great points I need to mull over some more.
I think there needs to be a Niphlod tip jar.

On Wednesday, August 3, 2016 at 4:29:20 PM UTC-4, Niphlod wrote:
>
> 15 minutes is a LOT to wait for a reply from an external source: are you
> sure you can't reduce that timespan (maybe using "cascading" steps)?
> Assume this when working with ANY task-queue solution: if your 15-minute
> task fails at the 14th minute, you've wasted 14 minutes for nothing. It
> holds especially true for anything calling external resources (which may
> very well be unavailable for some time, e.g. network hiccups). If instead
> you can break that up into 5 steps of 3 minutes each, you "waste" 3
> minutes at most. When you face a possibly-parallelizable scenario (which
> is quite often the case in task-queue solutions), you get the additional
> benefit of being able to balance each and every step among the available
> "processors".
>
> That being said, a few points on "sizing" the scheduler processes: the
> "standard" scheduler can't really support more than 20-30 workers (no
> SQLite, please! :P). Yes, with 50 they'll all be running, but they won't
> churn more tasks than 20 would, and they'll bash your backend pretty
> heavily. The "redis-backed" one always works better, but even with this
> you won't get more than 50. Now that you have the MAX limit, let's talk
> about what really matters, that is, how many concurrent tasks you'll need.
>
> A single worker can process one task at a time. But it will happily
> process 5 tasks per second (given the task ACTUALLY processes something
> and doesn't wait around): this translates to a single worker processing
> 300 tasks per minute, if they are already queued and fast.
>
> The "sweet spot" you want to reach (assuming everything you queue needs
> to be processed as soon as you queue it) is where you have at least one
> worker available to do the actual job (i.e. you have one "slot" available
> at the moment you queue tasks).
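[Interjecting with a sketch of the "cascading steps" idea above: this is plain Python with a toy in-memory queue standing in for the scheduler's task table (the `queue_task` helper, `fetch_step` task name, and chunk labels are all hypothetical, not web2py API). Each short step does one slice of the work and re-queues its successor, so a failure only loses the current slice instead of the whole 15 minutes.]

```python
from collections import deque

queue = deque()  # toy stand-in for the scheduler's task table


def queue_task(name, **kwargs):
    """Hypothetical stand-in for a real scheduler's queue_task()."""
    queue.append((name, kwargs))


def fetch_step(step, total_steps, results):
    """One short (e.g. 3-minute) slice of the original long job."""
    results.append("chunk-%d" % step)  # do a small unit of work
    if step + 1 < total_steps:
        # chain: the task re-queues its successor instead of looping itself,
        # so at most one slice of work is lost if a step fails
        queue_task("fetch_step", step=step + 1,
                   total_steps=total_steps, results=results)
    return results


results = []
queue_task("fetch_step", step=0, total_steps=5, results=results)

# A worker loop: pop one task at a time and run it.
while queue:
    name, kwargs = queue.popleft()
    fetch_step(**kwargs)

print(results)  # → ['chunk-0', 'chunk-1', 'chunk-2', 'chunk-3', 'chunk-4']
```

With a real scheduler the same pattern applies: the final step of the chain is also a natural place to send the "your result is ready" mail Niphlod suggests below.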
> Let's say you are at the lower end of the "sweet spot", and assume every
> task takes 5 minutes, with only one worker available (the others are
> already churning tasks)... you queue a task, and the result will be
> available in 5 minutes. During that period, any other queued task won't
> be processed, and if you queue 2 tasks at the same time, the result of
> the second queued task will be available in 10 minutes, unless some
> worker frees itself because another task has completed.
>
> With 4 available workers, you can basically queue 4 tasks at the same
> time and get each result back within 5 minutes. The fifth queued task's
> result will be available in 10 minutes (again, unless some other workers
> free themselves).
>
> Going up the ladder one more step, from personal experience... I feel
> inclined to say that if your users are willing to wait anywhere from 30
> seconds to 15 minutes, I'd hardly spin up lots of workers and leave them
> without work to do: IMHO anything that goes past the 2-minute mark
> doesn't need to be reported to the user within 2 minutes, for the simple
> fact that the user won't be around to read it 2 minutes later (they
> probably went somewhere else in the meantime and will be back maybe in 10
> minutes, maybe the next day). A simple mail at the conclusion of the
> whole process with "hey, the thing you wanted is ready" seals the deal.
>
> tl;dr: staying on the "lower" side won't consume unneeded resources, and
> no harm is done EVEN if a task took only 5 minutes to process for some
> users AND your server spat out the result after 10 because it was busy
> processing some other user's tasks.
>
> On Sunday, July 31, 2016 at 6:42:13 PM UTC+2, Andre wrote:
>>
>> Hi,
>>
>> I've created a website that utilizes the facebook api and I'd like to
>> move my facebook requests out of the webserver request handling loop.
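[Quick sanity check on the wait-time arithmetic above, as plain Python: with W free workers and identical D-minute tasks all queued at once, the k-th task's result (1-based) is available after D * ceil(k / W) minutes. The `finish_time` helper is just for illustration.]

```python
import math


def finish_time(k, workers, duration_min):
    """Minutes until the k-th queued task's result is available,
    assuming all tasks are queued at once and each takes duration_min."""
    return duration_min * math.ceil(k / workers)


# One free worker, 5-minute tasks:
print(finish_time(1, workers=1, duration_min=5))  # → 5
print(finish_time(2, workers=1, duration_min=5))  # → 10 (waits for a slot)

# Four free workers: the first four results land together, the fifth waits.
print([finish_time(k, workers=4, duration_min=5) for k in range(1, 6)])
# → [5, 5, 5, 5, 10]
```

The "sweet spot" is then the smallest W for which ceil(k / W) stays at 1 for your typical burst size k.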
>> I've played around with the scheduler and I have a working prototype in
>> place, however I'm not sure how many workers I should spawn for the
>> scheduler. Between waiting for a response from facebook and processing
>> the results, these "processes" can take as little as 30 seconds to
>> upwards of 15 minutes. Has anyone else run into a similar problem? Would
>> the built-in scheduler be appropriate to use? I'm thinking of just
>> spawning a bunch of workers (25-50 or so?)... and using trial and error
>> to hone in on the right number.
>>
>> -Andre

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
For more options, visit https://groups.google.com/d/optout.

