Re: [web2py] Re: Scheduler: help us test it while learning

Niphlod Mon, 20 Aug 2012 14:44:53 -0700

hey, more points of view, more eyes on the code = fewer errors in the code, 
more understandable docs, etc. 
That's the basis of open-source development.
And I like smart questions :-P
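
For the eventual docs, the behaviour settled on in the thread below (every attempt, failed or completed, pushes the next run forward by period) can be pictured with a small stand-alone simulation. This is only a sketch and not the scheduler's code: simulate() and its outcomes argument are made up for illustration, and resetting the failure counter after a completed run is an assumption.

```python
from datetime import datetime, timedelta


def simulate(outcomes, repeats, retry_failed, period, start):
    """Return a list of (run_time, status) tuples for one task.

    Hypothetical helper, not part of web2py.
    outcomes:     iterable of booleans, True means that attempt succeeds
    repeats:      completed runs wanted, 0 = unlimited
    retry_failed: retries allowed after a failure, -1 = unlimited
    period:       seconds between two attempts of any kind
    """
    log = []
    times_run = 0
    times_failed = 0
    when = start
    for ok in outcomes:
        if ok:
            times_run += 1
            times_failed = 0  # assumption: a success clears the failure count
            log.append((when, "completed"))
            if repeats and times_run >= repeats:
                break
        else:
            times_failed += 1
            log.append((when, "failed"))
            if retry_failed != -1 and times_failed > retry_failed:
                break  # retries exhausted, the task stays failed
        # Retries and repeats alike wait for the period.
        when += timedelta(seconds=period)
    return log


if __name__ == "__main__":
    timeline = simulate(
        outcomes=[False, False, True, False, False, False],
        repeats=2, retry_failed=3, period=3600,
        start=datetime(2012, 8, 18, 2, 0),
    )
    for when, status in timeline:
        print("%s %s" % (when.strftime("%H:%M"), status))
```

With period=3600 the six attempts land one hour apart instead of piling up at 2.00 am, which is the protection against "flapping" discussed below.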


On Monday, August 20, 2012 11:41:15 PM UTC+2, Yarin wrote:
>
> OK, I've come around: I agree this is the right setup. Let's just make 
> sure it's clear in the eventual documentation, as it wasn't obvious to me 
> (not much is these days..): both retries and repeats respect the period. 
> Cool, I like it. 
>
>
> On Sunday, August 19, 2012 7:13:15 AM UTC-4, Niphlod wrote:
>>
>> I didn't say that you have to handle exceptions exclusively in your 
>> functions, but that if you want functionality of the kind "execute this 
>> for the next 2 minutes and retry ASAP 3 times at most" and you still want 
>> to have a single scheduler_task record, that's the way to go. Sometimes 
>> your functions rely on third-party services whose failures can't be fixed 
>> inside your functions: you can catch the exception but you still want to 
>> execute that function (e.g. you want to send an email but your email 
>> server doesn't reply). The mail should be sent anyway, possibly as soon 
>> as the email server is available again.... that's where retry_failed 
>> comes in handy. Of course if it fails e.g. 10 times it's better to stop 
>> trying and inspect the email server :P
>>
>> On Saturday, August 18, 2012 11:48:41 PM UTC+2, Yarin wrote:
>>>
>>> OK i didn't understand that retries happened periodically- i indeed 
>>> thought that it would retry right away, though i agree with you that that 
>>> should be handled at the function level. But if we're handling failures 
>>> within the scheduled function, then now im wondering what is the value in 
>>> having retries at all? Just because the scheduler is running asynchronously 
>>> does not mean it should necessarily be responsible for the scheduled 
>>> functions' unhandled exceptions (which is what the failures are, right)? In 
>>> other words, since our scheduler is scheduling web2py functions in a known 
>>> environment (unlike environment agnostic task-queue systems, which don't 
>>> know how their operations will resolve), shouldn't the onus be on the 
>>> scheduled function to handle failures and reschedule if necessary? Maybe we 
>>> should clarify this before discussing the rest- i may be missing something.
>>>
>>> btw im leaving for the night but am interested in finishing the 
>>> discussion- ill be back in the morning if you dont hear from me.. 
>>>
>>> On Saturday, August 18, 2012 5:14:46 PM UTC-4, Niphlod wrote:
>>>>
>>>> Ok, got the example (but not the "the last go-round is forgotten").
>>>> Let's start by saying that your requirements can be fulfilled (simply) 
>>>> by decorating your function with a loop that breaks after the first 
>>>> successful attempt (with repeats=0, retry_failed=-1). Given that, the 
>>>> current behaviour is not really a limit on what you are trying to 
>>>> achieve; it's only a matter of how to implement the requeue facilities 
>>>> in the scheduler.
>>>>
>>>> Let's keep the discussion open... if I got it correctly you're basically 
>>>> asking to ignore period for failed tasks (requeue them and execute ASAP) 
>>>> and reset counters accordingly... right? Period right now ensures that 
>>>> the task gets executed no more than once every period seconds (it 
>>>> protects you from "flapping", i.e. a continuously failing function, and 
>>>> is somewhat required e.g. for respecting webservice API limits, avoiding 
>>>> "db pressure" if you're doing heavy operations, etc.). 
>>>> Respecting period in every case is "consistency" for me (because I 
>>>> decided that I can "afford" (or "consume" resources for) executing that 
>>>> function only once every hour). 
>>>> You are suggesting to alter this for repeating tasks... what I didn't 
>>>> get is whether that is required always or only when repeats=0 (which is, 
>>>> incidentally, not consistent :P) ?!
>>>>  
>>>> i.e. what behaviour would you expect from (repeats=2, retry_failed=3, 
>>>> period=3600) ? 
>>>> 2.00 am, failed
>>>> 2.00 am, failed
>>>> 2.00 am, completed
>>>> 3.00 am, failed
>>>> 3.00 am, failed
>>>> 3.00 am, failed 
>>>> ? 
>>>> This is basically what I'm missing. What could possibly be wrong at 
>>>> 2.00am and be right a few seconds later ?
>>>>
>>>> On Saturday, August 18, 2012 10:32:14 PM UTC+2, Yarin wrote:
>>>>>
>>>>> I think retry_failed and repeats are two distinct concepts and 
>>>>> shouldn't be mixed.
>>>>>
>>>>> For example, a task set to (repeats=0, retry_failed=0, period=3600) 
>>>>> should be able to fail at 2:00pm and still try again at 3:00pm, 
>>>>> regardless of what happened at 2:00. Likewise, if it was set to 
>>>>> (repeats=0, retry_failed=2, period=3600) and failed all three times at 
>>>>> 2:00pm, the retry count should be reset on the next go-around. 
>>>>>
>>>>> I think it's safer to presume that if a task is set up for indefinite 
>>>>> repetition, a failure on one repeat should not bring down the whole 
>>>>> task; rather, the transactional unit that constitutes a failure should 
>>>>> be limited to any given attempt, repeated or not.
>>>>>
>>>>> This was one of the reasons I pressed for renaming repeats_failed to 
>>>>> retry_failed: they are distinct concepts.
>>>>>
>>>>>
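
The "loop that breaks after the first successful attempt" idea mentioned in the thread could look roughly like the decorator below. It is a minimal sketch in plain Python, nothing from web2py itself: the names retry_asap, attempts and delay are invented for the example, and the task function is assumed to signal failure by raising an exception.

```python
import functools
import time


def retry_asap(attempts, delay=0.0):
    """Hypothetical decorator: retry the function right away, up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    # The first successful call ends the loop.
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        # Out of attempts: re-raise so the caller
                        # (e.g. a scheduler) sees the failure.
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


@retry_asap(attempts=3, delay=0.1)
def send_report():
    # Placeholder body; a real task would talk to the mail server here.
    return "sent"
```

The immediate retries stay inside a single run of the function, so the scheduler still sees one attempt per period and keeps its protection against a continuously failing task.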
