Hey, more points of view and more eyes on the code = fewer errors in the code, more understandable docs, etc. Open-source development basics. And I like smart questions :-P

On Monday, August 20, 2012 11:41:15 PM UTC+2, Yarin wrote:
>
> OK, I've come around. I agree this is the right setup; let's just make sure
> it's clear in the eventual documentation, as it wasn't obvious to me (not
> much is these days..): both retries and repeats respect the period. Cool,
> I like it.
>
>
> On Sunday, August 19, 2012 7:13:15 AM UTC-4, Niphlod wrote:
>>
>> I didn't say that you must handle exceptions exclusively in your
>> functions, but that if you want functionality of the kind "execute this
>> for the next 2 minutes and retry ASAP 3 times at most" and you still want
>> a single scheduler_task record, that's the way to go. Sometimes your
>> functions rely on third-party services whose failures can't be resolved
>> inside the function: you can catch the exception, but you still want that
>> function to run eventually (e.g. you want to send an email but your email
>> server doesn't reply). The mail should be sent anyway, possibly as soon as
>> the email server is available again... here's where retry_failed comes in
>> handy. Of course, if it fails e.g. 10 times, it's better to stop trying
>> and inspect the email server :P
>>
>> On Saturday, August 18, 2012 11:48:41 PM UTC+2, Yarin wrote:
>>>
>>> OK, I didn't understand that retries happened periodically. I indeed
>>> thought that it would retry right away, though I agree with you that that
>>> should be handled at the function level. But if we're handling failures
>>> within the scheduled function, then now I'm wondering what the value is
>>> in having retries at all? Just because the scheduler is running
>>> asynchronously does not mean it should necessarily be responsible for the
>>> scheduled functions' unhandled exceptions (which is what the failures
>>> are, right)? In other words, since our scheduler is scheduling web2py
>>> functions in a known environment (unlike environment-agnostic task-queue
>>> systems, which don't know how their operations will resolve), shouldn't
>>> the onus be on the scheduled function to handle failures and reschedule
>>> if necessary? Maybe we should clarify this before discussing the rest. I
>>> may be missing something.
>>>
>>> BTW I'm leaving for the night but am interested in finishing the
>>> discussion. I'll be back in the morning if you don't hear from me..
>>>
>>> On Saturday, August 18, 2012 5:14:46 PM UTC-4, Niphlod wrote:
>>>>
>>>> Ok, got the example (but not the "the last go-round is forgotten").
>>>> Let's start by saying that your requirements can be fulfilled (simply)
>>>> by wrapping your function in a loop and breaking after the first
>>>> successful attempt (with repeats=0, retry_failed=-1). Given that, the
>>>> current behaviour is not really a limit on what you are trying to
>>>> achieve; it's only a matter of how to implement the requeue facilities
>>>> in the scheduler.
>>>>
>>>> Let's keep the discussion open... if I got it correctly, you're
>>>> basically asking to ignore period for failed tasks (requeue them and
>>>> execute ASAP) and reset counters accordingly... right? Period right now
>>>> ensures that a task is executed no more than once every `period`
>>>> seconds (and protects you from "flapping", i.e. a continuously failing
>>>> function, and is somewhat required e.g. for respecting webservice API
>>>> limits, avoiding "db pressure" if you're doing heavy operations, etc.).
>>>> Respecting period in every case is "consistency" for me (because I
>>>> decided that I can "afford" (or "consume" resources for) executing that
>>>> function only one time every hour).
>>>> You are suggesting to alter this for repeating tasks... what I didn't
>>>> get is whether that is required always or only when repeats=0 (which
>>>> is, incidentally, not consistent :P)?!
>>>>
>>>> i.e. what behaviour should you expect from (repeats=2, retry_failed=3,
>>>> period=3600)?
>>>> 2.00 am, failed
>>>> 2.00 am, failed
>>>> 2.00 am, completed
>>>> 3.00 am, failed
>>>> 3.00 am, failed
>>>> 3.00 am, failed
>>>> ?
>>>> This is basically what I'm missing. What could possibly be wrong at
>>>> 2.00 am and be right a few seconds later?
>>>>
>>>> On Saturday, August 18, 2012 10:32:14 PM UTC+2, Yarin wrote:
>>>>>
>>>>> I think retry_failed and repeats are two distinct concepts and
>>>>> shouldn't be mixed.
>>>>>
>>>>> For example, a task set to (repeats=0, retry_failed=0, period=3600)
>>>>> should be able to fail at 2:00pm, but will try again at 3:00pm
>>>>> regardless of what happened at 2:00. Likewise, if it was set to
>>>>> (repeats=0, retry_failed=2, period=3600), and failed all three times
>>>>> at 2:00pm, the retry count should be reset on the next go-around.
>>>>>
>>>>> I think it's safer to presume that if a task is set up for indefinite
>>>>> repetition, a failure on one repeat should not bring down the whole
>>>>> task. Rather, the transactional unit that constitutes a failure should
>>>>> be limited to any given attempt, repeated or not.
>>>>>
>>>>> This was one of the reasons I pressed for renaming repeats_failed to
>>>>> retry_failed: distinct concepts.
>>>>>
>>>>> --
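
For anyone reading this thread later: the "loop inside your function and break after the first successful attempt" idea mentioned above could look roughly like the sketch below. It is plain Python with no web2py involved, and every name in it (`run_with_retries`, `make_flaky`, `max_attempts`, `pause`) is made up for illustration, not part of the scheduler.

```python
import time


def run_with_retries(func, max_attempts=3, pause=0.0):
    """Call func until it succeeds or max_attempts is reached.

    Returns (result, attempts_used). If every attempt fails, the last
    exception is re-raised, so a caller (e.g. a scheduler) still sees
    the whole thing as a failure.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Returning here is the "break after the first success".
            return func(), attempt
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                time.sleep(pause)  # optional wait before retrying
    raise last_error


def make_flaky(fail_times):
    """Hypothetical stand-in for a mail server that is down for a while."""
    state = {"calls": 0}

    def send():
        state["calls"] += 1
        if state["calls"] <= fail_times:
            raise IOError("mail server not responding")
        return "sent"

    return send


result, attempts = run_with_retries(make_flaky(2), max_attempts=3)
print(result, attempts)  # prints: sent 3
```

The trade-off discussed in the thread still applies: a loop like this retries immediately and ignores any period, so it does nothing to protect against "flapping" or API rate limits.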


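To make the "both retries and repeats respect the period" outcome concrete, here is a small stand-alone simulation. It is only a sketch of the semantics as discussed in the thread, not the scheduler's actual code, and it rests on assumptions: every attempt (retry or repeat) is spaced one `period` apart, `repeats=0` and `retry_failed=-1` mean "unlimited", the task stops once consecutive failures exceed `retry_failed`, and the failure counter resets after a success. The `simulate` function and its `outcomes` argument are hypothetical.

```python
def simulate(outcomes, repeats=1, retry_failed=0, period=3600, start=0,
             reset_on_success=True):
    """Return [(time_in_seconds, status)] for a list of attempt outcomes.

    outcomes: iterable of booleans, True meaning the attempt succeeded.
    Assumed semantics (see the text above): one attempt per period.
    """
    timeline = []
    completed = 0
    failed = 0
    now = start
    for ok in outcomes:
        if ok:
            completed += 1
            if reset_on_success:
                failed = 0
            timeline.append((now, "completed"))
            if repeats != 0 and completed >= repeats:
                break  # all requested repeats are done
        else:
            failed += 1
            timeline.append((now, "failed"))
            if retry_failed != -1 and failed > retry_failed:
                break  # too many consecutive failures, give up
        now += period  # retries and repeats both wait one period
    return timeline


# The (repeats=2, retry_failed=3, period=3600) case, starting at 2:00.
for t, status in simulate([False, False, True, False, True],
                          repeats=2, retry_failed=3,
                          period=3600, start=2 * 3600):
    print(f"{t // 3600:02d}:{t % 3600 // 60:02d} {status}")
# 02:00 failed
# 03:00 failed
# 04:00 completed
# 05:00 failed
# 06:00 completed
```

Setting `reset_on_success=False` gives the other variant debated above, where failures accumulate across repeats instead of being counted per go-around.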