On Wed, Apr 24, 2013 at 9:25 PM, Ozgur Akgun <[email protected]> wrote:
> I want to be able to say, something like `parallel --timeout (fastest * 2)`
> and let get the same output.

I have been pondering if I could somehow make a '--timeout 5%'. It should:

1. Run the first 3 jobs to completion (no --timeout)
2. Compute the average and standard deviation for all completed jobs
3. Adjust --timeout based on the new average, standard deviation and user input
4. Go to 2 until all jobs are finished

The user input would be a percentage, e.g. 5% - meaning "I want the job killed if it takes longer to run than the 95% fastest jobs". We can compute that limit statistically if we assume that the run time of the jobs is normally distributed (the bell curve, https://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg) and that the run time of a job does not depend on the order in which it runs (e.g. it will not work if we get all the fast jobs first).

I am not sure if the run times of jobs generally are normally distributed or if they are more like a Chi-square or another continuous distribution, but in this case it probably does not matter, because the percentage of jobs that people want timed out will always be < 30%. If you have some insight into this, please speak up.

With the above, '--timeout 5%' will normally kill 5% of the jobs - even if they are not "bad", and that might be less useful than just a percentage of the median run time: --timeout 200%, which would kill all jobs taking more than twice as long as the median run time (using the remedian to compute the median in finite memory). I do not think looking at the fastest jobs is a good indicator: you can have an odd job that is extremely fast while the median is much slower.

/Ole
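The 4-step adaptive '--timeout N%' idea above can be sketched as follows. This is a hypothetical illustration in Python (GNU Parallel itself is not written in Python); the function name `adaptive_timeout` and the fixed z-value table are my own assumptions, not anything in the proposal. The z-values are the standard normal quantiles for the upper tail: killing the slowest 5% under the normality assumption means cutting at mean + 1.645 * sd.

```python
import statistics

# Upper-tail z-values of the standard normal distribution
# (hypothetical lookup table; a real implementation might compute
# the quantile instead of hard-coding a few values).
Z_FOR_PERCENT = {5.0: 1.645, 2.5: 1.960, 1.0: 2.326}

def adaptive_timeout(runtimes, percent=5.0):
    """Return the timeout implied by the completed run times.

    Step 1: while fewer than 3 jobs have finished, return None,
    i.e. run with no --timeout at all.
    Steps 2-3: fit mean and standard deviation to the completed
    jobs and cut at the (100 - percent)th percentile of the fitted
    normal distribution. The caller re-invokes this after each job
    finishes (step 4).
    """
    if len(runtimes) < 3:
        return None
    mean = statistics.mean(runtimes)
    sd = statistics.stdev(runtimes)  # sample standard deviation
    return mean + Z_FOR_PERCENT[percent] * sd
```

Calling `adaptive_timeout([10.0, 12.0, 11.0, 10.5])` yields a timeout a little above the slowest observed job, and the limit keeps adapting as more jobs complete. Note this only behaves as intended under the stated assumptions: roughly bell-shaped run times and job order independent of run time.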
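For the '--timeout 200%' variant, the remedian mentioned above (Rousseeuw and Bassett's finite-memory median approximation) keeps a cascade of small buffers: when a buffer fills, its median is pushed into the next buffer and the buffer is reused. A minimal sketch, assuming a class name and buffer-selection fallback of my own choosing:

```python
import statistics

class Remedian:
    """Approximate streaming median in fixed memory.

    Each level holds at most buf_size values; a full buffer is
    collapsed to its median, which is pushed one level up. Memory
    use is O(buf_size * number_of_levels) instead of O(n).
    """

    def __init__(self, buf_size=11):
        self.b = buf_size
        self.buffers = [[]]

    def add(self, x, level=0):
        if level == len(self.buffers):
            self.buffers.append([])
        buf = self.buffers[level]
        buf.append(x)
        if len(buf) == self.b:
            m = statistics.median(buf)
            buf.clear()          # reuse the buffer for new values
            self.add(m, level + 1)

    def median(self):
        # Report the median of the highest non-empty level; a
        # simplification - the full algorithm weights leftover
        # values in the lower buffers as well.
        for buf in reversed(self.buffers):
            if buf:
                return statistics.median(buf)
        return None
```

With this, the '--timeout 200%' rule is simply `2 * r.median()`: kill any job that has run longer than twice the (approximate) median run time seen so far.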
