On Sunday, August 03, 2014 12:10:49 PM Martin Vaeth wrote:
> J. Roeleveld <jo...@antarean.org> wrote:
> > A useful addition to your schedule-tool would be to store the
> > scripts in a way that makes editing simpler
> 
> Since it is an arbitrary script in an arbitrary language,
> I think doing this is outside the scope of this project.
> In most cases where I have used it so far, 1-2 more or less complex
> lines (maybe a few more if they were not complex)
> in an interactive zsh were enough, and these are simple
> enough to edit in zsh, i.e. I did not even write any script "file"
> in the classical sense.
> 
> > I might be mistaken, but I think the server keeps the entire
> > queue in-memory and when the process dies, the status is lost?
> 
> Yes, the server process must not die.
> 
> If it dies, not only the queue is lost but also the waiting processes
> (that is: queued but not yet started) cannot be reached anymore:
> These waiting processes do not have their own TCP socket but just
> keep their established connection with the server's socket until
> the server tells them through this connection to start or to cancel;
> if this connection gets lost, the waiting processes die:
> What else could they do, reasonably?
> 
> The already started processes have a unique ID (into which the
> server's process is encoded): They reestablish the connection to report
> the exit status according to this ID. If the server is stopped,
> they cannot report this status, of course, and moreover,
> a new server does not know their IDs either and thus will ignore these
> "status reports".
> 
> Maybe this "protocol" is not the most clever solution, but it is
> one which could be implemented without lots of overhead:
> Mainly, I was after a "quick" solution which works well enough
> for me: If the server has no bugs, why should it die?
> Moreover, if the server dies for some strange reasons, it is probably
> safer to re-queue the jobs again, anyway.

With the kind of schedules I am working with (and I believe Alan will also end
up with), restarting the whole process from scratch can lead to issues.
Finding out how far the process got before the service crashed can become
rather complex.
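
To illustrate what I mean by knowing how far a process got: a minimal sketch,
assuming each job can be split into named steps and that a step is safe to
repeat if the crash happens right after it ran. The names `run_steps`,
`load_done` and the journal file are hypothetical and not part of the
schedule tool; this is only one possible way to record progress on disk.

```python
import json
import os


def load_done(journal_path):
    """Return the set of step names the journal records as finished."""
    done = set()
    if not os.path.exists(journal_path):
        return done
    with open(journal_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # A crash can leave a half-written record; skip it.
                continue
            if record.get("status") == "done":
                done.add(record["step"])
    return done


def run_steps(steps, journal_path):
    """Run (name, callable) pairs in order, skipping journaled steps.

    Returns the names of the steps executed during this call.
    """
    done = load_done(journal_path)
    executed = []
    with open(journal_path, "a", encoding="utf-8") as fh:
        for name, action in steps:
            if name in done:
                continue
            action()
            # The leading newline keeps this record on its own line even
            # if an earlier crash left a partial record at the end.
            fh.write("\n" + json.dumps({"step": name, "status": "done"}))
            fh.flush()
            os.fsync(fh.fileno())  # push the record to disk before moving on
            executed.append(name)
    return executed
```

After a crash, calling `run_steps` again with the same list and the same
journal file would skip everything already recorded and continue with the
first step that has no "done" entry.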

--
Joost
