"Loris Bennett" <loris.benn...@fu-berlin.de> writes:

> Hi,
>
> Is it possible to find jobs which both started and completed in a given
> interval?
>
> I am investigating an incident, during which an abnormally high load
> occurred on one of our storage servers.  To this end I would like to
> know whether the beginning and end of any jobs correspond to the
> beginning and end of the high-load period.
>
> I can do something like
>
>   sacct -S 2016-07-13T22:20 -E 2016-07-14T06:20 -s RUNNING -X | grep COMPLETED
>
> to get jobs which were running in the period and subsequently completed,
> but this includes jobs which were running both before and after the
> period in question.

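One possible workaround is to let sacct report the Start and End fields and filter on them afterwards. The following is only a minimal sketch: it assumes the input lines come from something like `sacct -S 2016-07-13T22:20 -E 2016-07-14T06:20 -X -n -P -o JobID,Start,End,State`, that timestamps are in sacct's default `YYYY-MM-DDTHH:MM:SS` form, and that a job still running reports `Unknown` as its end time. The function name `jobs_within` and the sample job records are made up for illustration.

```python
from datetime import datetime

# Assumed timestamp layout of the Start/End fields (sacct's default output).
FMT = "%Y-%m-%dT%H:%M:%S"


def jobs_within(sacct_lines, window_start, window_end):
    """Return (job_id, state) for jobs that started AND ended inside the window.

    Each input line is expected to look like 'JobID|Start|End|State'.
    """
    matches = []
    for line in sacct_lines:
        line = line.strip()
        if not line:
            continue
        job_id, start, end, state = line.split("|")[:4]
        try:
            started = datetime.strptime(start, FMT)
            ended = datetime.strptime(end, FMT)
        except ValueError:
            # Non-timestamp values such as "Unknown" (job not finished yet).
            continue
        if window_start <= started and ended <= window_end:
            matches.append((job_id, state))
    return matches


# Hypothetical sample records, not real accounting data.
sample = [
    "101|2016-07-13T21:00:00|2016-07-14T07:00:00|COMPLETED",  # spans the window
    "102|2016-07-13T22:25:00|2016-07-14T06:10:00|COMPLETED",  # inside
    "103|2016-07-13T23:00:00|Unknown|RUNNING",                # no end time
    "104|2016-07-13T22:30:00|2016-07-14T05:00:00|FAILED",     # inside
]
window = (datetime(2016, 7, 13, 22, 20), datetime(2016, 7, 14, 6, 20))
inside = jobs_within(sample, *window)
print(inside)
```

Filtering on the timestamps rather than on the job state keeps failed or cancelled jobs in the result, which seems useful when looking for a misbehaving job. To feed it real data, the sacct output could be piped in and read with `sys.stdin`.
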
As this specific question didn't elicit any responses, I would be
interested in answers to these more general ones:

Do you try to relate events within your system to specific, possibly
misbehaving jobs?

If so, how?

If not, why not?

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de