Hi folks,

Our instance of SGE (quite old, 2011.11p1_155) rolled over 10,000,000 jobs at the start of the month and then started again at 1, as expected. About ten days later we restarted the qmaster a few times, with the JID nearing 20k (it was segfaulting; originally we thought a user was querying the old qmaster with newer qstat binaries). After each of the restarts, though, the JID started at about 1100, not the number we were expecting. Because of this there are duplicate JID entries in accounting, and it's causing a bit of a problem for people who monitor for failed jobs.
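
To show what the duplicates look like from the monitoring side, here is a minimal sketch for listing reused JIDs. It assumes the usual colon-separated accounting file layout, with job_number as the 6th field and submission_time as the 9th, and it assumes no field before those contains a literal colon. The function name `reused_job_ids` is made up for illustration. Tasks of one array job share a submission time, so only a JID seen with more than one submission time is reported.

```python
from collections import defaultdict

# Assumed 0-based field positions in the colon-separated accounting file.
JOB_NUMBER = 5
SUBMISSION_TIME = 8


def reused_job_ids(lines):
    """Map each reused JID to the sorted list of its distinct submission times."""
    seen = defaultdict(set)
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) <= SUBMISSION_TIME:
            continue  # too short to be a usable record
        try:
            jid = int(fields[JOB_NUMBER])
            submitted = int(fields[SUBMISSION_TIME])
        except ValueError:
            continue  # malformed record, ignore it
        seen[jid].add(submitted)
    # A JID with more than one submission time was handed out more than once.
    return {jid: sorted(times) for jid, times in seen.items() if len(times) > 1}
```

Keying monitoring on the (JID, submission time) pair rather than on the JID alone would be one way to keep the two generations of jobs apart, under the same assumptions.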
Because of the nature of the workload, the currently running JIDs are now all over the place, with some JIDs in the queue still in the 9,99n,nnn range and some in four figures. If we need to restart the qmaster again, will the jobseqnum file be overwritten with the largest JID still in the queue (as suggested in http://arc.liv.ac.uk/pipermail/gridengine-users/2010-January/028661.html)?

I'm aware that this is an old version of SGE, and we're in the middle of transitioning to a much newer one, but this is a bit of an issue while we're still shifting workloads over.

Thanks,
Mike

--
Mike Wallis x503305
University of Edinburgh, Research Services,
Argyle House, 3 Lady Lawson Street,
Edinburgh, EH3 9DR