Hi,
I think this is a bug, but would like to hear other opinions :)
Over the years, we've run various versions of SGE and SoGE with a share
tree policy. I think it's always been the case that jobs in an error state
still attract tickets from the share tree policy, even though those jobs
aren't eligible to run.
e.g.
$ qstat -ext -s p | head -5
job-ID prior   ntckts  name user project department state cpu mem io tckts  ovrts otckt ftckt stckt  share queue slots ja-task-ID
---------------------------------------------------------------------------------------------------------------------------------
1      0.50050 0.50000 job  bob  DEF     defaultdep Eqw              137170 0     0     0     137170 0.59        16
2      0.39808 0.39758 job  sue  DEF     defaultdep qw               46411  0     0     0     46411  0.06        50
3      0.34129 0.34079 job  sue  DEF     defaultdep qw               39781  0     0     0     39781  0.05        50
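This means there are fewer tickets available for the scheduler to
distribute to other waiting jobs: in the output above, the Eqw job holds
137170 share tree tickets, more than the two qw jobs combined. In the
extreme case, it could mean other pending jobs end up with the same
priority as each other when they shouldn't.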
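
To get a rough idea of how many share tree tickets are tied up this way, something like the sketch below might help. It is only an illustration: eqw_tickets is a made-up helper name, and it assumes the pending-job layout shown above, where cpu, mem, io and queue are blank and the job name contains no spaces, so that state is the 8th whitespace-separated field and stckt the 13th.

```shell
# Hypothetical helper: total up stckt for pending jobs whose state starts
# with "E" (e.g. Eqw) versus all other pending jobs.
# Assumes `qstat -ext -s p` rows with blank cpu/mem/io/queue columns,
# i.e. field 8 = state and field 13 = stckt.
eqw_tickets() {
  awk '
    $1 !~ /^[0-9]+$/ { next }            # skip the header and separator lines
    $8 ~ /^E/        { held += $13; next }
                     { other += $13 }
    END { printf "held_by_error=%d other_pending=%d\n", held, other }
  '
}
```

It would be used as `qstat -ext -s p -u '*' | eqw_tickets`. Fed only the three jobs shown above, it reports 137170 tickets held by the errored job against 86192 for the two eligible ones.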
Given that ticket allocations are calculated afresh every scheduling
interval, I don't think there's any point in errored jobs attracting
tickets like this.
Is that right?
Mark
--
-----------------------------------------------------------------
Mark Dixon Email : [email protected]
HPC/Grid Systems Support Tel (int): 35429
Information Systems Services Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------