On 26.04.2012, at 19:06, Stuart Barkley wrote:

> On Thu, 26 Apr 2012 at 12:35 -0000, Rayson Ho wrote:
>
>> Is fairshare not used for a reason? It sounds to me that in your
>> scenario only one account is using most of the resources of the
>> cluster.
>
> I didn't go into it because I don't think fairshare is directly
> related to the issue (I could be wrong). I think I have fairshare
> working now, but our usage is bursty and I'm still learning about
> fairshare. I'll write up my experiences once I learn more.
>
> Fairshare is just one component of priority. I used qalter to force
> the job priority up as a way to bypass a possibly faulty fairshare
> configuration.
>
> We also do not have preemption configured (another discussion for
> another time).
>
> We do not limit users artificially; if nothing else is running on the
> cluster, a user can get 100% of it. In the past, when I tried to
> limit users to no more than ~75% of total resources, I received
> push-back about having idle systems while jobs were waiting to run.
> This is more a management/user-counseling issue than a technical one.
>
> Some resources are reserved for shorter jobs, and currently there are
> other user jobs flowing through the system with shorter h_rt values.
>
> For this specific case:
>
> The first large array job was submitted when the system was idle, so
> this user was able to grab almost all of the resources.
>
> Of the 1500+ array job tasks, a couple may end every hour, but these
> mostly release small, fragmented resources.
>
> The top-priority job (the one I am concerned about) needs larger
> memory resources, so it cannot fit into the resources released by a
> single task of the large array job.
>
> The scheduler then moves down the job list and finds another task of
> the first array job. This fits into the available resources, so it
> gets started.
IIRC the array tasks are not handled as individual jobs but as one
job, i.e. once the first task is scheduled, all the others will
follow.

-- Reuti

> I was hoping/expecting that reservations (qsub -R and the sched_conf
> max_reservations setting) were something that would help with this
> problem.
>
> Stuart
> --
> I've never been lost; I was once bewildered for three days, but never lost!
>     -- Daniel Boone
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
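[Editor's note: for readers following along, manually bumping a
pending job's priority the way Stuart describes could look roughly
like this. This is a sketch; the job ID 424242 is a made-up
placeholder, and raising -p above 0 requires operator or manager
rights in Grid Engine.]

    # Raise the POSIX priority of a pending job
    # (range -1023..1024; only operators/managers may go above 0):
    qalter -p 500 424242

    # Inspect the job's attributes afterwards:
    qstat -j 424242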
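[Editor's note: the ~75% cap Stuart experimented with could, as one
possible approach (not necessarily what he used), be expressed as a
Grid Engine resource quota set. The name, description, and slot count
below are placeholders; the rule is created with qconf -arqs.]

    {
       name         cap_user_slots
       description  "Keep any single user below ~75% of the cluster"
       enabled      TRUE
       limit        users {*} to slots=96
    }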
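[Editor's note: the reservation setup mentioned in the last quoted
paragraph is, as I understand it, a two-part configuration. This is an
untested sketch; the max_reservations value, resource requests, and
script name are placeholders. max_reservations must be non-zero in
the scheduler configuration, and the job itself must request a
reservation with -R y. Backfilling around the reserved resources
works best when the shorter jobs declare h_rt limits.]

    # Enable resource reservation scheduler-wide: in the scheduler
    # configuration, set max_reservations to the number of jobs
    # allowed to hold reservations, e.g. 32:
    qconf -msconf

    # Submit the high-priority job requesting a reservation, with a
    # runtime limit so the scheduler can plan around it:
    qsub -R y -l h_rt=4:0:0,h_vmem=32G big_memory_job.sh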
