You may want to look at my setup.  See:

    http://www.mail-archive.com/[email protected]/msg04430.html

One of our clusters currently has nodes divided as:

    171 large     < 4 weeks
     15 medium    < 2 days
      9 small     < 4 hours
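In Grid Engine terms, a split like that can be expressed as separate
cluster queues, each bound to a host group and given a hard wallclock
limit.  A rough sketch (queue and host-group names here are my
illustrations, not our actual configuration):

    # Abridged `qconf -sq <queue>` fragments; names are illustrative
    qname       large.q
    hostlist    @large_hosts      # 171 nodes
    h_rt        672:00:00         # hard run-time limit: 4 weeks

    qname       medium.q
    hostlist    @medium_hosts     # 15 nodes
    h_rt        48:00:00          # 2 days

    qname       small.q
    hostlist    @small_hosts      # 9 nodes
    h_rt        04:00:00          # 4 hours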

Other comments below...

On Fri, 31 Aug 2012 at 14:41 -0000, S Joe wrote:

> The majority (90%) of jobs we get take < 1hr, are CPU bound, and
> use up all 8 cores on one host.  But because of the nature of the
> work we do, users will infrequently submit jobs that run the same
> program but can take multiple days to weeks.

This is like us.

> The setup I have right now has an "all.q" and have fair sharing
> turned on (and an urgent.q queue for urgent jobs).

I have no urgent queue.  When something truly urgent arises, we just
manually force things (it hasn't happened yet).  We can bump job
priority, kill other jobs, move hosts between host groups, etc.

I do occasionally adjust the node count in the different host groups
according to workload.
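Those manual interventions are all standard commands.  A sketch (the
job IDs are made up, and note that raising a job's priority above 0
normally requires operator privileges):

    qalter -p 100 4242          # raise scheduling priority of job 4242
    qdel 4243                   # kill a job outright
    qconf -mhgrp @large_hosts   # edit a host group in $EDITOR to move nodes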

> So naturally what happens every week or so one of the users submits
> a batch of long running jobs, and these generally take over the
> whole cluster locking out everyone else until they are done.

This used to happen with us; now we limit the number of nodes running
the larger/longer jobs.

> So I thought I'd turn to queue subordination and create a long.q
> that has no time limit and is subordinate to the all.q which would
> have an 1 hr limit.

I don't see much point in subordination for our system: many jobs are
memory intensive, and job suspension doesn't free memory resources
(our nodes are diskless, with no swap space).
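For anyone who does want subordination, the relevant knob is
`subordinate_list` in the configuration of the higher-priority queue.
A slot-wise example (queue names and the threshold are illustrative):

    # Fragment of `qconf -sq all.q`
    qname             all.q
    subordinate_list  long.q=6   # suspend this host's long.q instance
                                 # once 6 or more all.q slots are busy there

With a plain `subordinate_list long.q` (no threshold), long.q is
suspended as soon as any all.q slot on the host is in use.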

> This works well except that we have periods of time where we have
> *lots* of 1 hr jobs.  What ends up happening is that the long.q jobs
> stay suspended for...well...long periods of time and are essentially
> "locked out". 

When this happens with us, within 4 hours any short jobs running on
the large-job nodes will have finished, and new jobs will start
according to the scheduling parameters.

> What I'd like is to not have to dedicate 100% of the resources to
> the short jobs when we become inundated with them.

This works for us, but I think of it the other way: the short jobs are
able to use the resources reserved for large jobs when there are no
large jobs waiting.

> I looked into adjusting the slot counts, and using slot
> subordination but that doesn't appear to do what I need as it seems
> to function on queue instances, not cluster queues (correct me if
> I'm wrong here).  Is there a better solution?  Maybe using
> load/suspend parameters and just letting the jobs run?

When it becomes a bigger issue, I'll push for the large jobs to
support checkpointing; checkpointable jobs will then get higher
priority, since they can be moved out of the way as needed.  I haven't
explored checkpointing yet.
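For reference, Grid Engine models this with a checkpointing
environment (created via `qconf -ackpt`); jobs opt in with
`qsub -ckpt <name>`.  A hypothetical environment (all values here are
assumptions for illustration):

    # `qconf -sckpt bigjob_ckpt` (illustrative)
    ckpt_name        bigjob_ckpt
    interface        APPLICATION-LEVEL
    ckpt_command     /usr/local/bin/save_state.sh $job_id
    migr_command     none
    restart_command  none
    clean_command    none
    ckpt_dir         /scratch/ckpt
    signal           none
    when             sx        # checkpoint on suspend and on shutdown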

Stuart Barkley
-- 
I've never been lost; I was once bewildered for three days, but never lost!
                                        --  Daniel Boone
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
