Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread Skylar Thompson
On Tue, Jun 12, 2018 at 11:08:44AM -0400, Prentice Bisbal wrote: > On 06/12/2018 12:33 AM, Chris Samuel wrote: > > >Hi Prentice! > > > >On Tuesday, 12 June 2018 4:11:55 AM AEST Prentice Bisbal wrote: > > > >>To make this work, I will be using job_submit.lua to apply this logic > >>and assign a

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread Ryan Novosielski
> On Jun 12, 2018, at 11:08 AM, Prentice Bisbal wrote: > > On 06/12/2018 12:33 AM, Chris Samuel wrote: > >> Hi Prentice! >> >> On Tuesday, 12 June 2018 4:11:55 AM AEST Prentice Bisbal wrote: >> >>> To make this work, I will be using job_submit.lua to apply this logic >>> and assign a job

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread Prentice Bisbal
On 06/12/2018 12:33 AM, Chris Samuel wrote: Hi Prentice! On Tuesday, 12 June 2018 4:11:55 AM AEST Prentice Bisbal wrote: To make this work, I will be using job_submit.lua to apply this logic and assign a job to a partition. If a user requests a specific partition not in line with these

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread Skylar Thompson
On Tue, Jun 12, 2018 at 02:28:25PM +1000, Chris Samuel wrote: > On Sunday, 10 June 2018 1:48:18 AM AEST Skylar Thompson wrote: > > > Unfortunately we don't have a mechanism to limit > > network usage or local scratch usage > > Our trick in Slurm is to use the slurmdprolog script to set an XFS

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread Chris Samuel
On Tuesday, 12 June 2018 6:13:49 PM AEST Kilian Cavalotti wrote: > Slurm has a scheduler option that could probably help with that: > https://slurm.schedmd.com/slurm.conf.html#OPT_pack_serial_at_end Ah I knew I'd seen something like that before! I got fixated on CR_Pack_Nodes which is not for
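[Editor's note: the option Kilian points to is enabled in slurm.conf; a minimal sketch, noting that per the slurm.conf docs pack_serial_at_end applies when using the select/cons_res plugin:]

```
# slurm.conf fragment (sketch): place serial jobs at the end of the
# available node list rather than best-fit, so they fragment fewer nodes.
SelectType=select/cons_res
SchedulerParameters=pack_serial_at_end
```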

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread Kilian Cavalotti
On Tue, Jun 12, 2018 at 6:33 AM, Chris Samuel wrote: > However, I do think Scott's approach is potentially very useful, by directing > jobs < full node to one end of a list of nodes and jobs that want full nodes > to the other end of the list (especially if you use the partition idea to > ensure

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread John Hearns via Beowulf
> However, I do think Scott's approach is potentially very useful, by directing > jobs < full node to one end of a list of nodes and jobs that want full nodes > to the other end of the list (especially if you use the partition idea to > ensure that not all nodes are accessible to small jobs).
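[Editor's note: the "partition idea" above can be sketched in slurm.conf; node ranges and partition names here are hypothetical, for illustration only:]

```
# slurm.conf fragment (hypothetical names/ranges): small jobs default to a
# limited pool of nodes, whole-node/multi-node jobs can reach everything.
PartitionName=small Nodes=node[001-016] MaxNodes=1 Default=YES State=UP
PartitionName=large Nodes=node[001-064] MinNodes=1 State=UP
```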

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-11 Thread Chris Samuel
Hi Prentice! On Tuesday, 12 June 2018 4:11:55 AM AEST Prentice Bisbal wrote: > To make this work, I will be using job_submit.lua to apply this logic > and assign a job to a partition. If a user requests a specific partition > not in line with these specifications, job_submit.lua will reassign

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-11 Thread Chris Samuel
On Sunday, 10 June 2018 1:48:18 AM AEST Skylar Thompson wrote: > Unfortunately we don't have a mechanism to limit > network usage or local scratch usage Our trick in Slurm is to use the slurmdprolog script to set an XFS project quota for that job ID on the per-job directory (created by a plugin
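[Editor's note: a minimal sketch of that prolog trick, assuming the per-job directory already exists (created by a plugin, per the message) and that local scratch is XFS mounted with project quotas (prjquota). The paths, quota size, and use of the job ID as the XFS project ID are assumptions; DRY_RUN defaults to on so the sketch only prints the commands it would run.]

```shell
#!/bin/sh
# Sketch of a slurmd prolog fragment capping per-job local scratch with an
# XFS project quota. Values below are illustrative assumptions, not Slurm
# defaults; the job directory is created elsewhere (e.g. a SPANK plugin).
SCRATCH_FS=${SCRATCH_FS:-/local-scratch}
SLURM_JOB_ID=${SLURM_JOB_ID:-12345}   # set by slurmd in a real prolog
JOB_DIR="$SCRATCH_FS/job-$SLURM_JOB_ID"
QUOTA=${SCRATCH_QUOTA:-100g}
DRY_RUN=${DRY_RUN:-1}                 # default: print, don't execute

run() {
    # Echo the command in dry-run mode so the sketch is safe to run
    # on a machine without an XFS scratch filesystem.
    if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}

# Tag the job directory with a project ID (the numeric job ID), then set a
# hard block limit for that project.
run xfs_quota -x -c "project -s -p $JOB_DIR $SLURM_JOB_ID" "$SCRATCH_FS"
run xfs_quota -x -c "limit -p bhard=$QUOTA $SLURM_JOB_ID" "$SCRATCH_FS"
```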

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-11 Thread Prentice Bisbal
Chris, I'm dealing with this problem myself right now. We use Slurm here. We really have one large, very heterogeneous cluster that's treated as multiple smaller clusters through creating multiple partitions, each with their own QOS. We also have some users who don't understand the

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-11 Thread Skylar Thompson
On Mon, Jun 11, 2018 at 02:36:14PM +0200, John Hearns via Beowulf wrote: > Skylar Thompson wrote: > >Unfortunately we don't have a mechanism to limit > >network usage or local scratch usage, but the former is becoming less of a > >problem with faster edge networking, and we have an opt-in

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-11 Thread John Hearns via Beowulf
Skylar Thompson wrote: >Unfortunately we don't have a mechanism to limit >network usage or local scratch usage, but the former is becoming less of a >problem with faster edge networking, and we have an opt-in bookkeeping mechanism >for the latter that isn't enforced but works well enough to keep

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-11 Thread Chris Samuel
On Sunday, 10 June 2018 10:33:22 PM AEST Scott Atchley wrote: [lists] > Yes. It may be specific to Cray/Moab. No, I think that applies quite nicely to Slurm too. > Good luck. If you want to discuss, please do not hesitate to ask. We have > another paper pending along the same lines. Thanks!

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-10 Thread Scott Atchley
On Sun, Jun 10, 2018 at 4:53 AM, Chris Samuel wrote: > On Sunday, 10 June 2018 1:22:07 AM AEST Scott Atchley wrote: > > > Hi Chris, > > Hey Scott, > > > We have looked at this _a_ _lot_ on Titan: > > > > A Multi-faceted Approach to Job Placement for Improved Performance on > > Extreme-Scale

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-10 Thread Skylar Thompson
On Sun, Jun 10, 2018 at 06:46:04PM +1000, Chris Samuel wrote: > On Sunday, 10 June 2018 1:48:18 AM AEST Skylar Thompson wrote: > > > We're a Grid Engine shop, and we have the execd/shepherds place each job in > > its own cgroup with CPU and memory limits in place. > > Slurm supports cgroups

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-10 Thread Chris Samuel
On Sunday, 10 June 2018 1:22:07 AM AEST Scott Atchley wrote: > Hi Chris, Hey Scott, > We have looked at this _a_ _lot_ on Titan: > > A Multi-faceted Approach to Job Placement for Improved Performance on > Extreme-Scale Systems > > https://ieeexplore.ieee.org/document/7877165/ Thanks! IEEE has

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-10 Thread Chris Samuel
On Sunday, 10 June 2018 1:48:18 AM AEST Skylar Thompson wrote: > We're a Grid Engine shop, and we have the execd/shepherds place each job in > its own cgroup with CPU and memory limits in place. Slurm supports cgroups as well (and we use it extensively), the idea here is more to try and

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-09 Thread Skylar Thompson
We're a Grid Engine shop, and we have the execd/shepherds place each job in its own cgroup with CPU and memory limits in place. This lets our users make efficient use of our HPC resources whether they're running single-slot jobs, or multi-node jobs. Unfortunately we don't have a mechanism to limit
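[Editor's note: a generic cgroup v2 sketch of what execd/shepherd does conceptually — place a job in its own cgroup with CPU and memory caps. This is not Grid Engine's actual implementation; CGROOT defaults to a scratch directory so the sketch can run unprivileged, whereas a real deployment would write under /sys/fs/cgroup.]

```shell
#!/bin/sh
# Conceptual sketch only: confine a job to a cgroup with CPU/memory limits.
CGROOT=${CGROOT:-$(mktemp -d)}   # real systems: /sys/fs/cgroup (needs root)
JOB_ID=${JOB_ID:-42}             # hypothetical job ID
CG="$CGROOT/job-$JOB_ID"

mkdir -p "$CG"
# cgroup v2 cpu.max is "<quota_us> <period_us>": 400000/100000 = 4 CPUs.
echo "400000 100000" > "$CG/cpu.max"
# memory.max in bytes: cap the job at 16 GiB.
echo $((16 * 1024 * 1024 * 1024)) > "$CG/memory.max"
# The shepherd would then move the job's PID into "$CG/cgroup.procs".
```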

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-09 Thread Scott Atchley
Hi Chris, We have looked at this _a_ _lot_ on Titan: A Multi-faceted Approach to Job Placement for Improved Performance on Extreme-Scale Systems https://ieeexplore.ieee.org/document/7877165/ The issue we have is small jobs "inside" large jobs interfering with the larger jobs. The item that is

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-09 Thread Chris Samuel
On Saturday, 9 June 2018 12:39:02 AM AEST Bill Abbott wrote: > We set PriorityFavorSmall=NO and PriorityWeightJobSize to some > appropriately large value in slurm.conf, which helps. I guess that helps getting jobs going (and we use something similar), but my question was more about placement.

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-09 Thread Chris Samuel
On Saturday, 9 June 2018 12:16:16 AM AEST Paul Edmon wrote: > Yeah this one is tricky. In general we take the wild-west approach here, but > I've had users use --contiguous and their job takes forever to run. :-) > I suppose one method would be to enforce that each job take a full node > and

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-08 Thread David Mathog
This isn't quite the same issue, but several times I have observed a large multiCPU machine lock up because the accounting records associated with a zillion tiny rapidly launched jobs made an enormous /var/account/pacct file and filled the small root filesystem. Actually it wasn't usually
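[Editor's note: one common mitigation is rotating the accounting file; a hypothetical logrotate fragment is below. The path comes from the message, but the policy is an assumption, and distributions that ship psacct usually include their own rotation config.]

```
# /etc/logrotate.d/pacct -- sketch; adjust to the distribution's
# process-accounting setup.
/var/account/pacct {
    daily
    rotate 4
    compress
    missingok
    notifempty
    create 0600 root root
}
```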

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-08 Thread Bill Abbott
We set PriorityFavorSmall=NO and PriorityWeightJobSize to some appropriately large value in slurm.conf, which helps. We also used to limit the number of total jobs a single user could run to something like 30% of the cluster, so a user could run a single mpi job that takes all nodes, but
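[Editor's note: those two settings as they would appear in slurm.conf; the weight value is illustrative only, and PriorityWeightJobSize takes effect with the multifactor priority plugin:]

```
# slurm.conf fragment (sketch): weight larger jobs more heavily.
PriorityType=priority/multifactor
PriorityFavorSmall=NO
PriorityWeightJobSize=100000
```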

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-08 Thread Paul Edmon
Yeah this one is tricky.  In general we take the wild-west approach here, but I've had users use --contiguous and their job takes forever to run. I suppose one method would be to enforce that each job take a full node and parallel jobs always have contiguous.  As I recall Slurm will

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-08 Thread Andrew Mather
Hi Chris, > Message: 2 > Date: Fri, 08 Jun 2018 17:21:56 +1000 > From: Chris Samuel > To: beowulf@beowulf.org > Subject: [Beowulf] Avoiding/mitigating fragmentation of systems by > small jobs? > Message-ID: <2427060.afPWsf2KXH@quad> > Content-Type: text/plain; charset="us-ascii" > >

Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-08 Thread John Hearns via Beowulf
Chris, good question. I can't give a direct answer there, but let me share my experiences. In the past I managed SGI ICE clusters and a large memory UV system with PBSPro queuing. The engineers submitted CFD solver jobs using scripts, and we only allowed them to use a multiple of N cpus, in fact