[slurm-dev] Re: Slurm Diamond Collectors

2016-07-07 Thread Eliot Eshelman


Very cool!

Not to hijack your thread, but this invites the question: is Ganglia on 
its way out or do you use both in parallel? The Grafana dashboards are 
so beautiful that it's hard not to put all the data there.


Eliot


On 07/07/2016 03:07 PM, Paul Edmon wrote:


For those using Graphite and Diamond, check these out, as they may be 
useful.


https://github.com/fasrc/slurm-diamond-collector

-Paul Edmon-



--
Eliot Eshelman
Microway, Inc.


[slurm-dev] Slurm Diamond Collectors

2016-07-07 Thread Paul Edmon


For those using Graphite and Diamond, check these out, as they may be useful.

https://github.com/fasrc/slurm-diamond-collector

-Paul Edmon-


[slurm-dev] Re: slurmdbd association lifetime/expiry

2016-07-07 Thread Paddy Doyle

Hi Stuart,

Exceedingly late reply! I'm very sorry; I missed your question completely, and only
found it just now while searching for something else. It's probably too late to
address your issue, but hopefully it might help someone else.

Yes, the negative balance in sbank was something we didn't anticipate properly.
It actually doesn't make sense to combine data from two sources together like
that (local usage data which can decay, and historical slurmdbd usage which
always increases).

So we (recently) changed sbank to only use the local non-dbd usage values.
That will give an accurate bank balance, in that the local usage values are what
slurm itself uses to decide if a job has enough TRES CPU balance to start (or if
the job will have to wait until the usage decays or is reset).
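To make the mismatch concrete, here is a minimal sketch of why subtracting ever-growing historical usage from an allocation can go negative while the decayed local usage cannot explain it. It is a toy model only: the allocation, half-life, job list, and function names are all hypothetical, and it assumes a simple exponential half-life decay rather than reproducing Slurm's or sbank's actual calculation.

```python
# Toy model (assumption): local usage decays exponentially with a fixed
# half-life, while the accounting-database total only ever grows.
# All names and numbers below are hypothetical illustrations.

def decayed_usage(jobs, now, half_life):
    """Sum of job costs, each halved for every half_life elapsed since it ran."""
    return sum(cost * 0.5 ** ((now - day) / half_life)
               for day, cost in jobs if day <= now)

def cumulative_usage(jobs, now):
    """Sum of job costs with no decay, like a historical usage report."""
    return sum(cost for day, cost in jobs if day <= now)

ALLOCATION = 1000.0   # hypothetical CPU-hour allocation
HALF_LIFE = 7.0       # hypothetical decay half-life, in days
jobs = [(0.0, 600.0), (14.0, 600.0)]   # (day the job ran, CPU hours charged)

now = 14.0
local = decayed_usage(jobs, now, HALF_LIFE)    # first job has decayed to a quarter
historical = cumulative_usage(jobs, now)       # both jobs counted in full

balance_local = ALLOCATION - local             # stays positive
balance_hist = ALLOCATION - historical         # goes negative

print(f"local usage {local}, balance {balance_local}")
print(f"historical usage {historical}, balance {balance_hist}")
```

In this sketch the second job was allowed to start because the decayed usage had dropped well below the allocation, yet the undecayed total already exceeds it, which is the kind of negative balance described above.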

https://github.com/paddydoyle/slurm-bank

(sbank can be used as a simple sreport wrapper with the '-s -mm-dd' if you
still want something like the previous sbank behaviour)

Paddy

On Sun, May 31, 2015 at 06:21:53AM -0700, Stuart Rankin wrote:

> 
> Hi Paddy,
> 
> Out of curiosity, have you made further modifications to sbank to stop
> balances becoming negative after they reach zero but the association usage
> decays and the actual usage as seen by sreport is allowed to continue to
> increase?
> 
> Best regards
> 
> Stuart
> 
> On 23/04/15 10:24, Paddy Doyle wrote:
> > This works reasonably well for us, but we are investigating other tweaks. We
> > would actually like to emphasise accounting/reporting more than restricting
> > use; we want to encourage as much usage of the systems as possible, and don't
> > want project limits to prevent people from running jobs when the resources
> > are idle.
> > In our current sbank setup, people's `available balance' can still drop to 0
> > when their usage is high enough, and before the half-life decays it, and so
> > with a 0 balance their jobs will wait in the queue until their usage decays
> > enough.
> > We are investigating the expired/normal QoS approach, to see if that will
> > give us a better system usage.
> 
> -- 
> Dr. Stuart Rankin
> 
> Senior System Administrator
> High Performance Computing Service
> University of Cambridge
> Email: sj...@cam.ac.uk
> Tel: (+)44 1223 763517
> 

-- 
Paddy Doyle
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
Phone: +353-1-896-3725
http://www.tchpc.tcd.ie/


[slurm-dev] Slurm support for intel Phi

2016-07-07 Thread Andy Kociolek

Hi,

I'm testing Slurm on Ubuntu 14 with 2 Intel Phi cards (Knights Corner). I can't 
find any info on how Slurm addresses the cards. Is it through the config file? I 
see configuration examples for GPU support but not for the Intel MIC architecture.

Could you help?

Andy