I've got a real mixture of nodes, some older than Sandy Bridge (where RAPL
isn't available), and I find that IPMI does a better job of capturing
'whole system' power use.  I was hoping to pick up power from my GPU nodes
too, but those nodes don't report via IPMI either.
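
For reference, a rough sketch of how the choice between the two plugins
looks in the config files. The plugin names and the EnergyIPMIFrequency
value are the ones quoted further down this thread; the comments are my
own summary rather than anything from the Slurm docs, so treat them as
assumptions and check them against your version:

```
# slurm.conf: select one energy-accounting plugin
# Whole-node power as reported by the BMC (Slurm must be built with FreeIPMI):
AcctGatherEnergyType=acct_gather_energy/ipmi
# Alternative: CPU-side RAPL counters, Intel Sandy Bridge or newer only:
#AcctGatherEnergyType=acct_gather_energy/rapl

# acct_gather.conf: the EnergyIPMI* settings apply to the ipmi plugin
EnergyIPMIFrequency=30
```

Discrete GPUs sit outside what RAPL measures, so neither setting by itself
is likely to pick up GPU power.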


-- 
*Nathan Harper* // IT Systems Architect


On 10 November 2014 13:13, Antony Cleave <[email protected]> wrote:

>
> It turns out Slurm wasn't configured with --with-freeipmi, and FreeIPMI
> isn't in the standard path. I now have a new build with this configured
> properly.
>
> However, while confirming that, I saw the option to use the Intel RAPL
> API instead. That now looks to be running perfectly on all the nodes, and
> the PDF I have says this method should have a lower overhead.
>
> Is there any advantage to using the IPMI plugin over the RAPL plugin? I
> have only Intel nodes here, no AMD, but I do have GPUs in a subset of
> nodes.
>
> Thanks!
>
> Antony
>
>
> On 07/11/2014 09:54, Thomas Cadeau wrote:
>
>> Hi Antony,
>>
>> Can you check that freeipmi-devel is installed where you are compiling?
>> If it's not in a regular path, you have to pass the option
>> --with-freeipmi=/path/to/freeipmi-devel when running configure.
>>
>> Anyway, if acct_gather_energy/ipmi was not compiled, you should have a
>> message in config.log saying freeipmi-devel was not found.
>>
>> Thomas
>>
>> ________________________________________
>> From: Antony Cleave [[email protected]]
>> Sent: Thursday, 6 November 2014 17:00
>> To: slurm-dev
>> Subject: [slurm-dev] Struggling with configuration of
>> acct_gather_energy/ipmi
>>
>> Hi All
>>
>> Is there a specific configure option to build the
>> acct_gather_energy/ipmi plugin?
>>
>> I'm using Slurm 14.03.0 and I'm trying to configure
>> acct_gather_energy/ipmi with the following settings in slurm.conf:
>> JobAcctGatherType=jobacct_gather/linux
>> JobAcctGatherFrequency=30
>> AcctGatherEnergyType=acct_gather_energy/ipmi
>>
>> and the following in acct_gather.conf:
>> EnergyIPMIFrequency=30
>> EnergyIPMICalcAdjustment=yes
>> EnergyIPMIUsername=****
>> EnergyIPMIPassword=****
>>
>> it seems to accept the configuration:
>>
>> # scontrol show config
>> Configuration data as of 2014-11-06T15:34:26
>> . . .
>> AcctGatherEnergyType    = acct_gather_energy/ipmi
>> . . .
>> JobAcctGatherFrequency  = 30
>> JobAcctGatherType       = jobacct_gather/linux
>>
>> However, when I run a job I get the following in the logs on the slave
>> nodes:
>>
>> [2014-11-06T15:17:20.727] Launching batch job 1055 for UID 1000
>> [2014-11-06T15:17:20.739] Received cpu frequency information for 16 cpus
>> [2014-11-06T15:17:20.743] Couldn't find the specified plugin name for
>> acct_gather_energy/ipmi looking at all files
>> [2014-11-06T15:17:20.766] cannot find acct_gather_energy plugin for
>> acct_gather_energy/ipmi
>> [2014-11-06T15:17:20.766] cannot create acct_gather_energy context for
>> acct_gather_energy/ipmi
>> [2014-11-06T15:17:20.768] Couldn't find the specified plugin name for
>> acct_gather_energy/ipmi looking at all files
>> [2014-11-06T15:17:20.769] cannot find acct_gather_energy plugin for
>> acct_gather_energy/ipmi
>> [2014-11-06T15:17:20.769] cannot create acct_gather_energy context for
>> acct_gather_energy/ipmi
>> [2014-11-06T15:17:20.770] WARNING: We will use a much slower algorithm
>> with proctrack/pgid, use Proctracktype=proctrack/linuxproc or some other
>> proctrack when using jobacct_gather/linux
>> [2014-11-06T15:17:20.771] [1055] gres/mic unable to set OFFLOAD_DEVICES,
>> no device files configured
>> [2014-11-06T15:17:20.823] [1055] Couldn't find the specified plugin name
>> for acct_gather_energy/ipmi looking at all files
>> [2014-11-06T15:17:20.825] [1055] cannot find acct_gather_energy plugin
>> for acct_gather_energy/ipmi
>> [2014-11-06T15:17:20.826] [1055] cannot create acct_gather_energy
>> context for acct_gather_energy/ipmi
>> [2014-11-06T15:17:21.843] [1055] Couldn't find the specified plugin name
>> for acct_gather_energy/ipmi looking at all files
>> [2014-11-06T15:17:21.845] [1055] cannot find acct_gather_energy plugin
>> for acct_gather_energy/ipmi
>> [2014-11-06T15:17:21.845] [1055] cannot create acct_gather_energy
>> context for acct_gather_energy/ipmi
>> [2014-11-06T15:17:41.771] [1055] *** JOB 1055 CANCELLED AT
>> 2014-11-06T15:17:41 ***
>> [2014-11-06T15:17:41.829] [1055] sending REQUEST_COMPLETE_BATCH_SCRIPT,
>> error:0 status 15
>> [2014-11-06T15:17:41.831] [1055] done with job
>>
>> I can get power reporting with both the ipmi-sensors and ipmi-dcmi
>> commands:
>>
>> # ipmi-sensors --non-abbreviated-units | grep Watts
>> 95  | Pwr Consumption  | Current  | 126.00  | Watts  | 'OK'
>>
>> # ipmi-dcmi --get-system-power-statistics
>> Current Power                        : 140 Watts
>> Minimum Power over sampling duration : 72 watts
>> Maximum Power over sampling duration : 361 watts
>> Average Power over sampling duration : 132 watts
>> Time Stamp                           : 11/06/2014 - 12:39:20
>> Statistics reporting time period     : 1270838 milliseconds
>> Power Measurement                    : Active
>>
>> I'm obviously missing something, but I cannot find anything more in the
>> documentation.
>>
>> Thanks
>>
>> Antony
>>
>
