Hi David,
Baker D.J. <[email protected]> writes:

> Hi Loris,
>
> I've pasted the output of "sshare -la" below. It's not too easy to
> read as the lines are looped around. Does this make any sense to you?
> It is disconcerting that a lot of the values are shown as undefined
> (nan).
>
> Best regards,
> David.
>
> [root@blue30 etc]# sshare -la
>              Account       User  RawShares  NormShares    RawUsage   
> NormUsage  EffectvUsage  FairShare                    GrpTRESMins             
>        TRESRunMins 
> -------------------- ---------- ---------- ----------- ----------- 
> ----------- ------------- ---------- ------------------------------ 
> ------------------------------ 
> root                                          1.000000 9223372036854775808    
>               1.000000   0.500000                                
> cpu=15838674,mem=0,energy=0,n+ 
>  root                      root          1    0.055556           0         
> nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>  gpuusers                                2    0.111111 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   gpuusers                 djb1          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   gpuusers                  hpc          1    0.055556           0         
> nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>  research                               15    0.833333 9223372036854775808    
>      nan           nan        nan                                
> cpu=15838674,mem=0,energy=0,n+ 
>   research              ab24g12          1    0.055556   875866239         
> nan           nan        nan                                
> cpu=5347208,mem=0,energy=0,no+ 
>   research             cica1d14          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research                 djb1          1    0.055556 9223372036854775808    
>      nan           nan        nan                                 
> cpu=80,mem=0,energy=0,node=80 
>   research              dpm1u13          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research              gtj1y12          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research                  hpc          1    0.055556           0         
> nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research                  icw          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research              jag1g13          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research              jec1f12          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research              lmr1u16          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research               mb1a10          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research              mjp1m12          1    0.055556 9223372036854775808    
>      nan           nan        nan                                
> cpu=10491385,mem=0,energy=0,n+ 
>   research               ph1m12          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research              srw1g10          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0 
>   research               tp1v09          1    0.055556 9223372036854775808    
>      nan           nan        nan                                   
> cpu=0,mem=0,energy=0,node=0

That looks very odd.  The RawUsage values can't be correct, and they are
probably what causes the NaNs: calculating NormUsage presumably fails
because the sum of the RawUsage values is not a sensible number.  The
number 9223372036854775808 equals 2**63, i.e. 1 larger than the largest
signed 64-bit integer, which looks like some sort of overflow or type
mismatch.
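
To make the arithmetic concrete, here is a small Python sketch.  It only
checks the numbers and shows how one NaN spreads through a normalisation;
it does not reproduce what Slurm does internally.  The names (INT64_MAX,
observed, usages) are made up for illustration, and the float round-trip
is just one possible way a value of exactly 2**63 could arise, not a
confirmed cause.

```python
import math

# Largest value a signed 64-bit integer can hold.
INT64_MAX = 2**63 - 1

# The RawUsage value that appears repeatedly in the sshare output.
observed = 9223372036854775808

print(observed == 2**63)         # is it exactly 2**63?
print(observed - INT64_MAX)      # distance from the signed 64-bit maximum

# Assumption: one possible route to exactly 2**63.  A double cannot
# represent 2**63 - 1, so converting it to float rounds up to 2**63.
print(int(float(INT64_MAX)) == observed)

# Illustration only: once a single NaN enters a sum, every value
# normalised by that sum is NaN as well.
usages = [875866239.0, float("nan")]
total = sum(usages)
norm = [u / total for u in usages]
print(all(math.isnan(x) for x in norm))
```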

Unless anyone else has any ideas, I would be tempted to say that your
database is borked and you need to start over again.

Sorry not to be more helpful :-(

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email [email protected]
