Hi,
I want to understand why the share/ticket values keep decreasing with each
subsequent job submission. The relevant output is pasted towards the end of
this mail:
----- Original Message -----
> Hi,
>
> Am 02.05.2013 um 11:28 schrieb Sangamesh Banappa:
>
> > ----- Original Message -----
> >> Hi,
> >>
> >> Am 01.05.2013 um 16:51 schrieb Sangamesh Banappa:
> >>
> >>> The cluster is configured with the share-based policy (user
> >>> based), with equal shares (100 shares) for every user.
> >>>
> >>> As per the below link:
> >>>
> >>> http://arc.liv.ac.uk/SGE/htmlman/htmlman5/sge_priority.html
> >>>
> >>> The job's priority is calculated as follows:
> >>>
> >>> prio = weight_priority * npprio +
> >>> weight_urgency * nurg +
> >>> weight_ticket * ntckts
> >>>
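To make sure I read this combination correctly, here is a tiny Python sketch I put together. The three weight values are only my assumption of typical defaults (what I believe `qconf -ssconf` shows out of the box), and the example inputs are made up; please substitute the values from your own scheduler configuration:

# Sketch of the prio combination from sge_priority(5).
# The three weights are assumptions (defaults I believe I saw in
# `qconf -ssconf`); substitute the values from your scheduler config.
WEIGHT_PRIORITY = 1.00   # weight_priority
WEIGHT_URGENCY  = 0.10   # weight_urgency
WEIGHT_TICKET   = 0.01   # weight_ticket

def prio(npprio, nurg, ntckts):
    # npprio, nurg and ntckts are each already normalized to [0, 1]
    return (WEIGHT_PRIORITY * npprio
            + WEIGHT_URGENCY * nurg
            + WEIGHT_TICKET  * ntckts)

# made-up example inputs, each already in [0, 1]
print(prio(0.5, 1.0, 0.5))   # 0.605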
> >>> Let's analyze this with an example: a parallel job requesting 20
> >>> cores. There are no other resource requests (such as #$ -l) in
> >>> the script.
> >>>
> >>> The value of npprio would be zero, because there is no #$ -p
> >>> <posix priority value> in the job script, and the admin does not
> >>> set a priority for the job manually either.
> >>>
> >>> Further,
> >>>
> >>> nurg = normalized urgency
> >>>
> >>> urg = rrcontr + wtcontr + dlcontr
> >>>
> >>> There is no user configured in the deadline users group, so
> >>> dlcontr should be zero, correct?
> >>>
> >>> rrcontr = sum of all (hrr)
> >>>
> >>> hrr = rurg * assumed_slot_allocation * request
> >>>
> >>> rurg -> taken from `qconf -sc | grep slots`; 1000 is the value
> >>> in the urgency column.
> >>>
> >>> assumed_slot_allocation = 20 (taken from #$ -pe orte 20)
> >>
> >> This is of course an implied resource request for slots.
> >>
> >>
> >>> request = NONE, hence it is 0 (there is no other resource
> >>> request via #$ -l).
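(Following Reuti's remark above that slots is an implied resource request, I suspect the slots line of `qconf -sc` contributes here even without any explicit `-l`. A back-of-the-envelope check under that assumption, in Python:)

# Assumption: for a plain "-pe orte 20" job the only urgency resource is the
# implied slots request, so rrcontr reduces to urgency(slots) * slot count.
slots_urgency = 1000            # urgency column for "slots" in qconf -sc
assumed_slot_allocation = 20    # from "#$ -pe orte 20"

rrcontr = slots_urgency * assumed_slot_allocation
print(rrcontr)   # 20000; the 4-slot jobs below show rrcontr = 4000 = 1000 * 4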
> >>>
> >>> Next is "wtcontr". How is the value of the waiting_time
> >>> contribution obtained? Is it calculated for this job only, or is
> >>> it a sum of the waiting_time of all previous jobs by the same user?
> >>>
> >>> I'm stuck here. Please guide me through the rest of the calculation.
> >>
> >> One way to check this behavior could be to set "weight_waiting_time
> >> 1.00000" together with "report_pjob_tickets TRUE" in the scheduler
> >> configuration and have a look at `qstat -urg`. Also worth noting
> >> are `qstat -pri` and `qstat -ext` for the overall computation.
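(My current guess, which I would like to have confirmed: wtcontr is per job, i.e. only this job's own waiting time scaled by weight_waiting_time, not a sum over the user's earlier jobs. A minimal sketch of that reading; the function name and signature are my own, nothing from the man page:)

import time

# Assumption: wtcontr = weight_waiting_time * (now - submit time of THIS job);
# nothing is accumulated from other jobs of the same user.
def wtcontr(submit_epoch, weight_waiting_time=1.0):
    return weight_waiting_time * (time.time() - submit_epoch)

# With weight_waiting_time 1.0 a job pending for ~100 s would show
# wtcontr around 100, the magnitude seen in the qstat -urg output below.
print(wtcontr(time.time() - 100))   # ~100.0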
> >>
> > Both of the above-mentioned settings are set.
> >
> > This is the output (1 node, 4 cores; the same 4-core job was
> > submitted multiple times):
> >
> >
> > # qstat -u "*" -urg
> > job-ID  prior    nurg     urg   rrcontr  wtcontr  dlcontr  name  user   state  submit/start at      deadline  queue            slots  ja-task-ID
> > --------------------------------------------------------------------------------------------------------------------------------------------------
> > 47      0.61000  0.66667  4098  4000     98       0        mpi   user1  r      05/02/2013 13:12:36            [email protected]  4
> > 48      0.60667  1.00000  4113  4000     113      0        mpi   user1  qw     05/02/2013 13:10:43                             4
> > 49      0.55500  0.50000  4112  4000     112      0        mpi   user1  qw     05/02/2013 13:10:44                             4
>
> This is of course strange, as I don't see an entry with nurg 0.00000. In
> my `qstat` I can spot:
>
>  8393  1.50167  1.00000  4063  4000  63  0  test.sh  reuti  qw  05/02/2013 12:53:58      4
>  8409  1.00026  0.50000  4062  4000  62  0  test.sh  reuti  qw  05/02/2013 12:53:59      4
>  8422  0.50016  0.00000  4061  4000  61  0  test.sh  reuti  qw  05/02/2013 12:54:00      4
>
> like it should be. The lowest overall urgency is normalized to zero,
> and the highest to one.
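(So, if I restate Reuti's point as code: nurg looks like a plain min/max normalization of urg over the jobs currently in the list. A sketch using his three test.sh jobs; the handling of the "all urgencies equal" case is purely my assumption:)

# Min/max normalization of the raw urgency: lowest urg -> 0.0, highest -> 1.0.
def normalize(values):
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]   # equal/single jobs: my assumption only
    return [(v - lo) / (hi - lo) for v in values]

urg = [4063, 4062, 4061]               # Reuti's three test.sh jobs
print(normalize(urg))                   # [1.0, 0.5, 0.0], matching his nurg column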
>
> -- Reuti
>
>
> > Here I do understand the values of "urg", but I don't get how "nurg"
> > is calculated.
> >
> > # qstat -u "*" -ext
> > job-ID  prior    ntckts   name  user   project  department  state  cpu         mem      io       tckts  ovrts  otckt  ftckt  stckt  share  queue           slots  ja-task-ID
> > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > 47      0.61000  1.00000  mpi   user1  NA       defaultdep  r      0:00:00:00  0.00371  0.00078  100    0      0      0      100    1.00   [email protected]    4
> > 48      0.60500  0.50000  mpi   user1  NA       defaultdep  qw                                   50     0      0      0      50     0.34                   4
> > 49      0.55333  0.33333  mpi   user1  NA       defaultdep  qw                                   33     0      0      0      33     0.23                   4
> > 50      0.55250  0.25000  mpi   user1  NA       defaultdep  qw                                   25     0      0      0      25     0.17                   4
> > 51      0.55200  0.20000  mpi   user1  NA       defaultdep  qw                                   20     0      0      0      20     0.14                   4
> > 52      0.50167  0.16667  mpi   user1  NA       defaultdep  qw                                   16     0      0      0      16     0.11                   4
> >
> > In this output, why does the value of tckts keep changing: 100, 50,
> > 33, ...? The share tree has 100 shares.
Can someone explain why the share/ticket values are getting reduced?
The halftime is set to 168 (hours), which is 7 days. How is past usage taken
into account here?
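What I can see from the numbers themselves is that tckts follows 100 / k for the k-th pending job of the same user, so it looks as if the user's 100 share-tree tickets are being split across that user's jobs rather than the configured shares being reduced; whether halftime/past usage also plays into these particular values I cannot tell. A quick check of that pattern in Python (this is an observation, not a documented formula):

# Observation only: the tckts column above is 100 // k for the k-th job of
# the same user, i.e. the user's tickets appear to be divided among that
# user's jobs -- an assumption on my side, not taken from the man page.
user_tickets = 100
for k in range(1, 7):
    print(k, user_tickets // k)   # 100, 50, 33, 25, 20, 16 -- as in qstat -ext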
> >> -- Reuti
> >>
> >>
> >>>
> >>>
> >>> Thanks in advance
> >>
> >>
>
>
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users