Re: [PATCH v3 0/5] sync a se with its cfs_rq when att(det)aching it
On Thu, Aug 20, 2015 at 11:11:31PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 20, 2015 at 07:46:09PM +0900, Byungchul Park wrote:
> > On Thu, Aug 20, 2015 at 05:38:41PM +0900, Byungchul Park wrote:
> > > On Thu, Aug 20, 2015 at 03:17:21AM +0200, Peter Zijlstra wrote:
> > > >
> > > > I did something like this on top.. please have a look at the XXX and
> > > > integrate.
> > >
> > > i am not sure what you intend for me to do.
> > >
> > > do you mean that i am supposed to integrate this cleanup patch you gave
> > > me, including the XXX comment?
>
> No, the intent was for you to think about the point marked XXX, which
> you've done below.
>
> > > > + *
> > > > + * XXX this appears wrong!! check history,
> > > > + * we appear to always set queued and RUNNING under the same lock instance
> > > > + * might be from before TASK_WAKING ?
> > > >   */
> > >
> > > is it impossible to hit the check whether vruntime is normalized, when
> > > doing something like e.g. active load balance, where queued !=
> > > TASK_ON_RQ_QUEUED and p->state == TASK_RUNNING?
> >
> > furthermore, in any migration by load balance, it seems to be possible..
> >
> > > i think it can happen..
>
> OK, then we need to change the comment to reflect the actual reason the
> test is needed. Because I think the currently described scenario is
> incorrect.

what is the currently described scenario that needs to change?

then.. did i change the patches as you suggested, in v4?
Re: [PATCH v3 0/5] sync a se with its cfs_rq when att(det)aching it
On Thu, Aug 20, 2015 at 07:46:09PM +0900, Byungchul Park wrote:
> On Thu, Aug 20, 2015 at 05:38:41PM +0900, Byungchul Park wrote:
> > On Thu, Aug 20, 2015 at 03:17:21AM +0200, Peter Zijlstra wrote:
> > >
> > > I did something like this on top.. please have a look at the XXX and
> > > integrate.
> >
> > i am not sure what you intend for me to do.
> >
> > do you mean that i am supposed to integrate this cleanup patch you gave
> > me, including the XXX comment?

No, the intent was for you to think about the point marked XXX, which
you've done below.

> > > + *
> > > + * XXX this appears wrong!! check history,
> > > + * we appear to always set queued and RUNNING under the same lock instance
> > > + * might be from before TASK_WAKING ?
> > >   */
> >
> > is it impossible to hit the check whether vruntime is normalized, when
> > doing something like e.g. active load balance, where queued !=
> > TASK_ON_RQ_QUEUED and p->state == TASK_RUNNING?
>
> furthermore, in any migration by load balance, it seems to be possible..
>
> > i think it can happen..

OK, then we need to change the comment to reflect the actual reason the
test is needed. Because I think the currently described scenario is
incorrect.
Re: [PATCH v3 0/5] sync a se with its cfs_rq when att(det)aching it
On Thu, Aug 20, 2015 at 05:38:41PM +0900, Byungchul Park wrote:
> On Thu, Aug 20, 2015 at 03:17:21AM +0200, Peter Zijlstra wrote:
> >
> > I did something like this on top.. please have a look at the XXX and
> > integrate.
>
> i am not sure what you intend for me to do.
>
> do you mean that i am supposed to integrate this cleanup patch you gave
> me, including the XXX comment?
>
> > + *
> > + * XXX this appears wrong!! check history,
> > + * we appear to always set queued and RUNNING under the same lock instance
> > + * might be from before TASK_WAKING ?
> >   */
>
> is it impossible to hit the check whether vruntime is normalized, when
> doing something like e.g. active load balance, where queued !=
> TASK_ON_RQ_QUEUED and p->state == TASK_RUNNING?

furthermore, in any migration by load balance, it seems to be possible..

> i think it can happen..
>
> thanks,
> byungchul
Re: [PATCH v3 0/5] sync a se with its cfs_rq when att(det)aching it
On Thu, Aug 20, 2015 at 03:17:21AM +0200, Peter Zijlstra wrote:
>
> I did something like this on top.. please have a look at the XXX and
> integrate.

i am not sure what you intend for me to do.

do you mean that i am supposed to integrate this cleanup patch you gave
me, including the XXX comment?

> + *
> + * XXX this appears wrong!! check history,
> + * we appear to always set queued and RUNNING under the same lock instance
> + * might be from before TASK_WAKING ?
>   */

is it impossible to hit the check whether vruntime is normalized, when
doing something like e.g. active load balance, where queued !=
TASK_ON_RQ_QUEUED and p->state == TASK_RUNNING?

i think it can happen..

thanks,
byungchul
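[ the case under discussion can be modeled with a toy userspace program;
this is a minimal sketch only, all names below are made up and decay,
locking and the real data structures are ignored. during a load-balance
migration the task is dequeued without DEQUEUE_SLEEP, so its vruntime is
made relative to the source cfs_rq's min_vruntime even though p->state
is still TASK_RUNNING: ]

#include <stdio.h>

typedef unsigned long long u64;

struct toy_cfs_rq { u64 min_vruntime; };
struct toy_se     { u64 vruntime; };

/* a !DEQUEUE_SLEEP dequeue, as done for a load-balance migration:
 * make vruntime relative to the source queue ("normalized") */
static void toy_dequeue_migrate(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	se->vruntime -= cfs_rq->min_vruntime;
}

/* enqueue on the destination queue: make vruntime absolute again */
static void toy_enqueue(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	se->vruntime += cfs_rq->min_vruntime;
}

int main(void)
{
	struct toy_cfs_rq src = { 1000 }, dst = { 5000 };
	struct toy_se se = { 1200 };

	/* p->state is still TASK_RUNNING here, yet vruntime gets normalized */
	toy_dequeue_migrate(&src, &se);
	printf("normalized: %llu\n", se.vruntime);	/* 200 */

	toy_enqueue(&dst, &se);
	printf("on dst:     %llu\n", se.vruntime);	/* 5200 */
	return 0;
}

[ the point of the sketch is only that normalization follows how the
task was dequeued, not its p->state. ]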
Re: [PATCH v3 0/5] sync a se with its cfs_rq when att(det)aching it
On Thu, Aug 20, 2015 at 03:17:21AM +0200, Peter Zijlstra wrote:
>
> I did something like this on top.. please have a look at the XXX and
> integrate.

yes, i will do it.

> ---
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2664,8 +2664,8 @@ static inline u64 cfs_rq_clock_task(stru
>  /* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */
>  static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
>  {
> -	int decayed;
>  	struct sched_avg *sa = &cfs_rq->avg;
> +	int decayed;
>
>  	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
>  		long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
> @@ -2695,15 +2695,16 @@ static inline int update_cfs_rq_load_avg
>  static inline void update_load_avg(struct sched_entity *se, int update_tg)
>  {
>  	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> -	int cpu = cpu_of(rq_of(cfs_rq));
>  	u64 now = cfs_rq_clock_task(cfs_rq);
> +	int cpu = cpu_of(rq_of(cfs_rq));
>
>  	/*
>  	 * Track task load average for carrying it to new CPU after migrated, and
>  	 * track group sched_entity load average for task_h_load calc in migration
>  	 */
>  	__update_load_avg(now, cpu, &se->avg,
> -		se->on_rq * scale_load_down(se->load.weight), cfs_rq->curr == se, NULL);
> +			  se->on_rq * scale_load_down(se->load.weight),
> +			  cfs_rq->curr == se, NULL);
>
>  	if (update_cfs_rq_load_avg(now, cfs_rq) && update_tg)
>  		update_tg_load_avg(cfs_rq, 0);
> @@ -2718,9 +2719,10 @@ static void attach_entity_load_avg(struc
>  	 * update here. we have to update it with prev cfs_rq just before changing
>  	 * se's cfs_rq, and get here soon.
>  	 */
> -	if (se->avg.last_update_time)
> +	if (se->avg.last_update_time) {
>  		__update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
> -			&se->avg, 0, 0, NULL);
> +				  &se->avg, 0, 0, NULL);
> +	}
>
>  	se->avg.last_update_time = cfs_rq->avg.last_update_time;
>  	cfs_rq->avg.load_avg += se->avg.load_avg;
> @@ -2732,17 +2734,13 @@ static void attach_entity_load_avg(struc
>  static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
>  	__update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
> -		&se->avg, se->on_rq * scale_load_down(se->load.weight),
> -		cfs_rq->curr == se, NULL);
> +			  &se->avg, se->on_rq * scale_load_down(se->load.weight),
> +			  cfs_rq->curr == se, NULL);
>
> -	cfs_rq->avg.load_avg =
> -		max_t(long, cfs_rq->avg.load_avg - se->avg.load_avg, 0);
> -	cfs_rq->avg.load_sum =
> -		max_t(s64, cfs_rq->avg.load_sum - se->avg.load_sum, 0);
> -	cfs_rq->avg.util_avg =
> -		max_t(long, cfs_rq->avg.util_avg - se->avg.util_avg, 0);
> -	cfs_rq->avg.util_sum =
> -		max_t(s32, cfs_rq->avg.util_sum - se->avg.util_sum, 0);
> +	cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg - se->avg.load_avg, 0);
> +	cfs_rq->avg.load_sum = max_t(s64, cfs_rq->avg.load_sum - se->avg.load_sum, 0);
> +	cfs_rq->avg.util_avg = max_t(long, cfs_rq->avg.util_avg - se->avg.util_avg, 0);
> +	cfs_rq->avg.util_sum = max_t(s32, cfs_rq->avg.util_sum - se->avg.util_sum, 0);
>  }
>
>  /* Add the load generated by se into cfs_rq's load average */
> @@ -2751,14 +2749,14 @@ enqueue_entity_load_avg(struct cfs_rq *c
>  {
>  	struct sched_avg *sa = &se->avg;
>  	u64 now = cfs_rq_clock_task(cfs_rq);
> -	int migrated = 0, decayed;
> +	int migrated, decayed;
>
> -	if (sa->last_update_time == 0)
> -		migrated = 1;
> -	else
> +	migrated = !sa->last_update_time;
> +	if (!migrated) {
>  		__update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
>  			se->on_rq * scale_load_down(se->load.weight),
>  			cfs_rq->curr == se, NULL);
> +	}
>
>  	decayed = update_cfs_rq_load_avg(now, cfs_rq);
>
> @@ -2781,7 +2779,7 @@ dequeue_entity_load_avg(struct cfs_rq *c
>  	cfs_rq->runnable_load_avg =
>  		max_t(long, cfs_rq->runnable_load_avg - se->avg.load_avg, 0);
>  	cfs_rq->runnable_load_sum =
> -		max_t(s64, cfs_rq->runnable_load_sum - se->avg.load_sum, 0);
> +		max_t(s64, cfs_rq->runnable_load_sum - se->avg.load_sum, 0);
>  }
>
>  /*
> @@ -2849,6 +2847,11 @@ static inline void
>  dequeue_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
>  static inline void remove_entity_load_avg(struct sched_entity *se) {}
>
> +static inline void
> +attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
> +static inline void
> +detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
> +
>  static inline int idle_balance(struct rq
Re: [PATCH v3 0/5] sync a se with its cfs_rq when att(det)aching it
I did something like this on top.. please have a look at the XXX and
integrate.

---
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2664,8 +2664,8 @@ static inline u64 cfs_rq_clock_task(stru
 /* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */
 static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
 {
-	int decayed;
 	struct sched_avg *sa = &cfs_rq->avg;
+	int decayed;

 	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
 		long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
@@ -2695,15 +2695,16 @@ static inline int update_cfs_rq_load_avg
 static inline void update_load_avg(struct sched_entity *se, int update_tg)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
-	int cpu = cpu_of(rq_of(cfs_rq));
 	u64 now = cfs_rq_clock_task(cfs_rq);
+	int cpu = cpu_of(rq_of(cfs_rq));

 	/*
 	 * Track task load average for carrying it to new CPU after migrated, and
 	 * track group sched_entity load average for task_h_load calc in migration
 	 */
 	__update_load_avg(now, cpu, &se->avg,
-		se->on_rq * scale_load_down(se->load.weight), cfs_rq->curr == se, NULL);
+			  se->on_rq * scale_load_down(se->load.weight),
+			  cfs_rq->curr == se, NULL);

 	if (update_cfs_rq_load_avg(now, cfs_rq) && update_tg)
 		update_tg_load_avg(cfs_rq, 0);
@@ -2718,9 +2719,10 @@ static void attach_entity_load_avg(struc
 	 * update here. we have to update it with prev cfs_rq just before changing
 	 * se's cfs_rq, and get here soon.
 	 */
-	if (se->avg.last_update_time)
+	if (se->avg.last_update_time) {
 		__update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
-			&se->avg, 0, 0, NULL);
+				  &se->avg, 0, 0, NULL);
+	}

 	se->avg.last_update_time = cfs_rq->avg.last_update_time;
 	cfs_rq->avg.load_avg += se->avg.load_avg;
@@ -2732,17 +2734,13 @@ static void attach_entity_load_avg(struc
 static void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	__update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
-		&se->avg, se->on_rq * scale_load_down(se->load.weight),
-		cfs_rq->curr == se, NULL);
+			  &se->avg, se->on_rq * scale_load_down(se->load.weight),
+			  cfs_rq->curr == se, NULL);

-	cfs_rq->avg.load_avg =
-		max_t(long, cfs_rq->avg.load_avg - se->avg.load_avg, 0);
-	cfs_rq->avg.load_sum =
-		max_t(s64, cfs_rq->avg.load_sum - se->avg.load_sum, 0);
-	cfs_rq->avg.util_avg =
-		max_t(long, cfs_rq->avg.util_avg - se->avg.util_avg, 0);
-	cfs_rq->avg.util_sum =
-		max_t(s32, cfs_rq->avg.util_sum - se->avg.util_sum, 0);
+	cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg - se->avg.load_avg, 0);
+	cfs_rq->avg.load_sum = max_t(s64, cfs_rq->avg.load_sum - se->avg.load_sum, 0);
+	cfs_rq->avg.util_avg = max_t(long, cfs_rq->avg.util_avg - se->avg.util_avg, 0);
+	cfs_rq->avg.util_sum = max_t(s32, cfs_rq->avg.util_sum - se->avg.util_sum, 0);
 }

 /* Add the load generated by se into cfs_rq's load average */
@@ -2751,14 +2749,14 @@ enqueue_entity_load_avg(struct cfs_rq *c
 {
 	struct sched_avg *sa = &se->avg;
 	u64 now = cfs_rq_clock_task(cfs_rq);
-	int migrated = 0, decayed;
+	int migrated, decayed;

-	if (sa->last_update_time == 0)
-		migrated = 1;
-	else
+	migrated = !sa->last_update_time;
+	if (!migrated) {
 		__update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
 			se->on_rq * scale_load_down(se->load.weight),
 			cfs_rq->curr == se, NULL);
+	}

 	decayed = update_cfs_rq_load_avg(now, cfs_rq);

@@ -2781,7 +2779,7 @@ dequeue_entity_load_avg(struct cfs_rq *c
 	cfs_rq->runnable_load_avg =
 		max_t(long, cfs_rq->runnable_load_avg - se->avg.load_avg, 0);
 	cfs_rq->runnable_load_sum =
-		max_t(s64, cfs_rq->runnable_load_sum - se->avg.load_sum, 0);
+		max_t(s64, cfs_rq->runnable_load_sum - se->avg.load_sum, 0);
 }

 /*
@@ -2849,6 +2847,11 @@ static inline void
 dequeue_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
 static inline void remove_entity_load_avg(struct sched_entity *se) {}

+static inline void
+attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
+static inline void
+detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
+
 static inline int idle_balance(struct rq *rq)
 {
 	return 0;
@@ -7915,29 +7918,50 @@ prio_changed_fair(struct rq *rq, struct
 	check_preempt_curr(rq, p, 0);
 }

-static void detach_task_cfs_rq(struct task_struct *p, int queued)
+static inline bool
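[ for readers skimming the thread, the attach/detach pairing that the
helpers above implement can be summarized with a toy userspace model; a
sketch only, with hypothetical names, no decay, no util tracking and no
locking. attach folds the entity's average load into its cfs_rq; detach
removes it again, with the subtraction clamped at zero just like the
max_t() calls in the diff: ]

#include <stdio.h>

/* toy model, not kernel code: only the load_avg add/remove pairing */
struct toy_avg    { long load_avg; };
struct toy_cfs_rq { struct toy_avg avg; };
struct toy_se     { struct toy_avg avg; };

/* attach: account the entity's average load in its new cfs_rq */
static void toy_attach(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	cfs_rq->avg.load_avg += se->avg.load_avg;
}

/* detach: remove it again, clamped at zero like max_t(long, ..., 0) */
static void toy_detach(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	long v = cfs_rq->avg.load_avg - se->avg.load_avg;

	cfs_rq->avg.load_avg = v > 0 ? v : 0;
}

int main(void)
{
	struct toy_cfs_rq prev = { { 300 } };	/* includes se's 100 */
	struct toy_cfs_rq next = { { 0 } };
	struct toy_se se = { { 100 } };

	toy_detach(&prev, &se);	/* leaving prev: 300 -> 200 */
	toy_attach(&next, &se);	/* joining next: 0 -> 100 */

	printf("prev=%ld next=%ld\n", prev.avg.load_avg, next.avg.load_avg);
	return 0;
}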
Re: [PATCH v3 0/5] sync a se with its cfs_rq when att(det)aching it
On Wed, Aug 19, 2015 at 03:47:11PM +0900, byungchul.p...@lge.com wrote:
> From: Byungchul Park <byungchul.p...@lge.com>
>
> hello,
>
> there are 3 problems when att(det)aching a se to(from) its cfs_rq.
>
> problem 1. se's average load is not accounted with new cfs_rq in queued case,
> when a task changes its cgroup.
>
> problem 2. cfs_rq->avg.load_avg becomes larger and larger whenever changing
> cgroup to another. we have to sync se's average load with prev cfs_rq.
>
> problem 3. current code does not sync it with its cfs_rq when switching
> sched class to fair class. if we can always assume that a se has been
> detached from fair class for a long time enough for its average load to
> become useless, at the time attaching it to its fair class cfs_rq, then
> current code is acceptable. but the assumption is not always true.
>
> patch 1/5, does code refactoring for further patches.
> patch 2/5, solves the problem 1.
> patch 3/5, solves the problem 2.
> patch 4/5, solves the problem 3.
> patch 5/5, does code refactoring for better readability.
>
> change from v2 to v1 (logic is not changed at all)

change from v2 to v3 (logic is not changed at all)

i am sorry for the typo.

> * fix up my poor english in commit message and comment
> * break down big patches into more patches for being reviewed easily
> * supplement cover letter messages
>
> change from v1 to v2
> * introduce two functions for adjusting vruntime and load when attaching
>   and detaching.
> * call the introduced functions instead of switched_from(to)_fair() directly
>   in task_move_group_fair().
> * add decaying logic for a se which has detached from a cfs_rq.
>
> thanks,
> byungchul
>
> Byungchul Park (5):
>   sched: add two functions adjusting cfs_rq's load when att(det)aching
>     a se
>   sched: make task_move_group_fair adjust cfs_rq's load in case of
>     queued
>   sched: sync a se with prev cfs_rq when changing cgroup
>   sched: sync a se with its cfs_rq when switching sched class to fair
>     class
>   sched: add two functions for att(det)aching a task to(from) a cfs_rq
>
>  kernel/sched/fair.c |  209 +++
>  1 file changed, 110 insertions(+), 99 deletions(-)
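[ problem 2 can be illustrated with a toy userspace demo; a sketch with
hypothetical names, decay omitted. if a cgroup move attaches the se's
load to the new cfs_rq without first syncing and detaching it from the
prev one, the group averages inflate on every move: ]

#include <stdio.h>

/* toy model, not kernel code: a task's load is attached to a new
 * group's average on every cgroup move, but never detached from the
 * previous group's, so the totals grow without bound */
struct toy_cfs_rq { long load_avg; };

int main(void)
{
	struct toy_cfs_rq group[2] = { { 0 }, { 0 } };
	long se_load = 100;
	int cur = 0, i;

	group[cur].load_avg += se_load;		/* initial attach */

	for (i = 0; i < 3; i++) {
		cur = !cur;			/* move to the other cgroup */
		group[cur].load_avg += se_load;	/* buggy: attach only */
	}

	/* one task of load 100, yet the groups carry 400 in total */
	printf("group0=%ld group1=%ld\n",
	       group[0].load_avg, group[1].load_avg);
	return 0;
}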