Paul Jackson wrote:
> Max wrote:
>> Paul, I actually mentioned at the beginning of my email that I did read
>> that thread started by Peter. I did learn quite a bit from it :)
> 
> Ah - sorry - I missed that part.  However, I'm still getting the feeling
> that there were some key points in that thread that we have not managed
> to communicate successfully.
I think you are assuming that I only need to deal with the RT scheduler
and scheduler domains, which is not correct. See below.

>> Sounds like at this point we're in agreement that sched_load_balance is
>> not suitable for what I'd like to achieve.
> 
> I don't think we're in agreement; I think we're in confusion ;)
Yeah. I don't believe I'm the confused side though ;-)

> Yes, sched_load_balance does not *directly* have anything to do with this.
> 
> But indirectly it is a critical element in what I think you'd like to
> achieve.  It affects how the cpuset code sets up sched_domains, and
> if I understand correctly, you require either (1) some sched_domains to
> only contain RT tasks, or (2) some CPUs to be in no sched_domain at all.
> 
> Proper configuration of the cpuset hierarchy, including the setting of
> the per-cpuset sched_load_balance flag, can provide either of these
> sched_domain partitions, as desired.
Again you're assuming that scheduling domain partitioning satisfies my
requirements or addresses my use case. It does not. See below for more
details.
 
>> But how about making cpusets aware of the cpu_isolated_map ?
> 
> No.  That's confusing cpusets and the scheduler again.
> 
> The cpu_isolated_map is a file static variable known only within
> the kernel/sched.c file; this should not change.
I completely disagree. In fact I think all the cpu_xxx_map (online,
present, isolated) variables do not belong in the scheduler code. I'm
thinking of submitting a patch that factors them out into
kernel/cpumask.c; we already have cpumask.h.
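To illustrate the factoring I have in mind, here is a minimal sketch of
what such a kernel/cpumask.c might look like. The file itself and the
export of cpu_isolated_map are hypothetical (today that map is static to
kernel/sched.c):

/*
 * Hypothetical kernel/cpumask.c: one home for the global CPU maps.
 * A sketch of the proposed factoring, not an actual patch.
 */
#include <linux/cpumask.h>
#include <linux/module.h>

cpumask_t cpu_possible_map __read_mostly = CPU_MASK_ALL;
EXPORT_SYMBOL(cpu_possible_map);

cpumask_t cpu_online_map __read_mostly;
EXPORT_SYMBOL(cpu_online_map);

cpumask_t cpu_present_map __read_mostly;
EXPORT_SYMBOL(cpu_present_map);

/* Currently a file-static in kernel/sched.c; moving it here would let
 * other subsystems (IRQ routing, workqueues, ...) honor isolation. */
cpumask_t cpu_isolated_map __read_mostly = CPU_MASK_NONE;
EXPORT_SYMBOL(cpu_isolated_map);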

> Presently, the boot parameter isolcpus= is just used to initialize
> what CPUs are isolated at boot, and then the sched_domain partitioning,
> as done in kernel/sched.c:partition_sched_domains() (the hook into
> the sched code that cpusets uses) determines which CPUs are isolated
> from that point forward.  I doubt that this should change either.
Sure, I did not even touch that part. I just proposed to extend the
meaning of the 'isolated' bit.

> In that thread referenced above, did you see the part where RT is
> achieved not by isolating CPUs from any scheduler, but rather by
> polymorphically having several schedulers available to operate on each
> sched_domain, and having RT threads self-select the RT scheduler?
Absolutely, yes. I saw that part, but it has nothing to do with my use
case.

Looks like I failed to explain what I'm trying to achieve. So let me try again.
I'd like to be able to run a CPU-intensive (100%) RT task on one of the
processors without adversely affecting, or being affected by, the other
system activities. System activities here include _kernel_ activities as
well. Hence the proposal to extend the current CPU isolation feature.

The new definition of CPU isolation would be:
---
1. Isolated CPU(s) must not be subject to scheduler load balancing.
   Users must explicitly bind threads in order to run on those CPU(s).

2. By default, interrupts must not be routed to the isolated CPU(s).
   Users must route interrupts (if any) explicitly.

3. In general, kernel subsystems must avoid activity on the isolated
   CPU(s) as much as possible; this includes workqueues, per-CPU
   threads, etc. This feature is configurable and is disabled by
   default. (A rough sketch of what #2 and #3 mean in code follows
   this list.)
---
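To make #2 and #3 concrete, here is a rough sketch of the kind of checks
kernel code could do once the map is visible. The cpu_isolated()
predicate and the start_thread_on() helper are made-up names for
illustration, not an actual patch:

#include <linux/cpumask.h>

/* A cpu_isolated() predicate in the style of cpu_online(); assumes
 * cpu_isolated_map has been made visible outside kernel/sched.c. */
extern cpumask_t cpu_isolated_map;
#define cpu_isolated(cpu)	cpu_isset((cpu), cpu_isolated_map)

/* #2: a default IRQ affinity that excludes the isolated CPU(s). */
static cpumask_t default_irq_affinity(void)
{
	cpumask_t mask;

	cpus_andnot(mask, cpu_online_map, cpu_isolated_map);
	return mask;
}

/* #3: a subsystem spawning per-CPU threads skips the isolated CPU(s). */
static void start_percpu_threads(void)
{
	int cpu;

	for_each_online_cpu(cpu) {
		if (cpu_isolated(cpu))
			continue;
		start_thread_on(cpu);	/* hypothetical helper */
	}
}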

#1 affects the scheduler and scheduler domains. It's already supported,
either by using the isolcpus= boot option or by setting
"sched_load_balance" in cpusets. I'm totally happy with the current
behavior, and my original patch did not mess with this functionality in
any way.
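Just so we're clear on what "explicitly bind" in #1 means in practice,
here is a minimal userspace sketch, assuming the box was booted with,
say, isolcpus=3 (the CPU number and RT priority are arbitrary):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	cpu_set_t set;
	struct sched_param sp = { .sched_priority = 50 };

	CPU_ZERO(&set);
	CPU_SET(3, &set);	/* the isolated CPU from isolcpus=3 */
	if (sched_setaffinity(0, sizeof(set), &set) < 0) {
		perror("sched_setaffinity");
		return 1;
	}
	if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
		perror("sched_setscheduler");
		return 1;
	}
	for (;;)
		;	/* the 100% CPU-bound RT loop */
}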

#2 and #3 have _nothing_ to do with the scheduler or scheduler domains.
I've been trying to explain that for a few days now ;-). When you saw my
patches for #2 and #3 you told me that you'd be interested to see them
implemented on top of the "sched_load_balance" flag. Here is your
original reply:
        http://marc.info/?l=linux-kernel&m=120153260217699&w=2

So I looked into that and provided an explanation of why it would not
work, or would work but would add lots of complexity (access to internal
cpuset structures, locking, etc.). My email on that is here:
        http://marc.info/?l=linux-kernel&m=120180692331461&w=2

Now, I felt from the beginning that cpusets is not the right mechanism
to address #2 and #3. The best mechanism IMO is to simply provide access
to the cpu_isolated_map to the rest of the kernel. Again, the fact that
cpu_isolated_map currently lives in the scheduler code does not change
anything here, because as I explained I'm proposing to extend the
meaning of "CPU isolation". I provided dynamic access to the "isolated"
bit only for convenience; it does _not_ change the existing
scheduler/sched domain/cpuset logic in any way.
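To show what I mean by "dynamic access", something along these lines,
i.e. a read-only per-CPU attribute in the style of the existing "online"
file. Names and details are illustrative only; this is not the actual
patch:

#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/sysdev.h>

extern cpumask_t cpu_isolated_map;

/* Would show up as /sys/devices/system/cpu/cpuN/isolated once
 * registered against each CPU's sysdev. */
static ssize_t show_isolated(struct sys_device *dev, char *buf)
{
	return sprintf(buf, "%d\n", cpu_isset(dev->id, cpu_isolated_map));
}
static SYSDEV_ATTR(isolated, 0444, show_isolated, NULL);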

Hopefully we're on the same page with regard to "CPU isolation" now.
If not, please let me know what I missed from the earlier discussions or
other scheduler-related threads.

---

If you think that making cpusets aware of isolated CPUs is not the right
thing to do, that's perfectly fine by me. I think it'd be better if they
were, but we can keep things the way they are right now.

Max