Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-26 Thread Mike Galbraith
On Tue, 2015-05-26 at 15:51 -0400, Chris Metcalf wrote:

> On balance I suspect it's still better to make command line arguments
> handle the common cases most succinctly.

I prefer user specifies precisely, but yeah, that entails more typing.  

Idle curiosity: can SGI monster from hell boot a NO_HZ_FULL_ALL kernel,
w/wo it implying isolcpus?  Readers having same and a reactor to power
it in their basement, please test.

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-26 Thread Chris Metcalf

Thanks for the clarification, and sorry for the slow reply; I had a busy
week of meetings last week, and then the long weekend in the U.S.

On 05/15/2015 02:44 PM, Mike Galbraith wrote:

> Just because the nohz_full feature itself is currently static is no
> reason to put users thereof in a straight jacket by mandating that any
> set they define irrevocably disappears from the generic resource pool.
> Those CPUS are useful until the moment someone cripples them, which
> making nohz_full imply isolcpus does if isolcpus then also becomes
> immutable, which Rik's patch does.  Making nohz_full imply isolcpus
> sounds perfectly fine until someone comes along and makes isolcpus
> immutable (Rik's patch), at which point the user loses a choice due to
> two people making it imply things that _alone_ sound perfectly fine.
>
> See what I'm saying now?


That does make sense; my argument was that 99% of the time when
someone specifies nohz_full they also need isolcpus.  You're right
that someone playing with nohz_full would be unpleasantly surprised.
And of course having more flexibility always feels like a plus.
On balance I suspect it's still better to make command line arguments
handle the common cases most succinctly.

Hopefully we'll get to a point where all of this is dynamic and how
we play with the boot arguments no longer matters.  If not, perhaps
we revisit this and make a cpu_isolation=1-15 type command line
argument that enables isolcpus and nohz_full both.
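
To make the combined-argument idea concrete, here is a minimal Python sketch of how such an argument could expand into the two existing ones. The `cpu_isolation=` name is only the hypothetical spelling floated in this message, not an existing kernel option, and the helpers `parse_cpu_list` and `expand_cpu_isolation` are invented for illustration. The parser handles plain `N` and `N-M` entries only, not every form the kernel accepts.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "1-3,8" into a sorted list of CPU numbers.

    Only plain "N" and "N-M" entries are handled in this sketch.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo_text, hi_text = part.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
            if lo > hi:
                raise ValueError("bad CPU range: " + part)
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def expand_cpu_isolation(cmdline):
    """Rewrite a hypothetical cpu_isolation=LIST into isolcpus=LIST nohz_full=LIST.

    All other arguments are passed through unchanged.
    """
    out = []
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if key == "cpu_isolation" and sep:
            parse_cpu_list(value)  # validate the list; raises ValueError if malformed
            out.append("isolcpus=" + value)
            out.append("nohz_full=" + value)
        else:
            out.append(arg)
    return " ".join(out)


if __name__ == "__main__":
    print(expand_cpu_isolation("quiet cpu_isolation=1-15"))
```

A user who wants only one of the two behaviours would keep writing `isolcpus=` or `nohz_full=` directly, which is the flexibility argued for elsewhere in this thread.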


>>> Thomas has nuked the hrtimer softirq.
>>
>> Yes, this I didn't know.  So I will drop my "no ksoftirqd" patch and
>> we will see if ksoftirqs emerge as an issue for my "cpu isolation"
>> stuff in the future; it may be that that was the only issue.
>>
>>> Inlining softirqs may save a context switch, but adds cycles that we may
>>> consume at higher frequency than the thing we're avoiding.
>>
>> Yes but consuming cycles is not nearly as much of a concern
>> as avoiding interrupts or scheduling, certainly for the case of
>> userspace drivers that I described above.
>
> If you're raising softirqs in an SMP kernel, you're also doing something
> that puts you at very serious risk of meeting the jitter monster, locks,
> and worse, sleeping locks, no?


The softirqs were being raised by third parties for hrtimer, not by
the application code itself, if I remember correctly.  In any case
this appears not to be an issue for nohz_full any more now.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-15 Thread Mike Galbraith
On Fri, 2015-05-15 at 11:05 -0400, Chris Metcalf wrote:
> On 05/11/2015 09:47 PM, Mike Galbraith wrote:
> > On Mon, 2015-05-11 at 15:25 -0400, Chris Metcalf wrote:
> >> On 05/11/2015 03:19 PM, Mike Galbraith wrote:
> >>> I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
> >>> that old static isolcpus was _supposed_ to crawl off and die, I know
> >>> beyond doubt that having isolated a cpu as well as you can definitely
> >>> does NOT imply that said cpu should become tickless.
> >> True, at a high level, I agree that it would be better to have a
> >> top-level concept like Frederic's proposed ISOLATION that includes
> >> isolcpus and nohz_cpu (and other stuff as needed).
> >>
> >> That said, what you wrote above is wrong; even with the patch you
> >> acked, setting isolcpus does not automatically turn on nohz_full for
> >> a given cpu.  The patch made it true the other way around: when
> >> you say nohz_full, you automatically get isolcpus on that cpu too.
> >> That does, at least, make sense for the semantics of nohz_full.
> > I didn't write that, I wrote nohz_full implies (spelled '->') isolcpus.
> > Yes, with nohz_full currently being static, the old allegedly dying but
> > also static isolcpus scheduler off switch is a convenient thing to wire
> > the nohz_full CPU SET (<- hint;) property to.
> 
> Yes, I was responding to the bit where you said "having isolated a
> cpu as well as you can does NOT imply it should become tickless",
> but indeed, the "nohz_full -> isolcpus" patch didn't make that true.
> In any case sounds like we were just talking past each other.

Yup.

> > BTW, another facet of this: Rik wants to make isolcpus immune to
> > cpusets, which makes some sense, user did say isolcpus=, but that also
> > makes isolcpus truly static.  If the user now says nohz_full=, they lose
> > the ability to deactivate CPU isolation, making the set fairly useless
> > for anything other than HPC.  Currently, the user can flip the isolation
> > switch as he sees fit.  He takes a size extra large performance hit for
> > having said nohz_full=, but he doesn't lose generic utility.
> 
> I don't think I follow this completely.  If the user says nohz_full=, he
> probably doesn't care about deactivating isolcpus later, since that
> defeats the entire purpose of the nohz_full= in the first place,
> as far as I can tell.  And when you say "anything other than HPC",
> I'm not sure what you mean; as far as I know high-performance
> computing only cares because it wants that extra 0.5% of the
> cpu or whatever interrupts eat up, but just as a nice-to-have.
> The real use case is high-performance userspace drivers where
> the nohz_full cores are responding to real-time things like packet
> arrivals with almost no latency to spare.

Ok, verbosity on.

Currently, nohz_full is static, meaning in a dynamic environment, where
the user may not have a constant need for it, if you make it imply
isolcpus, then make isolcpus immutable, you have just needlessly taken
an option from the user.  Those CPUS are no longer part of his generic
resource pool, and he has nothing to say about it.

> What is the generic utility you're envisioning for nohz_full cores
> that have turned off scheduler isolation?  I assume it's some
> workload where you'd prefer not to have too many interrupts
> but still are running multiple tasks, but in that case does it really
> make much difference in practice?

Again, I think we're talking past one another.

I'm saying there is no need to mandate, nothing more.  For your needs,
my needs whatever, that immutable may sound good, but in fact, it
removes flexibility, and for no good reason.

This shows immediately in simple testing.  Do I need nohz_full?  Hell
no, only for testing.  If I want to test, I obviously need it for a
while, and yes, I can reboot... but what's the difference between me the
silly tester who needs it only to see if it works at all, and how well,
and some guy who does something critical once in a while, or a company
with a pool of big boxen that they reconfigure on the fly to meet
whatever dynamic needs?

Just because the nohz_full feature itself is currently static is no
reason to put users thereof in a straight jacket by mandating that any
set they define irrevocably disappears from the generic resource pool.
Those CPUS are useful until the moment someone cripples them, which
making nohz_full imply isolcpus does if isolcpus then also becomes
immutable, which Rik's patch does.  Making nohz_full imply isolcpus
sounds perfectly fine until someone comes along and makes isolcpus
immutable (Rik's patch), at which point the user loses a choice due to
two people making it imply things that _alone_ sound perfectly fine.

See what I'm saying now?
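
The interaction described above can be written down as a toy model. This is a minimal Python sketch under stated assumptions, not kernel code: `schedulable_pool`, its two boolean policy flags, and the `reclaim` set (CPUs the admin tries to hand back to the general pool at runtime) are all hypothetical names invented for the example.

```python
def schedulable_pool(online, isolcpus, nohz_full,
                     nohz_implies_isolcpus, isolcpus_immutable,
                     reclaim=frozenset()):
    """Return the sorted CPUs left in the generic resource pool.

    nohz_implies_isolcpus: model of "nohz_full= also isolates those CPUs".
    isolcpus_immutable:    model of "isolated CPUs can never be handed back".
    reclaim:               CPUs the admin tries to un-isolate at runtime.
    """
    isolated = set(isolcpus)
    if nohz_implies_isolcpus:
        isolated |= set(nohz_full)
    if not isolcpus_immutable:
        # Isolation is still a switch the user can flip back.
        isolated -= set(reclaim)
    return sorted(set(online) - isolated)


if __name__ == "__main__":
    online = range(8)
    nohz = {4, 5, 6, 7}
    for implies in (False, True):
        for immutable in (False, True):
            pool = schedulable_pool(online, set(), nohz, implies, immutable,
                                    reclaim=nohz)
            print(implies, immutable, pool)
```

In this model either policy on its own leaves all eight CPUs reachable; only the combination of both shrinks the pool for good, which is the point being made about two changes that each sound fine alone.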

> > Thomas has nuked the hrtimer softirq.
> 
> Yes, this I didn't know.  So I will drop my "no ksoftirqd" patch and
> we will see if ksoftirqs emerge as an issue for my "cpu isolation"
> stuff in the future; it may be that that was the 

Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-15 Thread Chris Metcalf

On 05/12/2015 06:46 AM, Peter Zijlstra wrote:
> On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
>> Please lets get NO_HZ_FULL up to par. That should be the main focus.
>
> ACK, much of this dataplane stuff is (useful) hacks working around the
> fact that nohz_full just isn't complete.


There are enough disjoint threads on this topic that I want
to just touch base here and see if you have been convinced
on other threads that there is stuff beyond the hacks here:
in particular

1. The basic "dataplane" mode to arrange to do extra work on
return to kernel space that normally isn't warranted, to avoid
future IPIs, and additionally to wait in the kernel until any timer
interrupts required by the kernel invocation itself are done; and

2. The "strict" mode to allow a task to tell the kernel it isn't
planning on making any more such calls, and have the kernel
help diagnose any resulting application bugs.
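
The two modes listed above can be sketched as a toy state model. This is a minimal Python illustration only: the `Task` class, its `quiesce` and `strict` fields, and `kernel_entry` are hypothetical names, and killing the task on entry follows the "strict" behaviour as described elsewhere in this thread rather than any actual kernel interface.

```python
from dataclasses import dataclass


@dataclass
class Task:
    quiesce: bool = False  # "dataplane" bit: wait in the kernel for pending timers
    strict: bool = False   # "strict" bit: task promised not to enter the kernel
    alive: bool = True


def kernel_entry(task, timers_armed=0):
    """Model one kernel entry (syscall, page fault, ...).

    Returns the number of timer interrupts that would still fire after the
    task is back in userspace.
    """
    if not task.alive:
        raise RuntimeError("task already killed")
    if task.strict:
        # The task said it would stay in userspace; flag the bug loudly.
        task.alive = False
        return 0
    if task.quiesce:
        # Extra work on the way out: wait until the armed timers are done.
        return 0
    return timers_armed


if __name__ == "__main__":
    print(kernel_entry(Task(), timers_armed=2))
    print(kernel_entry(Task(quiesce=True), timers_armed=2))
```

The trade-off in the model mirrors the one in the list: with the quiesce bit set the kernel entry itself becomes slower, but nothing is left to interrupt the userspace loop later.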

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-15 Thread Chris Metcalf

On 05/11/2015 09:47 PM, Mike Galbraith wrote:
> On Mon, 2015-05-11 at 15:25 -0400, Chris Metcalf wrote:
>> On 05/11/2015 03:19 PM, Mike Galbraith wrote:
>>> I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
>>> that old static isolcpus was _supposed_ to crawl off and die, I know
>>> beyond doubt that having isolated a cpu as well as you can definitely
>>> does NOT imply that said cpu should become tickless.
>> True, at a high level, I agree that it would be better to have a
>> top-level concept like Frederic's proposed ISOLATION that includes
>> isolcpus and nohz_cpu (and other stuff as needed).
>>
>> That said, what you wrote above is wrong; even with the patch you
>> acked, setting isolcpus does not automatically turn on nohz_full for
>> a given cpu.  The patch made it true the other way around: when
>> you say nohz_full, you automatically get isolcpus on that cpu too.
>> That does, at least, make sense for the semantics of nohz_full.
> I didn't write that, I wrote nohz_full implies (spelled '->') isolcpus.
> Yes, with nohz_full currently being static, the old allegedly dying but
> also static isolcpus scheduler off switch is a convenient thing to wire
> the nohz_full CPU SET (<- hint;) property to.


Yes, I was responding to the bit where you said "having isolated a
cpu as well as you can does NOT imply it should become tickless",
but indeed, the "nohz_full -> isolcpus" patch didn't make that true.
In any case sounds like we were just talking past each other.


> BTW, another facet of this: Rik wants to make isolcpus immune to
> cpusets, which makes some sense, user did say isolcpus=, but that also
> makes isolcpus truly static.  If the user now says nohz_full=, they lose
> the ability to deactivate CPU isolation, making the set fairly useless
> for anything other than HPC.  Currently, the user can flip the isolation
> switch as he sees fit.  He takes a size extra large performance hit for
> having said nohz_full=, but he doesn't lose generic utility.


I don't think I follow this completely.  If the user says nohz_full=, he
probably doesn't care about deactivating isolcpus later, since that
defeats the entire purpose of the nohz_full= in the first place,
as far as I can tell.  And when you say "anything other than HPC",
I'm not sure what you mean; as far as I know high-performance
computing only cares because it wants that extra 0.5% of the
cpu or whatever interrupts eat up, but just as a nice-to-have.
The real use case is high-performance userspace drivers where
the nohz_full cores are responding to real-time things like packet
arrivals with almost no latency to spare.

What is the generic utility you're envisioning for nohz_full cores
that have turned off scheduler isolation?  I assume it's some
workload where you'd prefer not to have too many interrupts
but still are running multiple tasks, but in that case does it really
make much difference in practice?


> Thomas has nuked the hrtimer softirq.

Yes, this I didn't know.  So I will drop my "no ksoftirqd" patch and
we will see if ksoftirqs emerge as an issue for my "cpu isolation"
stuff in the future; it may be that that was the only issue.

> Inlining softirqs may save a context switch, but adds cycles that we may
> consume at higher frequency than the thing we're avoiding.


Yes but consuming cycles is not nearly as much of a concern
as avoiding interrupts or scheduling, certainly for the case of
userspace drivers that I described above.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-12 Thread Paul E. McKenney
On Mon, May 11, 2015 at 03:52:37PM -0400, Chris Metcalf wrote:
> On 05/09/2015 03:19 AM, Andy Lutomirski wrote:
> >Naming aside, I don't think this should be a per-task flag at all.  We
> >already have way too much overhead per syscall in nohz mode, and it
> >would be nice to get the per-syscall overhead as low as possible.  We
> >should strive, for all tasks, to keep syscall overhead down *and*
> >avoid as many interrupts as possible.
> >
> >That being said, I do see a legitimate use for a way to tell the
> >kernel "I'm going to run in userspace for a long time; stay away".
> >But shouldn't that be a single operation, not an ongoing flag?  IOW, I
> >think that we should have a new syscall quiesce() or something rather
> >than a prctl.
> 
> Yes, if all you are concerned about is quiescing the tick, we could
> probably do it as a new syscall.
> 
> I do note that you'd want to try to actually do the quiesce as late as
> possible - in particular, if you just did it in the usual syscall, you
> might miss out on a timer that is set by softirq, or even something
> that happened when you called schedule() on the syscall exit path.
> Doing it as late as we are doing helps to ensure that that doesn't
> happen.  We could still arrange for this semantics by having a new
> quiesce() syscall set a temporary task bit that was cleared on
> return to userspace, but as you pointed out in a different email,
> that gets tricky if you end up doing multiple user_exit() calls on
> your way back to userspace.
> 
> More to the point, I think it's actually important to know when an
> application believes it's in userspace-only mode as an actual state
> bit, rather than just during its transitional moment.  If an
> application calls the kernel at an unexpected time (third-party code
> is the usual culprit for our customers, whether it's syscalls, page
> faults, or other things) we would prefer to have the "quiesce"
> semantics stay in force and cause the third-party code to be
> visibly very slow, rather than cause a totally unexpected and
> hard-to-diagnose interrupt show up later as we are still going
> around the loop that we thought was safely userspace-only.
> 
> And, for debugging the kernel, it's crazy helpful to have that state
> bit in place: see patch 6/6 in the series for how we can diagnose
> things like "a different core just queued an IPI that will hit a
> dataplane core unexpectedly".  Having that state bit makes this sort
> of thing a trivial check in the kernel and relatively easy to debug.

I agree with this!  It is currently a bit painful to debug problems
that might result in multiple tasks runnable on a given CPU.  If you
suspect a problem, you enable tracing and re-run.  Not particularly
friendly for chasing down intermittent problems, so some sort of
improvement would be a very good thing.

Thanx, Paul

> Finally, I proposed a "strict" mode in patch 5/6 where we kill the
> process if it voluntarily enters the kernel by mistake after saying it
> wasn't going to any more.  To do this requires a state bit, so
> carrying another state bit for "quiesce on user entry" seems pretty
> reasonable.
> 
> -- 
> Chris Metcalf, EZChip Semiconductor
> http://www.ezchip.com
> 



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-12 Thread Peter Zijlstra
On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> 
> Please lets get NO_HZ_FULL up to par. That should be the main focus.
> 

ACK, much of this dataplane stuff is (useful) hacks working around the
fact that nohz_full just isn't complete.




Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Mike Galbraith
On Tue, 2015-05-12 at 03:47 +0200, Mike Galbraith wrote:
> On Mon, 2015-05-11 at 15:25 -0400, Chris Metcalf wrote:
> > On 05/11/2015 03:19 PM, Mike Galbraith wrote:
> > > I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
> > > that old static isolcpus was _supposed_ to crawl off and die, I know
> > > beyond doubt that having isolated a cpu as well as you can definitely
> > > does NOT imply that said cpu should become tickless.
> > 
> > True, at a high level, I agree that it would be better to have a
> > top-level concept like Frederic's proposed ISOLATION that includes
> > isolcpus and nohz_cpu (and other stuff as needed).
> > 
> > That said, what you wrote above is wrong; even with the patch you
> > acked, setting isolcpus does not automatically turn on nohz_full for
> > a given cpu.  The patch made it true the other way around: when
> > you say nohz_full, you automatically get isolcpus on that cpu too.
> > That does, at least, make sense for the semantics of nohz_full.
> 
> I didn't write that, I wrote nohz_full implies (spelled '->') isolcpus.
> Yes, with nohz_full currently being static, the old allegedly dying but
> also static isolcpus scheduler off switch is a convenient thing to wire
> the nohz_full CPU SET (<- hint;) property to.

BTW, another facet of this: Rik wants to make isolcpus immune to
cpusets, which makes some sense, user did say isolcpus=, but that also
makes isolcpus truly static.  If the user now says nohz_full=, they lose
the ability to deactivate CPU isolation, making the set fairly useless
for anything other than HPC.  Currently, the user can flip the isolation
switch as he sees fit.  He takes a size extra large performance hit for
having said nohz_full=, but he doesn't lose generic utility.

-Mike
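
To make the relationship between the two boot parameters concrete, here is a minimal sketch (not from the thread) that extracts the nohz_full= and isolcpus= CPU sets from a kernel command line and reports which nohz_full CPUs are not also listed in isolcpus.  The helper names parse_cpu_list and boot_cpu_sets are hypothetical, and the parser assumes only plain numbers and a-b ranges rather than everything the kernel may accept.

```python
def parse_cpu_list(text):
    """Parse a simple CPU list such as "1-3,8" into a set of ints.

    Assumption: only single numbers and inclusive "a-b" ranges are
    handled; other forms the kernel may accept are out of scope here.
    """
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def boot_cpu_sets(cmdline):
    """Return (nohz_full, isolcpus) CPU sets found on a command line."""
    sets = {"nohz_full": set(), "isolcpus": set()}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in sets:
            sets[key] = parse_cpu_list(value)
    return sets["nohz_full"], sets["isolcpus"]


if __name__ == "__main__":
    # Illustrative command line; on a live system read /proc/cmdline instead.
    nohz, isol = boot_cpu_sets("ro quiet nohz_full=1-3 isolcpus=2-3")
    print("nohz_full but not isolcpus:", sorted(nohz - isol))
```

With the patch described in this thread, naming a CPU in nohz_full= would isolate it as well, so the reported difference would no longer be something the user chooses; that coupling is what the message above objects to.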



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Mike Galbraith
On Mon, 2015-05-11 at 15:25 -0400, Chris Metcalf wrote:
> On 05/11/2015 03:19 PM, Mike Galbraith wrote:
> > I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
> > that old static isolcpus was _supposed_ to crawl off and die, I know
> > beyond doubt that having isolated a cpu as well as you can definitely
> > does NOT imply that said cpu should become tickless.
> 
> True, at a high level, I agree that it would be better to have a
> top-level concept like Frederic's proposed ISOLATION that includes
> isolcpus and nohz_cpu (and other stuff as needed).
> 
> That said, what you wrote above is wrong; even with the patch you
> acked, setting isolcpus does not automatically turn on nohz_full for
> a given cpu.  The patch made it true the other way around: when
> you say nohz_full, you automatically get isolcpus on that cpu too.
> That does, at least, make sense for the semantics of nohz_full.

I didn't write that, I wrote nohz_full implies (spelled '->') isolcpus.
Yes, with nohz_full currently being static, the old allegedly dying but
also static isolcpus scheduler off switch is a convenient thing to wire
the nohz_full CPU SET (<- hint;) property to.

-Mike




Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Andy Lutomirski
On May 12, 2015 4:54 AM, "Chris Metcalf" wrote:
>
> (Oops, resending and forcing html off.)
>
>
> On 05/09/2015 03:19 AM, Andy Lutomirski wrote:
>>
>> Naming aside, I don't think this should be a per-task flag at all.  We
>> already have way too much overhead per syscall in nohz mode, and it
>> would be nice to get the per-syscall overhead as low as possible.  We
>> should strive, for all tasks, to keep syscall overhead down *and*
>> avoid as many interrupts as possible.
>>
>> That being said, I do see a legitimate use for a way to tell the
>> kernel "I'm going to run in userspace for a long time; stay away".
>> But shouldn't that be a single operation, not an ongoing flag?  IOW, I
>> think that we should have a new syscall quiesce() or something rather
>> than a prctl.
>
>
> Yes, if all you are concerned about is quiescing the tick, we could
> probably do it as a new syscall.
>
> I do note that you'd want to try to actually do the quiesce as late as
> possible - in particular, if you just did it in the usual syscall, you
> might miss out on a timer that is set by softirq, or even something
> that happened when you called schedule() on the syscall exit path.
> Doing it as late as we are doing helps to ensure that that doesn't
> happen.  We could still arrange for this semantics by having a new
> quiesce() syscall set a temporary task bit that was cleared on
> return to userspace, but as you pointed out in a different email,
> that gets tricky if you end up doing multiple user_exit() calls on
> your way back to userspace.

We should fix that, then.  A quiesce() syscall can certainly arrange
to clean up on final exit.

>
> More to the point, I think it's actually important to know when an
> application believes it's in userspace-only mode as an actual state
> bit, rather than just during its transitional moment.

We can do that, too, with a new flag that's cleared on the next entry.

>  If an
> application calls the kernel at an unexpected time (third-party code
> is the usual culprit for our customers, whether it's syscalls, page
> faults, or other things) we would prefer to have the "quiesce"
> semantics stay in force and cause the third-party code to be
> visibly very slow, rather than cause a totally unexpected and
> hard-to-diagnose interrupt show up later as we are still going
> around the loop that we thought was safely userspace-only.

I'm not really convinced that we should design this feature around
ease of debugging userspace screwups.  There are already plenty of
ways to do that part.  Userspace getting an interrupt because
userspace accidentally did a syscall is very different from userspace
getting interrupted due to an IPI.

>
> And, for debugging the kernel, it's crazy helpful to have that state
> bit in place: see patch 6/6 in the series for how we can diagnose
> things like "a different core just queued an IPI that will hit a
> dataplane core unexpectedly".  Having that state bit makes this sort
> of thing a trivial check in the kernel and relatively easy to debug.

As above, this can be done with a one-time operation, too.

>
> Finally, I proposed a "strict" mode in patch 5/6 where we kill the
> process if it voluntarily enters the kernel by mistake after saying it
> wasn't going to any more.  To do this requires a state bit, so
> carrying another state bit for "quiesce on user entry" seems pretty
> reasonable.

I still dislike that in the form you chose.  It's too deadly to be
useful for anyone but the hardest RT users.

I think I'd be okay with variants, though: let a suitably privileged
process ask for a signal on inadvertent kernel entry or rig up an fd
to be notified when one of these bad entries happens.  Queueing
something to a pollable fd would work, too.

See that thread for more comments.

--Andy
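
The two flag semantics argued over in this exchange can be compared with a toy model.  This is a sketch under stated assumptions, not kernel code: Mode, run and the event strings are hypothetical names, and "entries" counts unexpected kernel entries made after the task has asked to be left alone.

```python
from enum import Enum


class Mode(Enum):
    PERSISTENT = "persistent"  # state bit stays set across kernel entries
    ONE_SHOT = "one_shot"      # flag is cleared on the next kernel entry


def run(mode, strict, entries):
    """Return one event per unexpected kernel entry.

    "quiesce": the return to userspace waits until the CPU is quiet again.
    "plain":   an ordinary return with no quiesce step.
    "kill":    strict mode ended the task on the offending entry.
    """
    flag = True  # the task has already requested userspace-only mode
    events = []
    for _ in range(entries):
        if flag and strict:
            events.append("kill")
            return events
        if mode is Mode.ONE_SHOT:
            flag = False  # cleared on entry, so this exit is not quiesced
        events.append("quiesce" if flag else "plain")
    return events


if __name__ == "__main__":
    for mode in Mode:
        print(mode.value, run(mode, strict=False, entries=3))
```

In this model the persistent bit makes every stray entry pay the quiesce cost on the way out, while the one-shot flag lets stray entries return normally; a signal or fd notification, as suggested above, would replace the "kill" event rather than change the flag handling.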


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Chris Metcalf

(Oops, resending and forcing html off.)

On 05/09/2015 03:19 AM, Andy Lutomirski wrote:

> Naming aside, I don't think this should be a per-task flag at all.  We
> already have way too much overhead per syscall in nohz mode, and it
> would be nice to get the per-syscall overhead as low as possible.  We
> should strive, for all tasks, to keep syscall overhead down *and*
> avoid as many interrupts as possible.
>
> That being said, I do see a legitimate use for a way to tell the
> kernel "I'm going to run in userspace for a long time; stay away".
> But shouldn't that be a single operation, not an ongoing flag?  IOW, I
> think that we should have a new syscall quiesce() or something rather
> than a prctl.


Yes, if all you are concerned about is quiescing the tick, we could
probably do it as a new syscall.

I do note that you'd want to try to actually do the quiesce as late as
possible - in particular, if you just did it in the usual syscall, you
might miss out on a timer that is set by softirq, or even something
that happened when you called schedule() on the syscall exit path.
Doing it as late as we are doing helps to ensure that that doesn't
happen.  We could still arrange for this semantics by having a new
quiesce() syscall set a temporary task bit that was cleared on
return to userspace, but as you pointed out in a different email,
that gets tricky if you end up doing multiple user_exit() calls on
your way back to userspace.

More to the point, I think it's actually important to know when an
application believes it's in userspace-only mode as an actual state
bit, rather than just during its transitional moment.  If an
application calls the kernel at an unexpected time (third-party code
is the usual culprit for our customers, whether it's syscalls, page
faults, or other things) we would prefer to have the "quiesce"
semantics stay in force and cause the third-party code to be
visibly very slow, rather than cause a totally unexpected and
hard-to-diagnose interrupt show up later as we are still going
around the loop that we thought was safely userspace-only.

And, for debugging the kernel, it's crazy helpful to have that state
bit in place: see patch 6/6 in the series for how we can diagnose
things like "a different core just queued an IPI that will hit a
dataplane core unexpectedly".  Having that state bit makes this sort
of thing a trivial check in the kernel and relatively easy to debug.

Finally, I proposed a "strict" mode in patch 5/6 where we kill the
process if it voluntarily enters the kernel by mistake after saying it
wasn't going to any more.  To do this requires a state bit, so
carrying another state bit for "quiesce on user entry" seems pretty
reasonable.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Chris Metcalf

On 05/11/2015 03:19 PM, Mike Galbraith wrote:

> I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
> that old static isolcpus was _supposed_ to crawl off and die, I know
> beyond doubt that having isolated a cpu as well as you can definitely
> does NOT imply that said cpu should become tickless.


True, at a high level, I agree that it would be better to have a
top-level concept like Frederic's proposed ISOLATION that includes
isolcpus and nohz_cpu (and other stuff as needed).

That said, what you wrote above is wrong; even with the patch you
acked, setting isolcpus does not automatically turn on nohz_full for
a given cpu.  The patch made it true the other way around: when
you say nohz_full, you automatically get isolcpus on that cpu too.
That does, at least, make sense for the semantics of nohz_full.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Mike Galbraith
On Mon, 2015-05-11 at 17:36 +0200, Frederic Weisbecker wrote:

> I expect some Real Time users may want this kind of dataplane mode where a
> syscall or whatever sleeps until the system is ready to provide the guarantee
> that no disturbance is going to happen for a given time. I'm not sure HPC
> users are interested in that.

I bet they are.  RT is just a different way to spell HPC, and vice versa.

> In fact it goes along the fact that NO_HZ_FULL was really only supposed to be
> about the tick and now people are introducing more and more kernel default
> presetting that assume NO_HZ_FULL implies ISOLATION which is about all kind of
> noise (tick, tasks, irqs, ...). Which is true but what kind of ISOLATION?

True, nohz mode and various isolation measures are distinct properties.
NO_HZ_FULL is kinda pointless without isolation measures to go with it,
but you're right.

I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
that old static isolcpus was _supposed_ to crawl off and die, I know
beyond doubt that having isolated a cpu as well as you can definitely
does NOT imply that said cpu should become tickless.  I routinely run a
load model that wants all the isolation it can get.  It's not single
task compute though, rt executive coordinating rt workers, and of course
wants every cycle it can get, so nohz_full is less than helpful.

-Mike



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Steven Rostedt
On Mon, 11 May 2015 14:09:59 -0400
Chris Metcalf  wrote:

> Steven writes:
> > All kidding aside, I think this is the real answer. We don't need a new
> > NO_HZ, we need to make NO_HZ_FULL work. Right now it doesn't do exactly
> > what it was created to do. That should be fixed.
> 
> The claim I'm making is that it's worthwhile to differentiate the two
> semantics.  Plain NO_HZ_FULL just says "kernel makes a best effort to
> avoid periodic interrupts without incurring any serious overhead".  My
> patch series allows an app to request "kernel makes an absolute
> commitment to avoid all interrupts regardless of cost when leaving
> kernel space".  These are different enough ideas, and serve different
> enough application needs, that I think they should be kept distinct.
> 
> Frederic actually summed this up very nicely in his recent email when
> he wrote "some people may expect hard isolation requirement (Real
> Time, deterministic latency) and others softer isolation (HPC, only
> interested in performance, can live with one rare random tick, so no
> need to loop before returning to userspace until we have the no-noise
> guarantee)."
> 
> So we need a way for apps to ask for the "harder" mode and let
> the softer mode be the default.

Fair enough. But I would hope that this would improve on NO_HZ_FULL as
well.

> 
> What about naming?  We may or may not want to have a Kconfig flag
> for this, and we may or may not have a separate mode for it, but
> we still will need some kind of name to talk about it with.  (In
> particular there's the prctl name, if we take that approach, and
> potential boot command-line flags to consider naming for.)
> 
> I'll quickly cover the suggestions that have been raised:
> 
> - DATAPLANE.  My suggestion, seemingly broadly disliked by folks
>   who felt it wasn't apparent what it meant.  Probably a fair point.
> 
> - NO_INTERRUPTS (Andrew).  Captures some of the sense, but was
>   criticized pretty fairly by Ingo as being too negative, confusing
>   with perf nomenclature, and too long :-)

What about NO_INTERRUPTIONS?

> 
> - PURE (Ingo).  Proposed as an alternative to NO_HZ_FULL, but we could
>   use it as a name for this new mode.  However, I think it's not clear
>   enough how FULL and PURE can/should relate to each other from the
>   names alone.

I would find the two confusing as well.

> 
> - BARE_METAL (me).  Ingo observes it's confusing with respect to
>   virtualization.

This is also confusing.

> 
> - TASK_SOLO (Gilad).  Not sure this conveys enough of the semantics.

Agreed.

> 
> - OS_LIGHT/OS_ZERO and NO_HZ_LEAVE_ME_THE_FSCK_ALONE.  Excellent
>   ideas :-)

At least the LEAVE_ME_ALONE conveys the semantics ;-)

> 
> - ISOLATION (Frederic).  I like this but it conflicts with other uses
>   of "isolation" in the kernel: cgroup isolation, lru page isolation,
>   iommu isolation, scheduler isolation (at least it's a superset of
>   that one), etc.  Also, we're not exactly isolating a task - often
>   a "dataplane" app consists of a bunch of interacting threads in
>   userspace, so not exactly isolated.  So perhaps it's too confusing.
> 
> - OVERFLOWING (Steven) - not sure I understood this one, honestly.

Actually, that was suggested by Paul McKenney.

> 
> I suggested earlier a few other candidates that I don't love, but no
> one commented on: NO_HZ_STRICT, USERSPACE_ONLY, and ZERO_OVERHEAD.
> 
> One thing I'm leaning towards is to remove the intermediate state of
> DATAPLANE_ENABLE and say that there is really only one primary state,
> DATAPLANE_QUIESCE (or whatever we call it).  The "dataplane but no
> quiesce" state probably isn't that useful, since it doesn't offer the
> hard guarantee that is the entire point of this patch series.  So that
> opens the idea of using the name NO_HZ_QUIESCE or just QUIESCE as the
> word that describes the mode; of course this sort of conflicts with
> RCU quiesce (though it is a superset of that so maybe that's OK).
> 
> One new idea I had is to use NO_HZ_HARD to reflect what Frederic was
> suggesting about "soft" and "hard" requirements for NO_HZ.  So
> enabling NO_HZ_HARD would enable my suggested QUIESCE mode.
> 
> One way to focus this discussion is on the user API naming.  I had
> prctl(PR_SET_DATAPLANE), which was attractive in being a "positive"
> noun.  A lot of the other suggestions fail this test in various ways.
> Reasonable candidates seem to be:
> 
>   PR_SET_OS_ZERO
>   PR_SET_TASK_SOLO
>   PR_SET_ISOLATION
> 
> Another possibility:
> 
>   PR_SET_NONSTOP
> 
> Or take Andrew's NO_INTERRUPTS and have:
> 
>   PR_SET_UNINTERRUPTED

For another possible answer, what about 

SET_TRANQUILITY

A state with no disturbances. 

-- Steve

> 
> I slightly favor ISOLATION at this point despite the overlap with
> other kernel concepts.
> 
> Let the bike-shedding continue! :-)
> 


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Chris Metcalf

A bunch of issues have been raised by various folks (thanks!)  and
I'll try to break them down and respond to them in a few different
emails.  This email is just about the issue of naming and whether the
proposed patch series should even have its own "name" or just be part
of NO_HZ_FULL.

First, Ingo and Steven both suggested that this new "dataplane" mode
(or whatever we want to call it; see below) should just be rolled into
the existing NO_HZ_FULL and that we should focus on making that work
better.

Steven writes:

> All kidding aside, I think this is the real answer. We don't need a new
> NO_HZ, we need to make NO_HZ_FULL work. Right now it doesn't do exactly
> what it was created to do. That should be fixed.


The claim I'm making is that it's worthwhile to differentiate the two
semantics.  Plain NO_HZ_FULL just says "kernel makes a best effort to
avoid periodic interrupts without incurring any serious overhead".  My
patch series allows an app to request "kernel makes an absolute
commitment to avoid all interrupts regardless of cost when leaving
kernel space".  These are different enough ideas, and serve different
enough application needs, that I think they should be kept distinct.

Frederic actually summed this up very nicely in his recent email when
he wrote "some people may expect hard isolation requirement (Real
Time, deterministic latency) and others softer isolation (HPC, only
interested in performance, can live with one rare random tick, so no
need to loop before returning to userspace until we have the no-noise
guarantee)."

So we need a way for apps to ask for the "harder" mode and let
the softer mode be the default.

What about naming?  We may or may not want to have a Kconfig flag
for this, and we may or may not have a separate mode for it, but
we still will need some kind of name to talk about it with.  (In
particular there's the prctl name, if we take that approach, and
potential boot command-line flags to consider naming for.)

I'll quickly cover the suggestions that have been raised:

- DATAPLANE.  My suggestion, seemingly broadly disliked by folks
  who felt it wasn't apparent what it meant.  Probably a fair point.

- NO_INTERRUPTS (Andrew).  Captures some of the sense, but was
  criticized pretty fairly by Ingo as being too negative, confusing
  with perf nomenclature, and too long :-)

- PURE (Ingo).  Proposed as an alternative to NO_HZ_FULL, but we could
  use it as a name for this new mode.  However, I think it's not clear
  enough how FULL and PURE can/should relate to each other from the
  names alone.

- BARE_METAL (me).  Ingo observes it's confusing with respect to
  virtualization.

- TASK_SOLO (Gilad).  Not sure this conveys enough of the semantics.

- OS_LIGHT/OS_ZERO and NO_HZ_LEAVE_ME_THE_FSCK_ALONE.  Excellent
  ideas :-)

- ISOLATION (Frederic).  I like this but it conflicts with other uses
  of "isolation" in the kernel: cgroup isolation, lru page isolation,
  iommu isolation, scheduler isolation (at least it's a superset of
  that one), etc.  Also, we're not exactly isolating a task - often
  a "dataplane" app consists of a bunch of interacting threads in
  userspace, so not exactly isolated.  So perhaps it's too confusing.

- OVERFLOWING (Steven) - not sure I understood this one, honestly.

I suggested earlier a few other candidates that I don't love, but no
one commented on: NO_HZ_STRICT, USERSPACE_ONLY, and ZERO_OVERHEAD.

One thing I'm leaning towards is to remove the intermediate state of
DATAPLANE_ENABLE and say that there is really only one primary state,
DATAPLANE_QUIESCE (or whatever we call it).  The "dataplane but no
quiesce" state probably isn't that useful, since it doesn't offer the
hard guarantee that is the entire point of this patch series.  So that
opens the idea of using the name NO_HZ_QUIESCE or just QUIESCE as the
word that describes the mode; of course this sort of conflicts with
RCU quiesce (though it is a superset of that so maybe that's OK).

One new idea I had is to use NO_HZ_HARD to reflect what Frederic was
suggesting about "soft" and "hard" requirements for NO_HZ.  So
enabling NO_HZ_HARD would enable my suggested QUIESCE mode.

One way to focus this discussion is on the user API naming.  I had
prctl(PR_SET_DATAPLANE), which was attractive in being a "positive"
noun.  A lot of the other suggestions fail this test in various ways.
Reasonable candidates seem to be:

  PR_SET_OS_ZERO
  PR_SET_TASK_SOLO
  PR_SET_ISOLATION

Another possibility:

  PR_SET_NONSTOP

Or take Andrew's NO_INTERRUPTS and have:

  PR_SET_UNINTERRUPTED

I slightly favor ISOLATION at this point despite the overlap with
other kernel concepts.

Let the bike-shedding continue! :-)

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Steven Rostedt
On Mon, 11 May 2015 19:33:06 +0200
Frederic Weisbecker  wrote:

> On Mon, May 11, 2015 at 10:27:44AM -0700, Andrew Morton wrote:
> > On Mon, 11 May 2015 10:19:16 -0700 "Paul E. McKenney" wrote:
> > 
> > > On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> > > > 
> > > > NO_HZ_LEAVE_ME_THE_FSCK_ALONE!
> > > 
> > > NO_HZ_OVERFLOWING?
> > 
> > Actually, "NO_HZ" shouldn't appear in the name at all.  The objective
> > is to permit userspace to execute without interruption.  NO_HZ is a
> > part of that, as is NO_INTERRUPTS.  The "NO_HZ" thing is a historical
> > artifact from an early partial implementation.
> 
> Agreed! Which is why I'd rather advocate in favour of CONFIG_ISOLATION.

Then we should have CONFIG_LEAVE_ME_THE_FSCK_ALONE. Hmm, I guess that's
just a synonym for CONFIG_ISOLATION.

-- Steve


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Frederic Weisbecker
On Mon, May 11, 2015 at 10:27:44AM -0700, Andrew Morton wrote:
> On Mon, 11 May 2015 10:19:16 -0700 "Paul E. McKenney" wrote:
> 
> > On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> > > 
> > > NO_HZ_LEAVE_ME_THE_FSCK_ALONE!
> > 
> > NO_HZ_OVERFLOWING?
> 
> Actually, "NO_HZ" shouldn't appear in the name at all.  The objective
> is to permit userspace to execute without interruption.  NO_HZ is a
> part of that, as is NO_INTERRUPTS.  The "NO_HZ" thing is a historical
> artifact from an early partial implementation.

Agreed! Which is why I'd rather advocate in favour of CONFIG_ISOLATION.


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Andrew Morton
On Mon, 11 May 2015 10:19:16 -0700 "Paul E. McKenney" wrote:

> On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> > 
> > NO_HZ_LEAVE_ME_THE_FSCK_ALONE!
> 
> NO_HZ_OVERFLOWING?

Actually, "NO_HZ" shouldn't appear in the name at all.  The objective
is to permit userspace to execute without interruption.  NO_HZ is a
part of that, as is NO_INTERRUPTS.  The "NO_HZ" thing is a historical
artifact from an early partial implementation.


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Paul E. McKenney
On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> 
> NO_HZ_LEAVE_ME_THE_FSCK_ALONE!

NO_HZ_OVERFLOWING?

Kconfig naming controversy aside, I believe this patchset is addressing
a real need.  Might need additional adjustment, but something useful.

Thanx, Paul

> On Sat, 9 May 2015 09:05:38 +0200
> Ingo Molnar  wrote:
> 
> > So I think we should either rename NO_HZ_FULL to NO_HZ_PURE, or keep 
> > it at NO_HZ_FULL: because the intention of NO_HZ_FULL was always to be 
> > such a 'zero overhead' mode of operation, where if user-space runs, it 
> > won't get interrupted in any way.
> 
> 
> All kidding aside, I think this is the real answer. We don't need a new
> NO_HZ, we need to make NO_HZ_FULL work. Right now it doesn't do exactly
> what it was created to do. That should be fixed.
> 
> Please let's get NO_HZ_FULL up to par. That should be the main focus.
> 
> -- Steve
> 



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Frederic Weisbecker
On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> 
> NO_HZ_LEAVE_ME_THE_FSCK_ALONE!
> 
> 
> On Sat, 9 May 2015 09:05:38 +0200
> Ingo Molnar  wrote:
>  
> > So I think we should either rename NO_HZ_FULL to NO_HZ_PURE, or keep 
> > it at NO_HZ_FULL: because the intention of NO_HZ_FULL was always to be 
> > such a 'zero overhead' mode of operation, where if user-space runs, it 
> > won't get interrupted in any way.
> 
> 
> All kidding aside, I think this is the real answer. We don't need a new
> NO_HZ, we need to make NO_HZ_FULL work. Right now it doesn't do exactly
> what it was created to do. That should be fixed.
> 
> Please lets get NO_HZ_FULL up to par. That should be the main focus.

Now if we can achieve to make NO_HZ_FULL behave in a specific way
that fits everyone's usecase, I'll be happy.

But some people may expect hard isolation requirement (Real Time, deterministic
latency) and others softer isolation (HPC, only interested in performance, can
live with one rare random tick, so no need to loop before returning to userspace
until we have the no-noise guarantee).

I expect some Real Time users may want this kind of dataplane mode where a
syscall or whatever sleeps until the system is ready to provide the guarantee
that no disturbance is going to happen for a given time. I'm not sure HPC users
are interested in that.

In fact it goes along the fact that NO_HZ_FULL was really only supposed to be
about the tick and now people are introducing more and more kernel default
presetting that assume NO_HZ_FULL implies ISOLATION which is about all kind of
noise (tick, tasks, irqs, ...). Which is true but what kind of ISOLATION?

Probably NO_HZ_FULL should really only be about stopping the tick then some sort
of CONFIG_ISOLATION would drive the kind of isolation we are interested in
and hereby the behaviour of NO_HZ_FULL, workqueues, timers, tasks affinity, irqs
affinity, dataplane mode, ...


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Steven Rostedt

NO_HZ_LEAVE_ME_THE_FSCK_ALONE!


On Sat, 9 May 2015 09:05:38 +0200
Ingo Molnar  wrote:
 
> So I think we should either rename NO_HZ_FULL to NO_HZ_PURE, or keep 
> it at NO_HZ_FULL: because the intention of NO_HZ_FULL was always to be 
> such a 'zero overhead' mode of operation, where if user-space runs, it 
> won't get interrupted in any way.


All kidding aside, I think this is the real answer. We don't need a new
NO_HZ, we need to make NO_HZ_FULL work. Right now it doesn't do exactly
what it was created to do. That should be fixed.

Please lets get NO_HZ_FULL up to par. That should be the main focus.

-- Steve


Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Steven Rostedt
On Mon, 11 May 2015 14:09:59 -0400
Chris Metcalf <cmetc...@ezchip.com> wrote:

> Steven writes:
> > All kidding aside, I think this is the real answer. We don't need a new
> > NO_HZ, we need to make NO_HZ_FULL work. Right now it doesn't do exactly
> > what it was created to do. That should be fixed.
>
> The claim I'm making is that it's worthwhile to differentiate the two
> semantics.  Plain NO_HZ_FULL just says "kernel makes a best effort to
> avoid periodic interrupts without incurring any serious overhead".  My
> patch series allows an app to request "kernel makes an absolute
> commitment to avoid all interrupts regardless of cost when leaving
> kernel space".  These are different enough ideas, and serve different
> enough application needs, that I think they should be kept distinct.
>
> Frederic actually summed this up very nicely in his recent email when
> he wrote "some people may expect hard isolation requirement (Real
> Time, deterministic latency) and others softer isolation (HPC, only
> interested in performance, can live with one rare random tick, so no
> need to loop before returning to userspace until we have the no-noise
> guarantee)."
>
> So we need a way for apps to ask for the "harder" mode and let
> the softer mode be the default.

Fair enough. But I would hope that this would improve on NO_HZ_FULL as
well.

 
> What about naming?  We may or may not want to have a Kconfig flag
> for this, and we may or may not have a separate mode for it, but
> we still will need some kind of name to talk about it with.  (In
> particular there's the prctl name, if we take that approach, and
> potential boot command-line flags to consider naming for.)
>
> I'll quickly cover the suggestions that have been raised:
>
> - DATAPLANE.  My suggestion, seemingly broadly disliked by folks
>   who felt it wasn't apparent what it meant.  Probably a fair point.
>
> - NO_INTERRUPTS (Andrew).  Captures some of the sense, but was
>   criticized pretty fairly by Ingo as being too negative, confusing
>   with perf nomenclature, and too long :-)

What about NO_INTERRUPTIONS?

>
> - PURE (Ingo).  Proposed as an alternative to NO_HZ_FULL, but we could
>   use it as a name for this new mode.  However, I think it's not clear
>   enough how FULL and PURE can/should relate to each other from the
>   names alone.

I would find the two confusing as well.

>
> - BARE_METAL (me).  Ingo observes it's confusing with respect to
>   virtualization.

This is also confusing.

>
> - TASK_SOLO (Gilad).  Not sure this conveys enough of the semantics.

Agreed.

>
> - OS_LIGHT/OS_ZERO and NO_HZ_LEAVE_ME_THE_FSCK_ALONE.  Excellent
>   ideas :-)

At least the LEAVE_ME_ALONE conveys the semantics ;-)

>
> - ISOLATION (Frederic).  I like this but it conflicts with other uses
>   of "isolation" in the kernel: cgroup isolation, lru page isolation,
>   iommu isolation, scheduler isolation (at least it's a superset of
>   that one), etc.  Also, we're not exactly isolating a task - often
>   a "dataplane" app consists of a bunch of interacting threads in
>   userspace, so not exactly isolated.  So perhaps it's too confusing.
>
> - OVERFLOWING (Steven) - not sure I understood this one, honestly.

Actually, that was suggested by Paul McKenney.

 
> I suggested earlier a few other candidates that I don't love, but no
> one commented on: NO_HZ_STRICT, USERSPACE_ONLY, and ZERO_OVERHEAD.
>
> One thing I'm leaning towards is to remove the intermediate state of
> DATAPLANE_ENABLE and say that there is really only one primary state,
> DATAPLANE_QUIESCE (or whatever we call it).  The "dataplane but no
> quiesce" state probably isn't that useful, since it doesn't offer the
> hard guarantee that is the entire point of this patch series.  So that
> opens the idea of using the name NO_HZ_QUIESCE or just QUIESCE as the
> word that describes the mode; of course this sort of conflicts with
> RCU quiesce (though it is a superset of that so maybe that's OK).
>
> One new idea I had is to use NO_HZ_HARD to reflect what Frederic was
> suggesting about "soft" and "hard" requirements for NO_HZ.  So
> enabling NO_HZ_HARD would enable my suggested QUIESCE mode.
>
> One way to focus this discussion is on the user API naming.  I had
> prctl(PR_SET_DATAPLANE), which was attractive in being a positive
> noun.  A lot of the other suggestions fail this test in various ways.
> Reasonable candidates seem to be:
>
>   PR_SET_OS_ZERO
>   PR_SET_TASK_SOLO
>   PR_SET_ISOLATION
>
> Another possibility:
>
>   PR_SET_NONSTOP
>
> Or take Andrew's NO_INTERRUPTS and have:
>
>   PR_SET_UNINTERRUPTED

For another possible answer, what about 

SET_TRANQUILITY

A state with no disturbances. 

-- Steve

 
> I slightly favor ISOLATION at this point despite the overlap with
> other kernel concepts.
>
> Let the bike-shedding continue! :-)
 


Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Chris Metcalf

On 05/11/2015 03:19 PM, Mike Galbraith wrote:

> I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
> that old static isolcpus was _supposed_ to crawl off and die, I know
> beyond doubt that having isolated a cpu as well as you can definitely
> does NOT imply that said cpu should become tickless.


True, at a high level, I agree that it would be better to have a
top-level concept like Frederic's proposed ISOLATION that includes
isolcpus and nohz_cpu (and other stuff as needed).

That said, what you wrote above is wrong; even with the patch you
acked, setting isolcpus does not automatically turn on nohz_full for
a given cpu.  The patch made it true the other way around: when
you say nohz_full, you automatically get isolcpus on that cpu too.
That does, at least, make sense for the semantics of nohz_full.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com



Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Steven Rostedt
On Mon, 11 May 2015 19:33:06 +0200
Frederic Weisbecker <fweis...@gmail.com> wrote:

> On Mon, May 11, 2015 at 10:27:44AM -0700, Andrew Morton wrote:
> > On Mon, 11 May 2015 10:19:16 -0700 Paul E. McKenney <paul...@linux.vnet.ibm.com> wrote:
> >
> > > On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> > > >
> > > > NO_HZ_LEAVE_ME_THE_FSCK_ALONE!
> > >
> > > NO_HZ_OVERFLOWING?
> >
> > Actually, "NO_HZ" shouldn't appear in the name at all.  The objective
> > is to permit userspace to execute without interruption.  NO_HZ is a
> > part of that, as is NO_INTERRUPTS.  The "NO_HZ" thing is a historical
> > artifact from an early partial implementation.
>
> Agreed! Which is why I'd rather advocate in favour of CONFIG_ISOLATION.

Then we should have CONFIG_LEAVE_ME_THE_FSCK_ALONE. Hmm, I guess that's
just a synonym for CONFIG_ISOLATION.

-- Steve


Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Andy Lutomirski
On May 12, 2015 4:54 AM, Chris Metcalf <cmetc...@ezchip.com> wrote:
>
> (Oops, resending and forcing html off.)
>
> On 05/09/2015 03:19 AM, Andy Lutomirski wrote:
> >
> > Naming aside, I don't think this should be a per-task flag at all.  We
> > already have way too much overhead per syscall in nohz mode, and it
> > would be nice to get the per-syscall overhead as low as possible.  We
> > should strive, for all tasks, to keep syscall overhead down *and*
> > avoid as many interrupts as possible.
> >
> > That being said, I do see a legitimate use for a way to tell the
> > kernel "I'm going to run in userspace for a long time; stay away".
> > But shouldn't that be a single operation, not an ongoing flag?  IOW, I
> > think that we should have a new syscall quiesce() or something rather
> > than a prctl.
>
> Yes, if all you are concerned about is quiescing the tick, we could
> probably do it as a new syscall.
>
> I do note that you'd want to try to actually do the quiesce as late as
> possible - in particular, if you just did it in the usual syscall, you
> might miss out on a timer that is set by softirq, or even something
> that happened when you called schedule() on the syscall exit path.
> Doing it as late as we are doing helps to ensure that that doesn't
> happen.  We could still arrange for this semantics by having a new
> quiesce() syscall set a temporary task bit that was cleared on
> return to userspace, but as you pointed out in a different email,
> that gets tricky if you end up doing multiple user_exit() calls on
> your way back to userspace.

We should fix that, then.  A quiesce() syscall can certainly arrange
to clean up on final exit.


> More to the point, I think it's actually important to know when an
> application believes it's in userspace-only mode as an actual state
> bit, rather than just during its transitional moment.

We can do that, too, with a new flag that's cleared on the next entry.

> If an
> application calls the kernel at an unexpected time (third-party code
> is the usual culprit for our customers, whether it's syscalls, page
> faults, or other things) we would prefer to have the "quiesce"
> semantics stay in force and cause the third-party code to be
> visibly very slow, rather than cause a totally unexpected and
> hard-to-diagnose interrupt show up later as we are still going
> around the loop that we thought was safely userspace-only.

I'm not really convinced that we should design this feature around
ease of debugging userspace screwups.  There are already plenty of
ways to do that part.  Userspace getting an interrupt because
userspace accidentally did a syscall is very different from userspace
getting interrupted due to an IPI.


> And, for debugging the kernel, it's crazy helpful to have that state
> bit in place: see patch 6/6 in the series for how we can diagnose
> things like "a different core just queued an IPI that will hit a
> dataplane core unexpectedly".  Having that state bit makes this sort
> of thing a trivial check in the kernel and relatively easy to debug.

As above, this can be done with a one-time operation, too.


> Finally, I proposed a "strict" mode in patch 5/6 where we kill the
> process if it voluntarily enters the kernel by mistake after saying it
> wasn't going to any more.  To do this requires a state bit, so
> carrying another state bit for "quiesce on user entry" seems pretty
> reasonable.

I still dislike that in the form you chose.  It's too deadly to be
useful for anyone but the hardest RT users.

I think I'd be okay with variants, though: let a suitably privileged
process ask for a signal on inadvertent kernel entry or rig up an fd
to be notified when one of these bad entries happens.  Queueing
something to a pollable fd would work, too.

See that thread for more comments.

--Andy


Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Mike Galbraith
On Mon, 2015-05-11 at 17:36 +0200, Frederic Weisbecker wrote:

> I expect some Real Time users may want this kind of dataplane mode where a
> syscall or whatever sleeps until the system is ready to provide the guarantee
> that no disturbance is going to happen for a given time. I'm not sure HPC
> users are interested in that.

I bet they are.  RT is just a different way to spell HPC, and reverse.

> In fact it goes along the fact that NO_HZ_FULL was really only supposed to be
> about the tick and now people are introducing more and more kernel default
> presetting that assume NO_HZ_FULL implies ISOLATION which is about all kind
> of noise (tick, tasks, irqs, ...). Which is true but what kind of ISOLATION?

True, nohz mode and various isolation measures are distinct properties.
NO_HZ_FULL is kinda pointless without isolation measures to go with it,
but you're right.

I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
that old static isolcpus was _supposed_ to crawl off and die, I know
beyond doubt that having isolated a cpu as well as you can definitely
does NOT imply that said cpu should become tickless.  I routinely run a
load model that wants all the isolation it can get.  It's not single
task compute though, rt executive coordinating rt workers, and of course
wants every cycle it can get, so nohz_full is less than helpful.

-Mike



Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Chris Metcalf

(Oops, resending and forcing html off.)

On 05/09/2015 03:19 AM, Andy Lutomirski wrote:

> Naming aside, I don't think this should be a per-task flag at all.  We
> already have way too much overhead per syscall in nohz mode, and it
> would be nice to get the per-syscall overhead as low as possible.  We
> should strive, for all tasks, to keep syscall overhead down *and*
> avoid as many interrupts as possible.
>
> That being said, I do see a legitimate use for a way to tell the
> kernel "I'm going to run in userspace for a long time; stay away".
> But shouldn't that be a single operation, not an ongoing flag?  IOW, I
> think that we should have a new syscall quiesce() or something rather
> than a prctl.


Yes, if all you are concerned about is quiescing the tick, we could
probably do it as a new syscall.

I do note that you'd want to try to actually do the quiesce as late as
possible - in particular, if you just did it in the usual syscall, you
might miss out on a timer that is set by softirq, or even something
that happened when you called schedule() on the syscall exit path.
Doing it as late as we are doing helps to ensure that that doesn't
happen.  We could still arrange for this semantics by having a new
quiesce() syscall set a temporary task bit that was cleared on
return to userspace, but as you pointed out in a different email,
that gets tricky if you end up doing multiple user_exit() calls on
your way back to userspace.

More to the point, I think it's actually important to know when an
application believes it's in userspace-only mode as an actual state
bit, rather than just during its transitional moment.  If an
application calls the kernel at an unexpected time (third-party code
is the usual culprit for our customers, whether it's syscalls, page
faults, or other things) we would prefer to have the quiesce
semantics stay in force and cause the third-party code to be
visibly very slow, rather than cause a totally unexpected and
hard-to-diagnose interrupt show up later as we are still going
around the loop that we thought was safely userspace-only.

And, for debugging the kernel, it's crazy helpful to have that state
bit in place: see patch 6/6 in the series for how we can diagnose
things like "a different core just queued an IPI that will hit a
dataplane core unexpectedly".  Having that state bit makes this sort
of thing a trivial check in the kernel and relatively easy to debug.

Finally, I proposed a strict mode in patch 5/6 where we kill the
process if it voluntarily enters the kernel by mistake after saying it
wasn't going to any more.  To do this requires a state bit, so
carrying another state bit for quiesce on user entry seems pretty
reasonable.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
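
The argument above for quiescing "as late as possible" can be pictured with a toy userspace model. This is only a sketch under stated assumptions: `struct cpu_model`, `do_exit_work()` and `return_to_user()` are hypothetical names invented for illustration, not code from the patch series, and the model simply "handles" a timer where a real kernel would have to wait for it or migrate it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU's deferred work; this is not kernel code. */
struct cpu_model {
    int pending_softirqs;  /* softirqs still to run */
    int pending_timers;    /* timers that would interrupt userspace later */
    bool quiesce;          /* hypothetical per-task "hard mode" state bit */
};

/* Handle one unit of deferred work.  In this model a softirq arms a
 * timer, which is the late-arriving case the message warns about. */
static void do_exit_work(struct cpu_model *cpu)
{
    if (cpu->pending_softirqs > 0) {
        cpu->pending_softirqs--;
        cpu->pending_timers++;
    } else if (cpu->pending_timers > 0) {
        cpu->pending_timers--;
    }
}

/* Model of the return-to-userspace path.  Without the state bit the
 * exit work runs once (best effort).  With it, the loop repeats until
 * nothing is left that could interrupt userspace.
 * Returns the number of passes taken. */
static int return_to_user(struct cpu_model *cpu)
{
    int passes = 0;

    do {
        do_exit_work(cpu);
        passes++;
    } while (cpu->quiesce &&
             (cpu->pending_softirqs > 0 || cpu->pending_timers > 0));
    return passes;
}
```

In this model a task without the state bit returns after one pass with a timer still armed, while a task with the bit set pays for a second pass and returns with nothing pending, which is the cost/guarantee trade-off the thread is debating.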



Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Chris Metcalf

A bunch of issues have been raised by various folks (thanks!)  and
I'll try to break them down and respond to them in a few different
emails.  This email is just about the issue of naming and whether the
proposed patch series should even have its own name or just be part
of NO_HZ_FULL.

First, Ingo and Steven both suggested that this new dataplane mode
(or whatever we want to call it; see below) should just be rolled into
the existing NO_HZ_FULL and that we should focus on making that work
better.

Steven writes:

> All kidding aside, I think this is the real answer. We don't need a new
> NO_HZ, we need to make NO_HZ_FULL work. Right now it doesn't do exactly
> what it was created to do. That should be fixed.


The claim I'm making is that it's worthwhile to differentiate the two
semantics.  Plain NO_HZ_FULL just says "kernel makes a best effort to
avoid periodic interrupts without incurring any serious overhead".  My
patch series allows an app to request "kernel makes an absolute
commitment to avoid all interrupts regardless of cost when leaving
kernel space".  These are different enough ideas, and serve different
enough application needs, that I think they should be kept distinct.

Frederic actually summed this up very nicely in his recent email when
he wrote "some people may expect hard isolation requirement (Real
Time, deterministic latency) and others softer isolation (HPC, only
interested in performance, can live with one rare random tick, so no
need to loop before returning to userspace until we have the no-noise
guarantee)."

So we need a way for apps to ask for the harder mode and let
the softer mode be the default.

What about naming?  We may or may not want to have a Kconfig flag
for this, and we may or may not have a separate mode for it, but
we still will need some kind of name to talk about it with.  (In
particular there's the prctl name, if we take that approach, and
potential boot command-line flags to consider naming for.)

I'll quickly cover the suggestions that have been raised:

- DATAPLANE.  My suggestion, seemingly broadly disliked by folks
  who felt it wasn't apparent what it meant.  Probably a fair point.

- NO_INTERRUPTS (Andrew).  Captures some of the sense, but was
  criticized pretty fairly by Ingo as being too negative, confusing
  with perf nomenclature, and too long :-)

- PURE (Ingo).  Proposed as an alternative to NO_HZ_FULL, but we could
  use it as a name for this new mode.  However, I think it's not clear
  enough how FULL and PURE can/should relate to each other from the
  names alone.

- BARE_METAL (me).  Ingo observes it's confusing with respect to
  virtualization.

- TASK_SOLO (Gilad).  Not sure this conveys enough of the semantics.

- OS_LIGHT/OS_ZERO and NO_HZ_LEAVE_ME_THE_FSCK_ALONE.  Excellent
  ideas :-)

- ISOLATION (Frederic).  I like this but it conflicts with other uses
  of isolation in the kernel: cgroup isolation, lru page isolation,
  iommu isolation, scheduler isolation (at least it's a superset of
  that one), etc.  Also, we're not exactly isolating a task - often
  a dataplane app consists of a bunch of interacting threads in
  userspace, so not exactly isolated.  So perhaps it's too confusing.

- OVERFLOWING (Steven) - not sure I understood this one, honestly.

I suggested earlier a few other candidates that I don't love, but no
one commented on: NO_HZ_STRICT, USERSPACE_ONLY, and ZERO_OVERHEAD.

One thing I'm leaning towards is to remove the intermediate state of
DATAPLANE_ENABLE and say that there is really only one primary state,
DATAPLANE_QUIESCE (or whatever we call it).  The "dataplane but no
quiesce" state probably isn't that useful, since it doesn't offer the
hard guarantee that is the entire point of this patch series.  So that
opens the idea of using the name NO_HZ_QUIESCE or just QUIESCE as the
word that describes the mode; of course this sort of conflicts with
RCU quiesce (though it is a superset of that so maybe that's OK).

One new idea I had is to use NO_HZ_HARD to reflect what Frederic was
suggesting about "soft" and "hard" requirements for NO_HZ.  So
enabling NO_HZ_HARD would enable my suggested QUIESCE mode.

One way to focus this discussion is on the user API naming.  I had
prctl(PR_SET_DATAPLANE), which was attractive in being a positive
noun.  A lot of the other suggestions fail this test in various ways.
Reasonable candidates seem to be:

  PR_SET_OS_ZERO
  PR_SET_TASK_SOLO
  PR_SET_ISOLATION

Another possibility:

  PR_SET_NONSTOP

Or take Andrew's NO_INTERRUPTS and have:

  PR_SET_UNINTERRUPTED

I slightly favor ISOLATION at this point despite the overlap with
other kernel concepts.

Let the bike-shedding continue! :-)

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
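
For readers trying to picture the user API being named here, a minimal sketch of what a prctl-based request might look like from userspace follows. Everything specific in it is an assumption: `PR_SET_DATAPLANE_SKETCH` and `PR_DATAPLANE_QUIESCE_SKETCH` are made-up placeholders (the real option number and flag values are defined by the patch series, not here), and on a kernel without such a feature the call is expected to fail, typically with EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical values: neither constant exists in mainline headers.
 * The option number is chosen only so that current kernels reject it. */
#define PR_SET_DATAPLANE_SKETCH      0x44504c4e  /* made-up option number */
#define PR_DATAPLANE_QUIESCE_SKETCH  0x1UL       /* made-up flag bit */

/* Ask the kernel for the hard "no interruptions" mode.
 * Returns 0 on success, or a negative errno value if the kernel does
 * not recognise the option (the expected result on an unpatched kernel). */
static int request_dataplane(unsigned long flags)
{
    if (prctl(PR_SET_DATAPLANE_SKETCH, flags, 0UL, 0UL, 0UL) == 0)
        return 0;
    return -errno;
}
```

An application would call `request_dataplane(PR_DATAPLANE_QUIESCE_SKETCH)` once before entering its polling loop and fall back to best-effort NO_HZ_FULL behaviour if the call fails. Whichever name wins the discussion, the calling pattern would presumably stay the same.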



Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Mike Galbraith
On Tue, 2015-05-12 at 03:47 +0200, Mike Galbraith wrote:
> On Mon, 2015-05-11 at 15:25 -0400, Chris Metcalf wrote:
> > On 05/11/2015 03:19 PM, Mike Galbraith wrote:
> > > I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
> > > that old static isolcpus was _supposed_ to crawl off and die, I know
> > > beyond doubt that having isolated a cpu as well as you can definitely
> > > does NOT imply that said cpu should become tickless.
> >
> > True, at a high level, I agree that it would be better to have a
> > top-level concept like Frederic's proposed ISOLATION that includes
> > isolcpus and nohz_cpu (and other stuff as needed).
> >
> > That said, what you wrote above is wrong; even with the patch you
> > acked, setting isolcpus does not automatically turn on nohz_full for
> > a given cpu.  The patch made it true the other way around: when
> > you say nohz_full, you automatically get isolcpus on that cpu too.
> > That does, at least, make sense for the semantics of nohz_full.
>
> I didn't write that, I wrote nohz_full implies (spelled '->') isolcpus.
> Yes, with nohz_full currently being static, the old allegedly dying but
> also static isolcpus scheduler off switch is a convenient thing to wire
> the nohz_full CPU SET (<- hint;) property to.

BTW, another facet of this: Rik wants to make isolcpus immune to
cpusets, which makes some sense, user did say isolcpus=, but that also
makes isolcpus truly static.  If the user now says nohz_full=, they lose
the ability to deactivate CPU isolation, making the set fairly useless
for anything other than HPC.  Currently, the user can flip the isolation
switch as he sees fit.  He takes a size extra large performance hit for
having said nohz_full=, but he doesn't lose generic utility.

-Mike
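
The relationship under discussion (every CPU named in nohz_full= also ends up in the isolated set, but not the reverse) can be sketched as a small set operation on CPU lists. This is an illustrative userspace model only: `parse_cpulist()` and `effective_isolated()` are hypothetical helpers, limited to CPUs 0..63, and do not reproduce the kernel's own list parsing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a kernel-style CPU list such as "1-3,7" into a bitmask.
 * Only CPUs 0..63 are supported in this sketch.
 * Returns 0 on success, -1 on malformed input. */
static int parse_cpulist(const char *s, uint64_t *mask)
{
    *mask = 0;
    while (*s) {
        char *end;
        unsigned long lo = strtoul(s, &end, 10);
        unsigned long hi = lo;

        if (end == s)
            return -1;          /* no digits where a CPU number was expected */
        if (*end == '-') {      /* a range "lo-hi" */
            s = end + 1;
            hi = strtoul(s, &end, 10);
            if (end == s)
                return -1;
        }
        if (lo > hi || hi > 63)
            return -1;
        for (unsigned long cpu = lo; cpu <= hi; cpu++)
            *mask |= UINT64_C(1) << cpu;
        if (*end == ',')
            end++;
        else if (*end != '\0')
            return -1;
        s = end;
    }
    return 0;
}

/* Model of "nohz_full implies isolcpus": the isolated set becomes the
 * union of both lists, while the nohz_full set itself is unchanged. */
static uint64_t effective_isolated(uint64_t isolcpus, uint64_t nohz_full)
{
    return isolcpus | nohz_full;
}
```

With isolcpus=2 and nohz_full=1-3,7 this model yields an isolated set of CPUs 1, 2, 3 and 7. If the isolated set also becomes immutable, as described above, those CPUs cannot be handed back to the general pool at runtime.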



Re: [PATCH 0/6] support dataplane mode for nohz_full

2015-05-11 Thread Mike Galbraith
On Mon, 2015-05-11 at 15:25 -0400, Chris Metcalf wrote:
> On 05/11/2015 03:19 PM, Mike Galbraith wrote:
> > I really shouldn't have acked nohz_full -> isolcpus.  Beside the fact
> > that old static isolcpus was _supposed_ to crawl off and die, I know
> > beyond doubt that having isolated a cpu as well as you can definitely
> > does NOT imply that said cpu should become tickless.
>
> True, at a high level, I agree that it would be better to have a
> top-level concept like Frederic's proposed ISOLATION that includes
> isolcpus and nohz_cpu (and other stuff as needed).
>
> That said, what you wrote above is wrong; even with the patch you
> acked, setting isolcpus does not automatically turn on nohz_full for
> a given cpu.  The patch made it true the other way around: when
> you say nohz_full, you automatically get isolcpus on that cpu too.
> That does, at least, make sense for the semantics of nohz_full.

I didn't write that, I wrote nohz_full implies (spelled '->') isolcpus.
Yes, with nohz_full currently being static, the old allegedly dying but
also static isolcpus scheduler off switch is a convenient thing to wire
the nohz_full CPU SET (<- hint;) property to.

-Mike




Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Frederic Weisbecker
On Mon, May 11, 2015 at 10:27:44AM -0700, Andrew Morton wrote:
> On Mon, 11 May 2015 10:19:16 -0700 Paul E. McKenney
> <paul...@linux.vnet.ibm.com> wrote:
>
> > On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> > >
> > > NO_HZ_LEAVE_ME_THE_FSCK_ALONE!
> >
> > NO_HZ_OVERFLOWING?
>
> Actually, NO_HZ shouldn't appear in the name at all.  The objective
> is to permit userspace to execute without interruption.  NO_HZ is a
> part of that, as is NO_INTERRUPTS.  The NO_HZ thing is a historical
> artifact from an early partial implementation.

Agreed! Which is why I'd rather advocate in favour of CONFIG_ISOLATION.


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-11 Thread Andrew Morton
On Mon, 11 May 2015 10:19:16 -0700 Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:

> On Mon, May 11, 2015 at 08:57:59AM -0400, Steven Rostedt wrote:
> >
> > NO_HZ_LEAVE_ME_THE_FSCK_ALONE!
>
> NO_HZ_OVERFLOWING?

Actually, NO_HZ shouldn't appear in the name at all.  The objective
is to permit userspace to execute without interruption.  NO_HZ is a
part of that, as is NO_INTERRUPTS.  The NO_HZ thing is a historical
artifact from an early partial implementation.


RE: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-09 Thread Gilad Ben Yossef

> From: Mike Galbraith [mailto:umgwanakikb...@gmail.com]
> Sent: Saturday, May 09, 2015 10:20 AM
> To: Ingo Molnar
> Cc: Andrew Morton; Chris Metcalf; Steven Rostedt; Gilad Ben Yossef; Ingo
> Molnar; Peter Zijlstra; Rik van Riel; Tejun Heo; Frederic Weisbecker;
> Thomas Gleixner; Paul E. McKenney; Christoph Lameter; Srivatsa S. Bhat;
> linux-...@vger.kernel.org; linux-...@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCH 0/6] support "dataplane" mode for nohz_full
> 
> On Sat, 2015-05-09 at 09:05 +0200, Ingo Molnar wrote:
> > * Andrew Morton  wrote:
> >
> > > On Fri, 8 May 2015 19:11:10 -0400 Chris Metcalf  wrote:
> > >
> > > > On 5/8/2015 5:22 PM, Steven Rostedt wrote:
> > > > > On Fri, 8 May 2015 14:18:24 -0700
> > > > > Andrew Morton  wrote:
> > > > >
> > > > >> On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf  wrote:
> > > > >>
> > > > >>> A prctl() option (PR_SET_DATAPLANE) is added
> > > > >> Dumb question: what does the term "dataplane" mean in this context?  I
> > > > >> can't see the relationship between those words and what this patch
> > > > >> does.
> > > > > I was thinking the same thing. I haven't gotten around to searching
> > > > > DATAPLANE yet.
> > > > >
> > > > > I would assume we want a name that is more meaningful for what is
> > > > > happening.
> > > >
> > > > The text in the commit message and the 0/6 cover letter do try to explain
> > > > the concept.  The terminology comes, I think, from networking line cards,
> > > > where the "dataplane" is the part of the application that handles all the
> > > > fast path processing of network packets, and the "control plane" is the part
> > > > that handles routing updates, etc., generally slow-path stuff.  I've probably
> > > > just been using the terms so long they seem normal to me.
> > > >
> > > > That said, what would be clearer?  NO_HZ_STRICT as a superset of
> > > > NO_HZ_FULL?  Or move away from the NO_HZ terminology a bit; after all,
> > > > we're talking about no interrupts of any kind, and maybe NO_HZ is too
> > > > limited in scope?  So, NO_INTERRUPTS?  USERSPACE_ONLY?  Or look
> > > > to vendors who ship bare-metal runtimes and call it BARE_METAL?
> > > > Borrow the Tilera marketing name and call it ZERO_OVERHEAD?
> > > >
> > > > Maybe BARE_METAL seems most plausible -- after DATAPLANE, to me,
> > > > of course :-)
> >
> > 'baremetal' has uses in virtualization speak, so I think that would be
> > confusing.
> >
> > > I like NO_INTERRUPTS.  Simple, direct.
> >
> > NO_HZ_PURE?
>
> Hm, coke light, coke zero... OS_LIGHT and OS_ZERO?

LOL... you forgot OS_CLASSIC for backwards compatibility :-)

How about TASK_SOLO?
Yes, you are trying to achieve the least amount of interference, but the bigger
context is about monopolizing a single CPU for yourself.

Anyway, it is worth pointing out that while NO_HZ_FULL is very useful in
conjunction with this, turning the tick off is also useful if you have multiple
tasks runnable (e.g. if you know you only need to context switch in 100 ms, why
keep a periodic interrupt running?), even though we don't support it *right
now*. It might be a good idea not to entangle these concepts too much.

Gilad
Gilad Ben-Yossef
Chief Software Architect
EZchip Technologies Ltd.
37 Israel Pollak Ave, Kiryat Gat 82025 ,Israel
Tel: +972-4-959- ext. 576, Fax: +972-8-681-1483 
Mobile: +972-52-826-0388, US Mobile: +1-973-826-0388
Email: gil...@ezchip.com, Web: http://www.ezchip.com


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-09 Thread Mike Galbraith
On Sat, 2015-05-09 at 09:05 +0200, Ingo Molnar wrote:
> * Andrew Morton  wrote:
> 
> > On Fri, 8 May 2015 19:11:10 -0400 Chris Metcalf  wrote:
> > 
> > > On 5/8/2015 5:22 PM, Steven Rostedt wrote:
> > > > On Fri, 8 May 2015 14:18:24 -0700
> > > > Andrew Morton  wrote:
> > > >
> > > >> On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf  
> > > >> wrote:
> > > >>
> > > >>> A prctl() option (PR_SET_DATAPLANE) is added
> > > >> Dumb question: what does the term "dataplane" mean in this context?  I
> > > >> can't see the relationship between those words and what this patch
> > > >> does.
> > > > I was thinking the same thing. I haven't gotten around to searching
> > > > DATAPLANE yet.
> > > >
> > > > I would assume we want a name that is more meaningful for what is
> > > > happening.
> > > 
> > > The text in the commit message and the 0/6 cover letter do try to explain
> > > the concept.  The terminology comes, I think, from networking line cards,
> > > where the "dataplane" is the part of the application that handles all the
> > > fast path processing of network packets, and the "control plane" is the 
> > > part
> > > that handles routing updates, etc., generally slow-path stuff.  I've 
> > > probably
> > > just been using the terms so long they seem normal to me.
> > > 
> > > That said, what would be clearer?  NO_HZ_STRICT as a superset of
> > > NO_HZ_FULL?  Or move away from the NO_HZ terminology a bit; after all,
> > > we're talking about no interrupts of any kind, and maybe NO_HZ is too
> > > limited in scope?  So, NO_INTERRUPTS?  USERSPACE_ONLY?  Or look
> > > to vendors who ship bare-metal runtimes and call it BARE_METAL?
> > > Borrow the Tilera marketing name and call it ZERO_OVERHEAD?
> > > 
> > > Maybe BARE_METAL seems most plausible -- after DATAPLANE, to me,
> > > of course :-)
> 
> 'baremetal' has uses in virtualization speak, so I think that would be 
> confusing.
> 
> > I like NO_INTERRUPTS.  Simple, direct.
> 
> NO_HZ_PURE?

Hm, coke light, coke zero... OS_LIGHT and OS_ZERO?

-Mike



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-09 Thread Andy Lutomirski
On Sat, May 9, 2015 at 12:05 AM, Ingo Molnar  wrote:
>
> * Andrew Morton  wrote:
>
>> On Fri, 8 May 2015 19:11:10 -0400 Chris Metcalf  wrote:
>>
>> > On 5/8/2015 5:22 PM, Steven Rostedt wrote:
>> > > On Fri, 8 May 2015 14:18:24 -0700
>> > > Andrew Morton  wrote:
>> > >
>> > >> On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf  
>> > >> wrote:
>> > >>
>> > >>> A prctl() option (PR_SET_DATAPLANE) is added
>> > >> Dumb question: what does the term "dataplane" mean in this context?  I
>> > >> can't see the relationship between those words and what this patch
>> > >> does.
>> > > I was thinking the same thing. I haven't gotten around to searching
>> > > DATAPLANE yet.
>> > >
>> > > I would assume we want a name that is more meaningful for what is
>> > > happening.
>> >
>> > The text in the commit message and the 0/6 cover letter do try to explain
>> > the concept.  The terminology comes, I think, from networking line cards,
>> > where the "dataplane" is the part of the application that handles all the
>> > fast path processing of network packets, and the "control plane" is the 
>> > part
>> > that handles routing updates, etc., generally slow-path stuff.  I've 
>> > probably
>> > just been using the terms so long they seem normal to me.
>> >
>> > That said, what would be clearer?  NO_HZ_STRICT as a superset of
>> > NO_HZ_FULL?  Or move away from the NO_HZ terminology a bit; after all,
>> > we're talking about no interrupts of any kind, and maybe NO_HZ is too
>> > limited in scope?  So, NO_INTERRUPTS?  USERSPACE_ONLY?  Or look
>> > to vendors who ship bare-metal runtimes and call it BARE_METAL?
>> > Borrow the Tilera marketing name and call it ZERO_OVERHEAD?
>> >
>> > Maybe BARE_METAL seems most plausible -- after DATAPLANE, to me,
>> > of course :-)
>
> 'baremetal' has uses in virtualization speak, so I think that would be
> confusing.
>
>> I like NO_INTERRUPTS.  Simple, direct.
>
> NO_HZ_PURE?
>

Naming aside, I don't think this should be a per-task flag at all.  We
already have way too much overhead per syscall in nohz mode, and it
would be nice to get the per-syscall overhead as low as possible.  We
should strive, for all tasks, to keep syscall overhead down *and*
avoid as many interrupts as possible.

That being said, I do see a legitimate use for a way to tell the
kernel "I'm going to run in userspace for a long time; stay away".
But shouldn't that be a single operation, not an ongoing flag?  IOW, I
think that we should have a new syscall quiesce() or something rather
than a prctl.

--Andy


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-09 Thread Ingo Molnar

* Andrew Morton  wrote:

> On Fri, 8 May 2015 19:11:10 -0400 Chris Metcalf  wrote:
> 
> > On 5/8/2015 5:22 PM, Steven Rostedt wrote:
> > > On Fri, 8 May 2015 14:18:24 -0700
> > > Andrew Morton  wrote:
> > >
> > >> On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf  
> > >> wrote:
> > >>
> > >>> A prctl() option (PR_SET_DATAPLANE) is added
> > >> Dumb question: what does the term "dataplane" mean in this context?  I
> > >> can't see the relationship between those words and what this patch
> > >> does.
> > > I was thinking the same thing. I haven't gotten around to searching
> > > DATAPLANE yet.
> > >
> > > I would assume we want a name that is more meaningful for what is
> > > happening.
> > 
> > The text in the commit message and the 0/6 cover letter do try to explain
> > the concept.  The terminology comes, I think, from networking line cards,
> > where the "dataplane" is the part of the application that handles all the
> > fast path processing of network packets, and the "control plane" is the part
> > that handles routing updates, etc., generally slow-path stuff.  I've 
> > probably
> > just been using the terms so long they seem normal to me.
> > 
> > That said, what would be clearer?  NO_HZ_STRICT as a superset of
> > NO_HZ_FULL?  Or move away from the NO_HZ terminology a bit; after all,
> > we're talking about no interrupts of any kind, and maybe NO_HZ is too
> > limited in scope?  So, NO_INTERRUPTS?  USERSPACE_ONLY?  Or look
> > to vendors who ship bare-metal runtimes and call it BARE_METAL?
> > Borrow the Tilera marketing name and call it ZERO_OVERHEAD?
> > 
> > Maybe BARE_METAL seems most plausible -- after DATAPLANE, to me,
> > of course :-)

'baremetal' has uses in virtualization speak, so I think that would be 
confusing.

> I like NO_INTERRUPTS.  Simple, direct.

NO_HZ_PURE?

That's what it's really about: user-space wants to run exclusively, in 
pure user-mode, without any interrupts.

So I don't like 'NO_HZ_NO_INTERRUPTS' for a couple of reasons:

 - It is similar to a term we use in perf: PERF_PMU_CAP_NO_INTERRUPT.

 - Another reason is that 'NO_INTERRUPTS', in most existing uses in 
   the kernel generally relates to some sort of hardware weakness, 
   limitation, a negative property: that we try to limp along without 
   having a hardware interrupt and have to poll. In other driver code
   that uses variants of NO_INTERRUPT it appears to be similar. So I 
   think there's some confusion potential here.

 - Here the fact that we don't disturb user-space is an absolutely
   positive property, not a limitation, a kernel feature we work hard 
   to achieve. NO_HZ_PURE would convey that while NO_HZ_NO_INTERRUPTS 
   wouldn't.

 - NO_HZ_NO_INTERRUPTS has a double negation, and it's also too long,
   compared to NO_HZ_FULL or NO_HZ_PURE ;-) The term 'no HZ' already 
   expresses that we don't have periodic interruptions. We just 
   duplicate that information with NO_HZ_NO_INTERRUPTS, while 
   NO_HZ_FULL or NO_HZ_PURE qualifies it, makes it a stronger
   property - which is what we want I think.

So I think we should either rename NO_HZ_FULL to NO_HZ_PURE, or keep 
it at NO_HZ_FULL: because the intention of NO_HZ_FULL was always to be 
such a 'zero overhead' mode of operation, where if user-space runs, it 
won't get interrupted in any way.

There's no need to add yet another Kconfig variant - lets just enhance 
the current stuff and maybe rename it to NO_HZ_PURE to better express 
its intent.

Thanks,

Ingo


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-08 Thread Andrew Morton
On Fri, 8 May 2015 19:11:10 -0400 Chris Metcalf  wrote:

> On 5/8/2015 5:22 PM, Steven Rostedt wrote:
> > On Fri, 8 May 2015 14:18:24 -0700
> > Andrew Morton  wrote:
> >
> >> On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf  
> >> wrote:
> >>
> >>> A prctl() option (PR_SET_DATAPLANE) is added
> >> Dumb question: what does the term "dataplane" mean in this context?  I
> >> can't see the relationship between those words and what this patch
> >> does.
> > I was thinking the same thing. I haven't gotten around to searching
> > DATAPLANE yet.
> >
> > I would assume we want a name that is more meaningful for what is
> > happening.
> 
> The text in the commit message and the 0/6 cover letter do try to explain
> the concept.  The terminology comes, I think, from networking line cards,
> where the "dataplane" is the part of the application that handles all the
> fast path processing of network packets, and the "control plane" is the part
> that handles routing updates, etc., generally slow-path stuff.  I've probably
> just been using the terms so long they seem normal to me.
> 
> That said, what would be clearer?  NO_HZ_STRICT as a superset of
> NO_HZ_FULL?  Or move away from the NO_HZ terminology a bit; after all,
> we're talking about no interrupts of any kind, and maybe NO_HZ is too
> limited in scope?  So, NO_INTERRUPTS?  USERSPACE_ONLY?  Or look
> to vendors who ship bare-metal runtimes and call it BARE_METAL?
> Borrow the Tilera marketing name and call it ZERO_OVERHEAD?
> 
> Maybe BARE_METAL seems most plausible -- after DATAPLANE, to me,
> of course :-)

I like NO_INTERRUPTS.  Simple, direct.


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-08 Thread Chris Metcalf

On 5/8/2015 5:22 PM, Steven Rostedt wrote:
> On Fri, 8 May 2015 14:18:24 -0700
> Andrew Morton  wrote:
>
>> On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf  wrote:
>>
>>> A prctl() option (PR_SET_DATAPLANE) is added
>> Dumb question: what does the term "dataplane" mean in this context?  I
>> can't see the relationship between those words and what this patch
>> does.
> I was thinking the same thing. I haven't gotten around to searching
> DATAPLANE yet.
>
> I would assume we want a name that is more meaningful for what is
> happening.


The text in the commit message and the 0/6 cover letter do try to explain
the concept.  The terminology comes, I think, from networking line cards,
where the "dataplane" is the part of the application that handles all the
fast path processing of network packets, and the "control plane" is the part
that handles routing updates, etc., generally slow-path stuff.  I've probably
just been using the terms so long they seem normal to me.

That said, what would be clearer?  NO_HZ_STRICT as a superset of
NO_HZ_FULL?  Or move away from the NO_HZ terminology a bit; after all,
we're talking about no interrupts of any kind, and maybe NO_HZ is too
limited in scope?  So, NO_INTERRUPTS?  USERSPACE_ONLY?  Or look
to vendors who ship bare-metal runtimes and call it BARE_METAL?
Borrow the Tilera marketing name and call it ZERO_OVERHEAD?

Maybe BARE_METAL seems most plausible -- after DATAPLANE, to me,
of course :-)

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-08 Thread Steven Rostedt
On Fri, 8 May 2015 14:18:24 -0700
Andrew Morton  wrote:

> On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf  wrote:
> 
> > A prctl() option (PR_SET_DATAPLANE) is added
> 
> Dumb question: what does the term "dataplane" mean in this context?  I
> can't see the relationship between those words and what this patch
> does.

I was thinking the same thing. I haven't gotten around to searching
DATAPLANE yet.

I would assume we want a name that is more meaningful for what is
happening.

-- Steve


Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-08 Thread Andrew Morton
On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf  wrote:

> A prctl() option (PR_SET_DATAPLANE) is added

Dumb question: what does the term "dataplane" mean in this context?  I
can't see the relationship between those words and what this patch
does.



Re: [PATCH 0/6] support "dataplane" mode for nohz_full

2015-05-08 Thread Andrew Morton
On Fri, 8 May 2015 19:11:10 -0400 Chris Metcalf <cmetc...@ezchip.com> wrote:

> On 5/8/2015 5:22 PM, Steven Rostedt wrote:
> > On Fri, 8 May 2015 14:18:24 -0700
> > Andrew Morton <a...@linux-foundation.org> wrote:
> >
> > > On Fri, 8 May 2015 13:58:41 -0400 Chris Metcalf <cmetc...@ezchip.com> wrote:
> > >
> > > > A prctl() option (PR_SET_DATAPLANE) is added
> > >
> > > Dumb question: what does the term "dataplane" mean in this context?  I
> > > can't see the relationship between those words and what this patch
> > > does.
> >
> > I was thinking the same thing. I haven't gotten around to searching
> > DATAPLANE yet.
> >
> > I would assume we want a name that is more meaningful for what is
> > happening.
>
> The text in the commit message and the 0/6 cover letter do try to explain
> the concept.  The terminology comes, I think, from networking line cards,
> where the "dataplane" is the part of the application that handles all the
> fast path processing of network packets, and the "control plane" is the part
> that handles routing updates, etc., generally slow-path stuff.  I've probably
> just been using the terms so long they seem normal to me.
>
> That said, what would be clearer?  NO_HZ_STRICT as a superset of
> NO_HZ_FULL?  Or move away from the NO_HZ terminology a bit; after all,
> we're talking about no interrupts of any kind, and maybe NO_HZ is too
> limited in scope?  So, NO_INTERRUPTS?  USERSPACE_ONLY?  Or look
> to vendors who ship bare-metal runtimes and call it BARE_METAL?
> Borrow the Tilera marketing name and call it ZERO_OVERHEAD?
>
> Maybe BARE_METAL seems most plausible -- after DATAPLANE, to me,
> of course :-)

I like NO_INTERRUPTS.  Simple, direct.

