Re: [dm-devel] [PATCH] dm: add secdel target

2018-10-18 Thread Christoph Hellwig
Just as a note: the name is a complete misnomer; a couple of overwrites
are not in any way secure deletion.  So naming it this way and exposing
this as erase is a problem that is going to come back to bite us.

If you really want this anyway, at least give it a different name, and
do a one-time warning when the first erase comes in that it is not in
any meaningful way secure.
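
The one-time warning suggested here could look roughly like the userspace
sketch below. In the kernel itself an existing helper such as pr_warn_once()
would be the more likely choice; the function name secdel_note_erase() and
the message text are hypothetical and not taken from the patch, and the
plain flag is not safe against concurrent callers.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag: has the "not secure" warning been printed yet? */
static bool erase_warned;

/*
 * Hypothetical hook called for every erase request.
 * Returns true only for the call that actually printed the warning.
 */
bool secdel_note_erase(void)
{
	if (erase_warned)
		return false;
	erase_warned = true;
	fprintf(stderr, "warning: overwriting data is not secure deletion\n");
	return true;
}
```

Only the first erase request prints the message; later requests return
without logging anything.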


Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Joel Fernandes
On Thu, Oct 18, 2018 at 10:52:23PM -0400, Steven Rostedt wrote:
> On Thu, 18 Oct 2018 19:25:29 -0700
> Joel Fernandes  wrote:
> 
> > On Thu, Oct 18, 2018 at 09:50:35PM -0400, Steven Rostedt wrote:
> > > On Thu, 18 Oct 2018 18:26:45 -0700
> > > Joel Fernandes  wrote:
> > >   
> > > > Yes, local_irq_restore is light weight, and does not check for reschedules.
> > > > 
> > > > I was thinking of case where ksoftirqd is woken up, but does not run unless
> > > > we set the NEED_RESCHED flag. But that should get set anyway since probably
> > > > ksoftirqd is of high enough priority than the currently running task..
> > > > 
> > > > Roughly speaking the scenario could be something like:
> > > > 
> > > > rcu_read_lock();
> > > >  <-- IPI comes in for the expedited GP, sets exp_hint
> > > > local_irq_disable();
> > > > // do a bunch of stuff
> > > > rcu_read_unlock();   <-- This calls the rcu_read_unlock_special which raises
> > > >  the soft irq, and wakesup softirqd.
> > > 
> > > If softirqd is of higher priority than the current running task, then
> > > the try_to_wake_up() will set NEED_RESCHED of the current task here.
> > >   
> > 
> > Yes, only *if*. On my system, ksoftirqd is CFS nice 0. I thought expedited
> > grace periods are quite important and they should complete quickly which is
> > the whole reason for interrupting rcu read sections with an IPI and stuff.
> > IMO there should be no harm in setting NEED_RESCHED unconditionally anyway
> > for possible benefit of systems where the ksoftirqd is not of higher priority
> > than the currently running task, and we need to run it soon on the CPU. But
> > I'm Ok with whatever Paul and you want to do here.
> 
> 
> Setting NEED_RESCHED unconditionally won't help. Because even if we call
> schedule() ksoftirqd will not be scheduled! If it's CFS nice 0, and the
> current task still has quota to run, if you call schedule, you'll just
> waste time calculating that the current task should still be running.
> It's equivalent to calling yield() (which is why we removed all yield()
> users in the kernel, because *all* of them were buggy!). This is *why*
> it only calls schedule *if* softirqd is of higher priority.

Yes, OK. You are right: the TTWU path should decide whether or not to set the
NEED_RESCHED flag, and unconditionally setting it does not get us anything. I
had to go through the code a bit since it has been a while since I explored it.

So Paul, I'm OK with your latest patch for the issue we discussed and don't
think much more can be done barring raising of ksoftirqd priorities :-) So I
guess synchronize_rcu_expedited will just cope with the delay between
local_irq_enable and the next scheduling point.. :-)

thanks,

- Joel



Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Steven Rostedt
On Thu, 18 Oct 2018 19:25:29 -0700
Joel Fernandes  wrote:

> On Thu, Oct 18, 2018 at 09:50:35PM -0400, Steven Rostedt wrote:
> > On Thu, 18 Oct 2018 18:26:45 -0700
> > Joel Fernandes  wrote:
> >   
> > > Yes, local_irq_restore is light weight, and does not check for reschedules.
> > > 
> > > I was thinking of case where ksoftirqd is woken up, but does not run unless
> > > we set the NEED_RESCHED flag. But that should get set anyway since probably
> > > ksoftirqd is of high enough priority than the currently running task..
> > > 
> > > Roughly speaking the scenario could be something like:
> > > 
> > > rcu_read_lock();
> > >  <-- IPI comes in for the expedited GP, sets exp_hint
> > > local_irq_disable();
> > > // do a bunch of stuff
> > > rcu_read_unlock();   <-- This calls the rcu_read_unlock_special which raises
> > >  the soft irq, and wakesup softirqd.
> > 
> > If softirqd is of higher priority than the current running task, then
> > the try_to_wake_up() will set NEED_RESCHED of the current task here.
> >   
> 
> Yes, only *if*. On my system, ksoftirqd is CFS nice 0. I thought expedited
> grace periods are quite important and they should complete quickly which is
> the whole reason for interrupting rcu read sections with an IPI and stuff.
> IMO there should be no harm in setting NEED_RESCHED unconditionally anyway
> for possible benefit of systems where the ksoftirqd is not of higher priority
> than the currently running task, and we need to run it soon on the CPU. But
> I'm Ok with whatever Paul and you want to do here.


Setting NEED_RESCHED unconditionally won't help. Because even if we call
schedule() ksoftirqd will not be scheduled! If it's CFS nice 0, and the
current task still has quota to run, if you call schedule, you'll just
waste time calculating that the current task should still be running.
It's equivalent to calling yield() (which is why we removed all yield()
users in the kernel, because *all* of them were buggy!). This is *why*
it only calls schedule *if* softirqd is of higher priority.

-- Steve
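
The argument above can be illustrated with a deliberately tiny userspace
model. This is only a sketch: the toy_ names, the single prio number and the
slice_left counter are assumptions made for illustration, and the real
scheduler's wakeup-preemption logic is far more involved. The point it shows
is that if the wakeup path would not have requested a reschedule, forcing one
just makes the scheduler pick the current task again.

```c
#include <stdbool.h>

/* Toy task: a smaller prio value means higher priority (model assumption). */
struct toy_task {
	int prio;
	int slice_left;	/* remaining time slice, in arbitrary ticks */
};

/* Wakeup path: request a reschedule only if the woken task should preempt. */
bool toy_wakeup_needs_resched(const struct toy_task *curr,
			      const struct toy_task *woken)
{
	return woken->prio < curr->prio;
}

/*
 * Scheduler pick: the current task keeps running unless it is outranked
 * or has used up its slice.
 */
const struct toy_task *toy_pick_next(const struct toy_task *curr,
				     const struct toy_task *woken)
{
	if (woken->prio < curr->prio || curr->slice_left <= 0)
		return woken;
	return curr;
}
```

With two tasks of equal priority and slice left on the current one,
toy_wakeup_needs_resched() is false and toy_pick_next() returns the current
task, so an unconditional reschedule would have done no useful work.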


Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Joel Fernandes
On Thu, Oct 18, 2018 at 09:50:35PM -0400, Steven Rostedt wrote:
> On Thu, 18 Oct 2018 18:26:45 -0700
> Joel Fernandes  wrote:
> 
> > Yes, local_irq_restore is light weight, and does not check for reschedules.
> > 
> > I was thinking of case where ksoftirqd is woken up, but does not run unless
> > we set the NEED_RESCHED flag. But that should get set anyway since probably
> > ksoftirqd is of high enough priority than the currently running task..
> > 
> > Roughly speaking the scenario could be something like:
> > 
> > rcu_read_lock();
> >  <-- IPI comes in for the expedited GP, sets exp_hint
> > local_irq_disable();
> > // do a bunch of stuff
> > rcu_read_unlock();   <-- This calls the rcu_read_unlock_special which raises
> >  the soft irq, and wakesup softirqd.
> 
> If softirqd is of higher priority than the current running task, then
> the try_to_wake_up() will set NEED_RESCHED of the current task here.
> 

Yes, only *if*. On my system, ksoftirqd is CFS nice 0. I thought expedited
grace periods are quite important and they should complete quickly which is
the whole reason for interrupting rcu read sections with an IPI and stuff.
IMO there should be no harm in setting NEED_RESCHED unconditionally anyway
for possible benefit of systems where the ksoftirqd is not of higher priority
than the currently running task, and we need to run it soon on the CPU. But
I'm Ok with whatever Paul and you want to do here.

- Joel



Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Steven Rostedt
On Thu, 18 Oct 2018 18:26:45 -0700
Joel Fernandes  wrote:

> Yes, local_irq_restore is light weight, and does not check for reschedules.
> 
> I was thinking of case where ksoftirqd is woken up, but does not run unless
> we set the NEED_RESCHED flag. But that should get set anyway since probably
> ksoftirqd is of high enough priority than the currently running task..
> 
> Roughly speaking the scenario could be something like:
> 
> rcu_read_lock();
>  <-- IPI comes in for the expedited GP, sets exp_hint
> local_irq_disable();
> // do a bunch of stuff
> rcu_read_unlock();   <-- This calls the rcu_read_unlock_special which raises
>  the soft irq, and wakesup softirqd.

If softirqd is of higher priority than the current running task, then
the try_to_wake_up() will set NEED_RESCHED of the current task here.

-- Steve

> local_irq_enable();
> 
> // Now ksoftirqd is ready to run but we don't switch into the
> // scheduler for sometime because tif_need_resched() returns false and
> // any cond_resched calls do nothing. So we potentially spend lots of
> // time before the next scheduling event.
> 
> You think this should not be an issue?



Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Joel Fernandes
On Thu, Oct 18, 2018 at 09:12:45PM -0400, Steven Rostedt wrote:
> On Thu, 18 Oct 2018 17:19:32 -0700
> "Paul E. McKenney"  wrote:
> 
> > I figured that whoever calls preempt_enable_no_resched() is taking the
> > responsibility for permitting preemption in the near future, and if they
> > fail to do so, they will get called on it.  Hard to hide from the latency
> > tracer, after all.  ;-)
> 
> Correct, and doing a search of preempt_enable_no_resched() I see
> there's one in the ftrace ring buffer code, that was added a long time
> ago (2008) to fix a recursion bug that no longer exists, and this now
> can leak a preemption point.
> 
> I'll have to go fix that :-(

Cool! Glad you found this issue in the code while we are discussing it ;)

thanks,

- Joel



Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Joel Fernandes
On Thu, Oct 18, 2018 at 05:19:32PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 18, 2018 at 05:03:50PM -0700, Joel Fernandes wrote:
> > On Thu, Oct 18, 2018 at 07:46:37AM -0700, Paul E. McKenney wrote:
> > [..]
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > commit 07921e8720907f58f82b142f2027fc56d5abdbfd
> > > > > > > > > Author: Paul E. McKenney 
> > > > > > > > > Date:   Tue Oct 16 04:12:58 2018 -0700
> > > > > > > > > 
> > > > > > > > > rcu: Speed up expedited GPs when interrupting RCU reader
> > > > > > > > > 
> > > > > > > > > In PREEMPT kernels, an expedited grace period might send 
> > > > > > > > > an IPI to a
> > > > > > > > > CPU that is executing an RCU read-side critical section.  
> > > > > > > > > In that case,
> > > > > > > > > it would be nice if the rcu_read_unlock() directly 
> > > > > > > > > interacted with the
> > > > > > > > > RCU core code to immediately report the quiescent state.  
> > > > > > > > > And this does
> > > > > > > > > happen in the case where the reader has been preempted.  
> > > > > > > > > But it would
> > > > > > > > > also be a nice performance optimization if immediate 
> > > > > > > > > reporting also
> > > > > > > > > happened in the preemption-free case.
> > > > > > > > > 
> > > > > > > > > This commit therefore adds an ->exp_hint field to the 
> > > > > > > > > task_struct structure's
> > > > > > > > > ->rcu_read_unlock_special field.  The IPI handler sets 
> > > > > > > > > this hint when
> > > > > > > > > it has interrupted an RCU read-side critical section, and 
> > > > > > > > > this causes
> > > > > > > > > the outermost rcu_read_unlock() call to invoke 
> > > > > > > > > rcu_read_unlock_special(),
> > > > > > > > > which, if preemption is enabled, reports the quiescent 
> > > > > > > > > state immediately.
> > > > > > > > > If preemption is disabled, then the report is required to 
> > > > > > > > > be deferred
> > > > > > > > > until preemption (or bottom halves or interrupts or 
> > > > > > > > > whatever) is re-enabled.
> > > > > > > > > 
> > > > > > > > > Because this is a hint, it does nothing for more 
> > > > > > > > > complicated cases.  For
> > > > > > > > > example, if the IPI interrupts an RCU reader, but 
> > > > > > > > > interrupts are disabled
> > > > > > > > > across the rcu_read_unlock(), but another rcu_read_lock() 
> > > > > > > > > is executed
> > > > > > > > > before interrupts are re-enabled, the hint will already 
> > > > > > > > > have been cleared.
> > > > > > > > > If you do crazy things like this, reporting will be 
> > > > > > > > > deferred until some
> > > > > > > > > later RCU_SOFTIRQ handler, context switch, 
> > > > > > > > > cond_resched(), or similar.
> > > > > > > > > 
> > > > > > > > > Reported-by: Joel Fernandes 
> > > > > > > > > Signed-off-by: Paul E. McKenney 
> > > > > > > > > 
> > > > > > > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > > > > > > index 004ca21f7e80..64ce751b5fe9 100644
> > > > > > > > > --- a/include/linux/sched.h
> > > > > > > > > +++ b/include/linux/sched.h
> > > > > > > > > @@ -571,8 +571,10 @@ union rcu_special {
> > > > > > > > >   struct {
> > > > > > > > >   u8  blocked;
> > > > > > > > >   u8  need_qs;
> > > > > > > > > + u8  exp_hint; /* Hint for 
> > > > > > > > > performance. */
> > > > > > > > > + u8  pad; /* No garbage from 
> > > > > > > > > compiler! */
> > > > > > > > >   } b; /* Bits. */
> > > > > > > > > - u16 s; /* Set of bits. */
> > > > > > > > > + u32 s; /* Set of bits. */
> > > > > > > > >  };
> > > > > > > > >  
> > > > > > > > >  enum perf_event_task_context {
> > > > > > > > > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> > > > > > > > > index e669ccf3751b..928fe5893a57 100644
> > > > > > > > > --- a/kernel/rcu/tree_exp.h
> > > > > > > > > +++ b/kernel/rcu/tree_exp.h
> > > > > > > > > @@ -692,8 +692,10 @@ static void sync_rcu_exp_handler(void 
> > > > > > > > > *unused)
> > > > > > > > >*/
> > > > > > > > >   if (t->rcu_read_lock_nesting > 0) {
> > > > > > > > >   raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > > > > > > > - if (rnp->expmask & rdp->grpmask)
> > > > > > > > > + if (rnp->expmask & rdp->grpmask) {
> > > > > > > > >   rdp->deferred_qs = true;
> > > > > > > > > + 
> > > > > > > > > WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);
> > > > > > > > > + }
> > > > > > > > >   raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > > > > > > > >   }
> > > > > > > > >  
> > > > > > > > > diff --git a/kernel/rcu/tree_plugin.h 
> > > > > > > > > b/kernel/rcu/tree_plugin.h
> > > > 

Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Steven Rostedt
On Thu, 18 Oct 2018 17:19:32 -0700
"Paul E. McKenney"  wrote:

> I figured that whoever calls preempt_enable_no_resched() is taking the
> responsibility for permitting preemption in the near future, and if they
> fail to do so, they will get called on it.  Hard to hide from the latency
> tracer, after all.  ;-)

Correct, and doing a search of preempt_enable_no_resched() I see
there's one in the ftrace ring buffer code, that was added a long time
ago (2008) to fix a recursion bug that no longer exists, and this now
can leak a preemption point.

I'll have to go fix that :-(

-- Steve
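
How preempt_enable_no_resched() can leak a preemption point is easier to see
in a small userspace model. This is a sketch under stated assumptions: all
toy_ names are hypothetical, the counter and flag are plain globals, and
toy_schedule() merely counts calls instead of switching tasks.

```c
#include <stdbool.h>

static int toy_preempt_count;	/* nesting depth of preempt-disabled sections */
static bool toy_need_resched;	/* a reschedule has been requested */
static int toy_schedule_calls;	/* how often the "scheduler" ran */

void toy_schedule(void)
{
	toy_need_resched = false;
	toy_schedule_calls++;
}

void toy_set_need_resched(void) { toy_need_resched = true; }
void toy_preempt_disable(void) { toy_preempt_count++; }

/* Normal enable: acts on a pending reschedule request right away. */
void toy_preempt_enable(void)
{
	if (--toy_preempt_count == 0 && toy_need_resched)
		toy_schedule();
}

/* No-resched variant: drops the count but leaves the request pending. */
void toy_preempt_enable_no_resched(void)
{
	toy_preempt_count--;
}

int toy_schedule_count(void) { return toy_schedule_calls; }
bool toy_resched_pending(void) { return toy_need_resched; }
```

If a reschedule is requested inside the disabled section and the section ends
with toy_preempt_enable_no_resched(), the request stays pending until some
later scheduling point notices it, which is the latency the tracer would
flag.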


Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Paul E. McKenney
On Thu, Oct 18, 2018 at 05:03:50PM -0700, Joel Fernandes wrote:
> On Thu, Oct 18, 2018 at 07:46:37AM -0700, Paul E. McKenney wrote:
> [..]
> > > > > > > > 
> > > > > > > > 
> > > > > > > > commit 07921e8720907f58f82b142f2027fc56d5abdbfd
> > > > > > > > Author: Paul E. McKenney 
> > > > > > > > Date:   Tue Oct 16 04:12:58 2018 -0700
> > > > > > > > 
> > > > > > > > rcu: Speed up expedited GPs when interrupting RCU reader
> > > > > > > > 
> > > > > > > > In PREEMPT kernels, an expedited grace period might send an 
> > > > > > > > IPI to a
> > > > > > > > CPU that is executing an RCU read-side critical section.  
> > > > > > > > In that case,
> > > > > > > > it would be nice if the rcu_read_unlock() directly 
> > > > > > > > interacted with the
> > > > > > > > RCU core code to immediately report the quiescent state.  
> > > > > > > > And this does
> > > > > > > > happen in the case where the reader has been preempted.  
> > > > > > > > But it would
> > > > > > > > also be a nice performance optimization if immediate 
> > > > > > > > reporting also
> > > > > > > > happened in the preemption-free case.
> > > > > > > > 
> > > > > > > > This commit therefore adds an ->exp_hint field to the 
> > > > > > > > task_struct structure's
> > > > > > > > ->rcu_read_unlock_special field.  The IPI handler sets this 
> > > > > > > > hint when
> > > > > > > > it has interrupted an RCU read-side critical section, and 
> > > > > > > > this causes
> > > > > > > > the outermost rcu_read_unlock() call to invoke 
> > > > > > > > rcu_read_unlock_special(),
> > > > > > > > which, if preemption is enabled, reports the quiescent 
> > > > > > > > state immediately.
> > > > > > > > If preemption is disabled, then the report is required to 
> > > > > > > > be deferred
> > > > > > > > until preemption (or bottom halves or interrupts or 
> > > > > > > > whatever) is re-enabled.
> > > > > > > > 
> > > > > > > > Because this is a hint, it does nothing for more 
> > > > > > > > complicated cases.  For
> > > > > > > > example, if the IPI interrupts an RCU reader, but 
> > > > > > > > interrupts are disabled
> > > > > > > > across the rcu_read_unlock(), but another rcu_read_lock() 
> > > > > > > > is executed
> > > > > > > > before interrupts are re-enabled, the hint will already 
> > > > > > > > have been cleared.
> > > > > > > > If you do crazy things like this, reporting will be 
> > > > > > > > deferred until some
> > > > > > > > later RCU_SOFTIRQ handler, context switch, cond_resched(), 
> > > > > > > > or similar.
> > > > > > > > 
> > > > > > > > Reported-by: Joel Fernandes 
> > > > > > > > Signed-off-by: Paul E. McKenney 
> > > > > > > > 
> > > > > > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > > > > > index 004ca21f7e80..64ce751b5fe9 100644
> > > > > > > > --- a/include/linux/sched.h
> > > > > > > > +++ b/include/linux/sched.h
> > > > > > > > @@ -571,8 +571,10 @@ union rcu_special {
> > > > > > > > struct {
> > > > > > > > u8  blocked;
> > > > > > > > u8  need_qs;
> > > > > > > > +   u8  exp_hint; /* Hint for 
> > > > > > > > performance. */
> > > > > > > > +   u8  pad; /* No garbage from 
> > > > > > > > compiler! */
> > > > > > > > } b; /* Bits. */
> > > > > > > > -   u16 s; /* Set of bits. */
> > > > > > > > +   u32 s; /* Set of bits. */
> > > > > > > >  };
> > > > > > > >  
> > > > > > > >  enum perf_event_task_context {
> > > > > > > > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> > > > > > > > index e669ccf3751b..928fe5893a57 100644
> > > > > > > > --- a/kernel/rcu/tree_exp.h
> > > > > > > > +++ b/kernel/rcu/tree_exp.h
> > > > > > > > @@ -692,8 +692,10 @@ static void sync_rcu_exp_handler(void 
> > > > > > > > *unused)
> > > > > > > >  */
> > > > > > > > if (t->rcu_read_lock_nesting > 0) {
> > > > > > > > raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > > > > > > -   if (rnp->expmask & rdp->grpmask)
> > > > > > > > +   if (rnp->expmask & rdp->grpmask) {
> > > > > > > > rdp->deferred_qs = true;
> > > > > > > > +   
> > > > > > > > WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);
> > > > > > > > +   }
> > > > > > > > raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > > > > > > > }
> > > > > > > >  
> > > > > > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > > > > > > index 8b48bb7c224c..d6286eb6e77e 100644
> > > > > > > > --- a/kernel/rcu/tree_plugin.h
> > > > > > > > +++ b/kernel/rcu/tree_plugin.h
> > > > > > > > @@ -643,8 +643,9 @@ static void rcu_read_unlock_special(struct 
> > > > > > >

Re: [PATCH 1/3] printk: Introduce per-console loglevel setting

2018-10-18 Thread Sergey Senozhatsky
On (09/28/17 17:43), Calvin Owens wrote:
> Not all consoles are created equal: depending on the actual hardware,
> the latency of a printk() call can vary dramatically. The worst examples
> are serial consoles, where it can spin for tens of milliseconds banging
> the UART to emit a message, which can cause application-level problems
> when the kernel spews onto the console.
> 
> At Facebook we use netconsole to monitor our fleet, but we still have
> serial consoles attached on each host for live debugging, and the latter
> has caused problems. An obvious solution is to disable the kernel
> console output to ttyS0, but this makes live debugging frustrating,
> since crashes become silent and opaque to the ttyS0 user. Enabling it on
> the fly when needed isn't feasible, since boxes you need to debug via
> serial are likely to be borked in ways that make this impossible.
> 
> That puts us between a rock and a hard place: we'd love to set
> kernel.printk to KERN_INFO and get all the logs. But while netconsole is
> fast enough to permit that without perturbing userspace, ttyS0 is not,
> and we're forced to limit console logging to KERN_WARNING and higher.
> 
> This patch introduces a new per-console loglevel setting, and changes
> console_unlock() to use max(global_level, per_console_level) when
> deciding whether or not to emit a given log message.
> 
> This lets us have our cake and eat it too: instead of being forced to
> limit all consoles verbosity based on the speed of the slowest one, we
> can "promote" the faster console while still using a conservative system
> loglevel setting to avoid disturbing applications.

Hi Calvin,

Do you have time to address the review feedback and re-spin v2?

-ss
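
The max(global_level, per_console_level) rule described in the quoted commit
message can be sketched as follows. This is an illustration only: the
function names are hypothetical, and it assumes the usual convention that a
message is printed when its level is numerically below the console threshold
(lower numbers being more severe).

```c
#include <stdbool.h>

/* Return the larger of two loglevels. */
static int max_level(int a, int b)
{
	return a > b ? a : b;
}

/*
 * Decide whether one console should emit a message.
 * The effective threshold is the larger of the global and the
 * per-console setting, so a console can only be made more verbose.
 */
bool console_should_emit(int msg_level, int global_level,
			 int per_console_level)
{
	return msg_level < max_level(global_level, per_console_level);
}
```

With a global threshold of 5 (warnings and above), a slow console left at its
default stays quiet for info-level messages, while a fast console with a
per-console threshold of 7 also receives them.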


Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Joel Fernandes
On Thu, Oct 18, 2018 at 07:46:37AM -0700, Paul E. McKenney wrote:
[..]
> > > > > > > 
> > > > > > > 
> > > > > > > commit 07921e8720907f58f82b142f2027fc56d5abdbfd
> > > > > > > Author: Paul E. McKenney 
> > > > > > > Date:   Tue Oct 16 04:12:58 2018 -0700
> > > > > > > 
> > > > > > > rcu: Speed up expedited GPs when interrupting RCU reader
> > > > > > > 
> > > > > > > In PREEMPT kernels, an expedited grace period might send an 
> > > > > > > IPI to a
> > > > > > > CPU that is executing an RCU read-side critical section.  In 
> > > > > > > that case,
> > > > > > > it would be nice if the rcu_read_unlock() directly interacted 
> > > > > > > with the
> > > > > > > RCU core code to immediately report the quiescent state.  And 
> > > > > > > this does
> > > > > > > happen in the case where the reader has been preempted.  But 
> > > > > > > it would
> > > > > > > also be a nice performance optimization if immediate 
> > > > > > > reporting also
> > > > > > > happened in the preemption-free case.
> > > > > > > 
> > > > > > > This commit therefore adds an ->exp_hint field to the 
> > > > > > > task_struct structure's
> > > > > > > ->rcu_read_unlock_special field.  The IPI handler sets this 
> > > > > > > hint when
> > > > > > > it has interrupted an RCU read-side critical section, and 
> > > > > > > this causes
> > > > > > > the outermost rcu_read_unlock() call to invoke 
> > > > > > > rcu_read_unlock_special(),
> > > > > > > which, if preemption is enabled, reports the quiescent state 
> > > > > > > immediately.
> > > > > > > If preemption is disabled, then the report is required to be 
> > > > > > > deferred
> > > > > > > until preemption (or bottom halves or interrupts or whatever) 
> > > > > > > is re-enabled.
> > > > > > > 
> > > > > > > Because this is a hint, it does nothing for more complicated 
> > > > > > > cases.  For
> > > > > > > example, if the IPI interrupts an RCU reader, but interrupts 
> > > > > > > are disabled
> > > > > > > across the rcu_read_unlock(), but another rcu_read_lock() is 
> > > > > > > executed
> > > > > > > before interrupts are re-enabled, the hint will already have 
> > > > > > > been cleared.
> > > > > > > If you do crazy things like this, reporting will be deferred 
> > > > > > > until some
> > > > > > > later RCU_SOFTIRQ handler, context switch, cond_resched(), or 
> > > > > > > similar.
> > > > > > > 
> > > > > > > Reported-by: Joel Fernandes 
> > > > > > > Signed-off-by: Paul E. McKenney 
> > > > > > > 
> > > > > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > > > > index 004ca21f7e80..64ce751b5fe9 100644
> > > > > > > --- a/include/linux/sched.h
> > > > > > > +++ b/include/linux/sched.h
> > > > > > > @@ -571,8 +571,10 @@ union rcu_special {
> > > > > > >   struct {
> > > > > > >   u8  blocked;
> > > > > > >   u8  need_qs;
> > > > > > > + u8  exp_hint; /* Hint for 
> > > > > > > performance. */
> > > > > > > + u8  pad; /* No garbage from 
> > > > > > > compiler! */
> > > > > > >   } b; /* Bits. */
> > > > > > > - u16 s; /* Set of bits. */
> > > > > > > + u32 s; /* Set of bits. */
> > > > > > >  };
> > > > > > >  
> > > > > > >  enum perf_event_task_context {
> > > > > > > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> > > > > > > index e669ccf3751b..928fe5893a57 100644
> > > > > > > --- a/kernel/rcu/tree_exp.h
> > > > > > > +++ b/kernel/rcu/tree_exp.h
> > > > > > > @@ -692,8 +692,10 @@ static void sync_rcu_exp_handler(void 
> > > > > > > *unused)
> > > > > > >*/
> > > > > > >   if (t->rcu_read_lock_nesting > 0) {
> > > > > > >   raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > > > > > - if (rnp->expmask & rdp->grpmask)
> > > > > > > + if (rnp->expmask & rdp->grpmask) {
> > > > > > >   rdp->deferred_qs = true;
> > > > > > > + 
> > > > > > > WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);
> > > > > > > + }
> > > > > > >   raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > > > > > >   }
> > > > > > >  
> > > > > > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > > > > > index 8b48bb7c224c..d6286eb6e77e 100644
> > > > > > > --- a/kernel/rcu/tree_plugin.h
> > > > > > > +++ b/kernel/rcu/tree_plugin.h
> > > > > > > @@ -643,8 +643,9 @@ static void rcu_read_unlock_special(struct 
> > > > > > > task_struct *t)
> > > > > > >   local_irq_save(flags);
> > > > > > >   irqs_were_disabled = irqs_disabled_flags(flags);
> > > > > > >   if ((preempt_bh_were_disabled || irqs_were_disabled) &&
> > > > > > > - t->rcu_read_unlock_special.b.blocked) {
> > > > > > > + t->rcu_read_unlock_special.s) {
> > > > > > >   /* Need to defer quiescent state un
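
The switch from testing .b.blocked to testing .s in the quoted diff relies on
the union layout: with four one-byte fields overlaying a 32-bit word, a
single load of .s tells whether any of them is set, and the pad byte keeps
all 32 bits defined. A userspace sketch of that layout, with uint8_t and
uint32_t standing in for the kernel's u8 and u32 and all toy_ names
hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace mirror of the layout in the quoted diff; not the kernel header. */
union toy_rcu_special {
	struct {
		uint8_t blocked;
		uint8_t need_qs;
		uint8_t exp_hint;	/* Hint for performance. */
		uint8_t pad;		/* Keeps all 32 bits of .s defined. */
	} b;
	uint32_t s;			/* All four bytes viewed as one word. */
};

/* True if any field is set, tested with a single 32-bit load. */
bool toy_special_pending(const union toy_rcu_special *sp)
{
	return sp->s != 0;
}
```

Setting only exp_hint makes toy_special_pending() return true, which is why
the deferral check in the patch now also fires for the hint and not just for
a blocked reader.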

[PATCH v5 10/13] arch/x86: Add AMD feature bit X86_FEATURE_MBA in cpuid bits array

2018-10-18 Thread Moger, Babu
From: Sherry Hurwitz 

The feature bit X86_FEATURE_MBA is detected via CPUID leaf 0x80000008
EBX bit 6. This bit indicates support for AMD's MBA feature.

This feature is supported by both Intel and AMD, but it is detected
in different CPUID leaves.

Signed-off-by: Sherry Hurwitz 
Signed-off-by: Babu Moger 
Reviewed-by: Borislav Petkov 
---
 arch/x86/kernel/cpu/scattered.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 772c219b6889..bd7853334b27 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -17,7 +17,11 @@ struct cpuid_bit {
u32 sub_leaf;
 };
 
-/* Please keep the leaf sorted by cpuid_bit.level for faster search. */
+/*
+ * Please keep the leaf sorted by cpuid_bit.level for faster search.
+ * X86_FEATURE_MBA is supported by both Intel and AMD. But the cpuid
+ * levels are different. Add a separate entry for each.
+ */
 static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_APERFMPERF,   CPUID_ECX,  0, 0x00000006, 0 },
{ X86_FEATURE_EPB,  CPUID_ECX,  3, 0x00000006, 0 },
@@ -29,6 +33,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_HW_PSTATE,CPUID_EDX,  7, 0x80000007, 0 },
{ X86_FEATURE_CPB,  CPUID_EDX,  9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK,CPUID_EDX, 11, 0x80000007, 0 },
+   { X86_FEATURE_MBA,  CPUID_EBX,  6, 0x80000008, 0 },
{ X86_FEATURE_SME,  CPUID_EAX,  0, 0x8000001f, 0 },
{ X86_FEATURE_SEV,  CPUID_EAX,  1, 0x8000001f, 0 },
{ 0, 0, 0, 0, 0 }
-- 
2.17.1
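
The idea of listing one feature under two CPUID leaves can be sketched in
userspace as below. Everything prefixed toy_ is hypothetical, the CPUID
instruction is replaced by a callback so the logic can be exercised without
real hardware, and the Intel-style row (leaf 0x10, EBX bit 3) is an
assumption not shown in this patch; the AMD row uses the extended leaf
written in full as 0x80000008.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_reg { TOY_EAX, TOY_EBX, TOY_ECX, TOY_EDX };

/* One row: which register and bit of which CPUID leaf carries a feature. */
struct toy_cpuid_bit {
	int feature;		/* hypothetical feature id; 0 ends the table */
	enum toy_reg reg;
	unsigned int bit;
	uint32_t level;		/* CPUID leaf */
};

#define TOY_FEATURE_MBA 1

const struct toy_cpuid_bit toy_cpuid_bits[] = {
	{ TOY_FEATURE_MBA, TOY_EBX, 3, 0x00000010 },	/* Intel-style row (assumed) */
	{ TOY_FEATURE_MBA, TOY_EBX, 6, 0x80000008 },	/* AMD row from the patch */
	{ 0, TOY_EAX, 0, 0 }
};

/* Callback standing in for the CPUID instruction. */
typedef void (*toy_cpuid_fn)(uint32_t leaf, uint32_t regs[4]);

/* A feature is present if any of its rows finds its bit set. */
bool toy_has_feature(const struct toy_cpuid_bit *tbl, int feature,
		     toy_cpuid_fn cpuid)
{
	for (; tbl->feature; tbl++) {
		uint32_t regs[4] = { 0, 0, 0, 0 };

		if (tbl->feature != feature)
			continue;
		cpuid(tbl->level, regs);
		if (regs[tbl->reg] & (UINT32_C(1) << tbl->bit))
			return true;
	}
	return false;
}

/* Fake CPU that reports only the AMD-style bit. */
void toy_fake_amd_cpuid(uint32_t leaf, uint32_t regs[4])
{
	if (leaf == 0x80000008)
		regs[TOY_EBX] = UINT32_C(1) << 6;
}

/* Fake CPU that reports neither bit. */
void toy_fake_plain_cpuid(uint32_t leaf, uint32_t regs[4])
{
	(void)leaf;
	(void)regs;
}
```

Because each vendor has its own row, the same feature id is reported
regardless of which leaf carried the bit, and a CPU exposing neither bit
reports the feature as absent.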



[PATCH v5 02/13] arch/x86: Rename the RDT functions and definitions

2018-10-18 Thread Moger, Babu
As AMD is starting to support RDT (or QoS) features, rename
the RDT functions and definitions to more generic names.

Replace intel_rdt with resctrl where applicable.

Signed-off-by: Babu Moger 
---
 arch/x86/include/asm/resctrl_sched.h   | 24 
 arch/x86/kernel/cpu/resctrl.c  | 26 +-
 arch/x86/kernel/cpu/resctrl.h  |  2 +-
 arch/x86/kernel/cpu/resctrl_monitor.c  | 11 ++-
 arch/x86/kernel/cpu/resctrl_rdtgroup.c | 10 +-
 arch/x86/kernel/process_32.c   |  2 +-
 arch/x86/kernel/process_64.c   |  2 +-
 7 files changed, 39 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/resctrl_sched.h 
b/arch/x86/include/asm/resctrl_sched.h
index 9acb06b6f81e..6e082697a613 100644
--- a/arch/x86/include/asm/resctrl_sched.h
+++ b/arch/x86/include/asm/resctrl_sched.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_X86_INTEL_RDT_SCHED_H
-#define _ASM_X86_INTEL_RDT_SCHED_H
+#ifndef _ASM_X86_RESCTRL_SCHED_H
+#define _ASM_X86_RESCTRL_SCHED_H
 
 #ifdef CONFIG_INTEL_RDT
 
@@ -10,7 +10,7 @@
 #define IA32_PQR_ASSOC 0x0c8f
 
 /**
- * struct intel_pqr_state - State cache for the PQR MSR
+ * struct resctrl_pqr_state - State cache for the PQR MSR
  * @cur_rmid:  The cached Resource Monitoring ID
  * @cur_closid:The cached Class Of Service ID
  * @default_rmid:  The user assigned Resource Monitoring ID
@@ -24,21 +24,21 @@
  * The cache also helps to avoid pointless updates if the value does
  * not change.
  */
-struct intel_pqr_state {
+struct resctrl_pqr_state {
u32 cur_rmid;
u32 cur_closid;
u32 default_rmid;
u32 default_closid;
 };
 
-DECLARE_PER_CPU(struct intel_pqr_state, pqr_state);
+DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
 
 DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
 
 /*
- * __intel_rdt_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
+ * __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
  *
  * Following considerations are made so that this has minimal impact
  * on scheduler hot path:
@@ -51,9 +51,9 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
  *   simple as possible.
  * Must be called with preemption disabled.
  */
-static void __intel_rdt_sched_in(void)
+static void __resctrl_sched_in(void)
 {
-   struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+   struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
u32 closid = state->default_closid;
u32 rmid = state->default_rmid;
 
@@ -78,16 +78,16 @@ static void __intel_rdt_sched_in(void)
}
 }
 
-static inline void intel_rdt_sched_in(void)
+static inline void resctrl_sched_in(void)
 {
if (static_branch_likely(&rdt_enable_key))
-   __intel_rdt_sched_in();
+   __resctrl_sched_in();
 }
 
 #else
 
-static inline void intel_rdt_sched_in(void) {}
+static inline void resctrl_sched_in(void) {}
 
 #endif /* CONFIG_INTEL_RDT */
 
-#endif /* _ASM_X86_INTEL_RDT_SCHED_H */
+#endif /* _ASM_X86_RESCTRL_SCHED_H */
diff --git a/arch/x86/kernel/cpu/resctrl.c b/arch/x86/kernel/cpu/resctrl.c
index 3968b54902b1..8afc0da6fa59 100644
--- a/arch/x86/kernel/cpu/resctrl.c
+++ b/arch/x86/kernel/cpu/resctrl.c
@@ -40,12 +40,12 @@
 DEFINE_MUTEX(rdtgroup_mutex);
 
 /*
- * The cached intel_pqr_state is strictly per CPU and can never be
+ * The cached resctrl_pqr_state is strictly per CPU and can never be
  * updated from a remote CPU. Functions which modify the state
  * are called with interrupts disabled and no preemption, which
  * is sufficient for the protection.
  */
-DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);
+DEFINE_PER_CPU(struct resctrl_pqr_state, pqr_state);
 
 /*
  * Used to store the max resource name width and max resource data width
@@ -632,7 +632,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource 
*r)
 
 static void clear_closid_rmid(int cpu)
 {
-   struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
+   struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
 
state->default_closid = 0;
state->default_rmid = 0;
@@ -641,7 +641,7 @@ static void clear_closid_rmid(int cpu)
wrmsr(IA32_PQR_ASSOC, 0, 0);
 }
 
-static int intel_rdt_online_cpu(unsigned int cpu)
+static int resctrl_online_cpu(unsigned int cpu)
 {
struct rdt_resource *r;
 
@@ -667,7 +667,7 @@ static void clear_childcpus(struct rdtgroup *r, unsigned 
int cpu)
}
 }
 
-static int intel_rdt_offline_cpu(unsigned int cpu)
+static int resctrl_offline_cpu(unsigned int cpu)
 {
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
@@ -859,7 +859,7 @@ static __init bool get_rdt_resources(void)
 
 static enum cpuhp_state rdt_online;
 
-static int __init intel_rdt_late_init(void)
+static int __i

[PATCH v5 00/13] arch/x86: AMD QoS support

2018-10-18 Thread Moger, Babu
This series adds support for AMD64 architectural extensions for Platform
Quality of Service. These extensions are intended to provide for the
monitoring of the usage of certain system resources by one or more
processors and for the separate allocation and enforcement of limits on
the use of certain system resources by one or more processors.

The monitoring and enforcement are not necessarily applied across the
entire system, but in general apply to a QOS domain which corresponds to
some shared system resource.  The set of resources which are monitored and
the set for which the enforcement of limits is provided are implementation
dependent. Platform QOS features are implemented on a logical processor basis.
Therefore, multiple hardware threads of a single physical CPU core may have
independent resource monitoring and enforcement configurations.

AMD's next generation of processors supports the following QoS sub-features:
- L3 Cache allocation enforcement
- L3 Cache occupancy monitoring
- L3 Code-Data Prioritization support
- Memory Bandwidth Enforcement(Allocation)

The public specification for this feature is available at
https://developer.amd.com/wp-content/resources/56375.pdf

Obviously, there are multiple ways we can go about these changes. We felt
it is appropriate to rename and re-organize the code a little bit before
making the functional changes. The first few patches(1-10) rename and
re-organize the sources in preparation. The rest of the patches(7-11) add
support for AMD QoS features.

Please review and provide me feedback.

Changes from v4 -> v5:
 a. Addressed comments from Fenghua Yu.
 b. The functions update_mba_bw and set_mba_sc are not required for AMD.
Removed all the changes related to these functions.

Changes from v3 -> v4:
https://lore.kernel.org/lkml/20181015205514.25387-1-babu.mo...@amd.com/
 a. Addressed comments from Reinette Chatre and Borislav Petkov.
 b. Removed X86 dependency for CONFIG_AMD_QOS. It is already implicitly
dependent on X86.
 c. Updated the MAINTAINERS file for name changes.
 d. Addressed most of "checkpatch.pl --strict" issues.
 e. Updated Documentation/x86/resctrl_ui.txt(previously
intel_rdt_ui.txt) file with AMD specific details. Changed a few names
to resctrl from intel_rdt.

Changes from v2 -> v3:
 https://lore.kernel.org/lkml/20181011203223.18157-1-babu.mo...@amd.com/
 a. Rebased the patches on top of below branch as suggested by Thomas Gleixner.
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
 b. Addressed comments from Reinette Chatre, Fenghua Yu and Borislav Petkov.
 c. Main changes are related to renaming the files and functions.
Renamed from intel_rdt to more generic resctrl(patches 1 to 3).
 d. Config parameter changed from PLATFORM_QOS to more generic RESCTRL.
 e. Fixed minor indentation issues.

Changes from v1 -> v2:
 https://lore.kernel.org/lkml/20181005205512.29545-1-babu.mo...@amd.com/
 a. Removed RFC from subject header. Based on the discussion so far, 
plan is to go ahead with these patches and eventually re-structure
the code to make arch and non-arch separate.
 b. Addressed comments from Reinette Chatre and Fenghua Yu.
 c. Separated quirks and MBA from rdt init code. Kept the rest of the
code as is.
 d. Added _intel suffixes to all the Intel-only code, just like the AMD code.
 e. Added one more patch to bring the macros into header file.
 f. Few minor text changes.

v1:
 https://lore.kernel.org/lkml/20180924191841.29111-1-babu.mo...@amd.com/


Babu Moger (12):
  arch/x86: Start renaming the rdt files to more generic names
  arch/x86: Rename the RDT functions and definitions
  arch/x86: Re-arrange RDT init code
  arch/x86: Bring all the macros to resctrl.h
  arch/x86: Introduce a new config parameter RESCTRL
  arch/x86: Use new config parameter RESCTRL for compilation
  arch/x86: Initialize the resource functions that are different
  arch/x86: Bring cbm_validate function into the resource structure
  arch/x86: Introduce new config parameter AMD_QOS
  arch/x86: Introduce QOS feature for AMD
  Documentation/x86: Rename and update intel_rdt_ui.txt
  MAINTAINERS: Update the file and documentation names in arch/x86

Sherry Hurwitz (1):
  arch/x86: Add AMD feature bit X86_FEATURE_MBA in cpuid bits array

 .../x86/{intel_rdt_ui.txt => resctrl_ui.txt}  |   9 +-
 MAINTAINERS   |   6 +-
 arch/x86/Kconfig  |  19 ++
 .../{intel_rdt_sched.h => resctrl_sched.h}|  28 +--
 arch/x86/kernel/cpu/Makefile  |   6 +-
 .../x86/kernel/cpu/{intel_rdt.c => resctrl.c} | 168 +++---
 .../x86/kernel/cpu/{intel_rdt.h => resctrl.h} |  37 ++--
 ...dt_ctrlmondata.c => resctrl_ctrlmondata.c} |  80 -
 ...{intel_rdt_monitor.c => resctrl_monitor.c} |  20 +--
 ...dt_pseudo_lock.c => resctrl_pseudo_lock.c} |   6 +-
 ...ck_event.h => resctrl_pseudo_lock_event.h} |   2 +-
 ...ntel_rdt_rdtgroup.c => resctrl_rdtgroup.c} |  14 +-
 arch/x86/kernel/cpu/scatt

[PATCH v5 09/13] arch/x86: Introduce new config parameter AMD_QOS

2018-10-18 Thread Moger, Babu
Introduces the new config parameter AMD_QOS. This parameter will be
used to enable cache and memory bandwidth allocation and monitoring
features on AMD processors. This will enable common config parameter
RESCTRL if selected.

Signed-off-by: Babu Moger 
---
 arch/x86/Kconfig | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 91a703ebdc04..9cd21e536b65 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -458,9 +458,24 @@ config INTEL_RDT
 
  Say N if unsure.
 
+config AMD_QOS
+   bool "AMD Quality of Service support"
+   default n
+   depends on CPU_SUP_AMD
+   select KERNFS
+   help
+ Select to enable cache and memory bandwidth enforcement and monitoring
+ features of AMD processors. These features are intended to provide
+ support for the monitoring of the usage of certain system resources
+ by one or more processors and for the separate allocation and
+ enforcement of limits on the use of certain system resources by one or
+ more processors.
+
+ Say N if unsure.
+
 config RESCTRL
def_bool y
-   depends on X86 && INTEL_RDT
+   depends on X86 && (INTEL_RDT || AMD_QOS)
 
 if X86_32
 config X86_BIGSMP
-- 
2.17.1



[PATCH v5 06/13] arch/x86: Use new config parameter RESCTRL for compilation

2018-10-18 Thread Moger, Babu
Use the newly added config parameter RESCTRL to compile sources.
This is a common parameter across both Intel and AMD.

Signed-off-by: Babu Moger 
---
 arch/x86/include/asm/resctrl_sched.h | 4 ++--
 arch/x86/kernel/cpu/Makefile | 4 ++--
 include/linux/sched.h| 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/resctrl_sched.h b/arch/x86/include/asm/resctrl_sched.h
index 6e082697a613..54990fe2a3ae 100644
--- a/arch/x86/include/asm/resctrl_sched.h
+++ b/arch/x86/include/asm/resctrl_sched.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_X86_RESCTRL_SCHED_H
 #define _ASM_X86_RESCTRL_SCHED_H
 
-#ifdef CONFIG_INTEL_RDT
+#ifdef CONFIG_RESCTRL
 
 #include 
 #include 
@@ -88,6 +88,6 @@ static inline void resctrl_sched_in(void)
 
 static inline void resctrl_sched_in(void) {}
 
-#endif /* CONFIG_INTEL_RDT */
+#endif /* CONFIG_RESCTRL */
 
 #endif /* _ASM_X86_RESCTRL_SCHED_H */
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 222cf8cc078d..79279953c5f9 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -35,8 +35,8 @@ obj-$(CONFIG_CPU_SUP_CENTAUR) += centaur.o
 obj-$(CONFIG_CPU_SUP_TRANSMETA_32) += transmeta.o
 obj-$(CONFIG_CPU_SUP_UMC_32)   += umc.o
 
-obj-$(CONFIG_INTEL_RDT)+= resctrl.o resctrl_rdtgroup.o resctrl_monitor.o
-obj-$(CONFIG_INTEL_RDT)+= resctrl_ctrlmondata.o resctrl_pseudo_lock.o
+obj-$(CONFIG_RESCTRL)  += resctrl.o resctrl_rdtgroup.o resctrl_monitor.o
+obj-$(CONFIG_RESCTRL)  += resctrl_ctrlmondata.o resctrl_pseudo_lock.o
 CFLAGS_resctrl_pseudo_lock.o = -I$(src)
 
 obj-$(CONFIG_X86_MCE)  += mcheck/
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 977cb57d7bc9..c4cf94c447b2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -985,7 +985,7 @@ struct task_struct {
/* cg_list protected by css_set_lock and tsk->alloc_lock: */
struct list_headcg_list;
 #endif
-#ifdef CONFIG_INTEL_RDT
+#ifdef CONFIG_RESCTRL
u32 closid;
u32 rmid;
 #endif
-- 
2.17.1



[PATCH v5 07/13] arch/x86: Initialize the resource functions that are different

2018-10-18 Thread Moger, Babu
Initialize the resource functions that are different between the
vendors. Some features are initialized differently between the vendors.
Add _intel suffix to Intel specific functions.

For example, the MBA feature varies significantly between Intel and AMD.
Separate the initialization of these resource functions. That way we
can easily add AMD's functions later.
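
The approach described above can be sketched in userspace C. This is only an
illustration, not the kernel code: every demo_* name is hypothetical, the
percentage parser is a simplified stand-in for parse_bw_intel(), and the
only values taken from the series are the MBA MSR base (0xd50) and the 100
percent ceiling (MAX_MBA_BW).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the vendor ID and the resource IDs. */
enum demo_vendor { DEMO_VENDOR_INTEL, DEMO_VENDOR_AMD };
enum demo_rid { DEMO_RES_L3, DEMO_RES_MBA, DEMO_NUM_RESOURCES };

/* Reduced, hypothetical version of struct rdt_resource. */
struct demo_mba_res {
	enum demo_rid rid;
	unsigned int msr_base;
	int (*parse_ctrlval)(const char *buf, unsigned long *out);
};

/* The static table no longer carries vendor-specific members. */
static struct demo_mba_res demo_resources[DEMO_NUM_RESOURCES] = {
	[DEMO_RES_L3]  = { .rid = DEMO_RES_L3 },
	[DEMO_RES_MBA] = { .rid = DEMO_RES_MBA },
};

/* Simplified Intel-style parser: a decimal percentage up to 100. */
static int demo_parse_bw_intel(const char *buf, unsigned long *out)
{
	char *end;
	unsigned long val = strtoul(buf, &end, 10);

	if (end == buf || *end != '\0' || val > 100)
		return -1;
	*out = val;
	return 0;
}

/* Fill in the members that differ between vendors at init time. */
static void demo_init_res_defs(enum demo_vendor vendor)
{
	size_t i;

	if (vendor != DEMO_VENDOR_INTEL)
		return;	/* an AMD branch could be added here later */

	for (i = 0; i < DEMO_NUM_RESOURCES; i++) {
		struct demo_mba_res *r = &demo_resources[i];

		if (r->rid == DEMO_RES_MBA) {
			r->msr_base = 0xd50;	/* IA32_MBA_THRTL_BASE */
			r->parse_ctrlval = demo_parse_bw_intel;
		}
	}
}
```

Calling demo_init_res_defs() once during init leaves the MBA entry untouched
on a non-Intel vendor, which is the hook a later patch can extend.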

Signed-off-by: Babu Moger 
---
 arch/x86/kernel/cpu/resctrl.c | 34 +++
 arch/x86/kernel/cpu/resctrl.h |  8 --
 arch/x86/kernel/cpu/resctrl_ctrlmondata.c |  4 +--
 3 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl.c b/arch/x86/kernel/cpu/resctrl.c
index befc4eee0f07..1592c88228f9 100644
--- a/arch/x86/kernel/cpu/resctrl.c
+++ b/arch/x86/kernel/cpu/resctrl.c
@@ -57,7 +57,8 @@ int max_name_width, max_data_width;
 bool rdt_alloc_capable;
 
 static void
-mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
+   struct rdt_resource *r);
 static void
 cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
 
@@ -171,10 +172,7 @@ struct rdt_resource rdt_resources_all[] = {
.rid= RDT_RESOURCE_MBA,
.name   = "MB",
.domains= domain_init(RDT_RESOURCE_MBA),
-   .msr_base   = IA32_MBA_THRTL_BASE,
-   .msr_update = mba_wrmsr,
.cache_level= 3,
-   .parse_ctrlval  = parse_bw,
.format_str = "%d=%*u",
.fflags = RFTYPE_RES_MB,
},
@@ -356,7 +354,8 @@ u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
 }
 
 static void
-mba_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
+   struct rdt_resource *r)
 {
unsigned int i;
 
@@ -868,6 +867,25 @@ static __init bool get_rdt_resources(void)
return (rdt_mon_capable || rdt_alloc_capable);
 }
 
+static __init void rdt_init_res_defs_intel(void)
+{
+   struct rdt_resource *r;
+
+   for_each_rdt_resource(r) {
+   if (r->rid == RDT_RESOURCE_MBA) {
+   r->msr_base = IA32_MBA_THRTL_BASE;
+   r->msr_update = mba_wrmsr_intel;
+   r->parse_ctrlval = parse_bw_intel;
+   }
+   }
+}
+
+static __init void rdt_init_res_defs(void)
+{
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+   rdt_init_res_defs_intel();
+}
+
 static enum cpuhp_state rdt_online;
 
 static int __init resctrl_late_init(void)
@@ -875,6 +893,12 @@ static int __init resctrl_late_init(void)
struct rdt_resource *r;
int state, ret;
 
+   /*
+* Initialize functions(or definitions) that are different
+* between vendors here.
+*/
+   rdt_init_res_defs();
+
/* Run quirks first */
rdt_quirks();
 
diff --git a/arch/x86/kernel/cpu/resctrl.h b/arch/x86/kernel/cpu/resctrl.h
index e5f7bf6a8d09..8731b7c91c28 100644
--- a/arch/x86/kernel/cpu/resctrl.h
+++ b/arch/x86/kernel/cpu/resctrl.h
@@ -444,8 +444,8 @@ struct rdt_resource {
 
 int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
  struct rdt_domain *d);
-int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
-struct rdt_domain *d);
+int parse_bw_intel(struct rdt_parse_data *data, struct rdt_resource *r,
+  struct rdt_domain *d);
 
 extern struct mutex rdtgroup_mutex;
 
@@ -468,6 +468,10 @@ enum {
RDT_NUM_RESOURCES,
 };
 
+#define for_each_rdt_resource(r) \
+   for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
+r++)
+
 #define for_each_capable_rdt_resource(r) \
for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
 r++) \
diff --git a/arch/x86/kernel/cpu/resctrl_ctrlmondata.c b/arch/x86/kernel/cpu/resctrl_ctrlmondata.c
index 0c40a2e0a9b6..1da343b69f6e 100644
--- a/arch/x86/kernel/cpu/resctrl_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl_ctrlmondata.c
@@ -64,8 +64,8 @@ static bool bw_validate(char *buf, unsigned long *data, struct rdt_resource *r)
return true;
 }
 
-int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
-struct rdt_domain *d)
+int parse_bw_intel(struct rdt_parse_data *data, struct rdt_resource *r,
+  struct rdt_domain *d)
 {
unsigned long bw_val;
 
-- 
2.17.1



[PATCH v5 05/13] arch/x86: Introduce a new config parameter RESCTRL

2018-10-18 Thread Moger, Babu
Introduces a new config parameter RESCTRL.

This will be used as a common config parameter for both Intel and AMD.
Each vendor will have its own config parameter to enable the RDT feature:
one for Intel(INTEL_RDT) and one for AMD(AMD_QOS). Each can be enabled or
disabled separately. The new parameter RESCTRL will be dependent on
INTEL_RDT or AMD_QOS.

Signed-off-by: Babu Moger 
---
 arch/x86/Kconfig | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1a0be022f91d..91a703ebdc04 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -458,6 +458,10 @@ config INTEL_RDT
 
  Say N if unsure.
 
+config RESCTRL
+   def_bool y
+   depends on X86 && INTEL_RDT
+
 if X86_32
 config X86_BIGSMP
bool "Support for big SMP systems with more than 8 CPUs"
-- 
2.17.1



[PATCH v5 08/13] arch/x86: Bring cbm_validate function into the resource structure

2018-10-18 Thread Moger, Babu
The idea is to bring all the functions that are different between the
vendors into the resource structure and initialize them dynamically.
Add _intel suffix to Intel specific functions.

The following function is implemented separately for each vendor.
cbm_validate  : Cache bitmask validate function. AMD allows
non-contiguous masks. So, use separate functions for
Intel and AMD.
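
To make the difference concrete, here is a minimal userspace sketch of the
two validation rules behind a per-resource callback. It is an assumption-laden
illustration, not the code in this patch: all demo_* names are hypothetical,
the mask is passed as an already-parsed integer rather than a string buffer,
and min_cbm_bits stands in for the Haswell "at least two bits" rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct rdt_resource. */
struct demo_cbm_res {
	unsigned int cbm_len;		/* number of valid mask bits */
	unsigned int min_cbm_bits;	/* minimum number of set bits */
	bool (*cbm_validate)(uint32_t mask, const struct demo_cbm_res *r);
};

/* True if the mask is non-zero and fits in the low cbm_len bits. */
static bool demo_mask_in_range(uint32_t mask, const struct demo_cbm_res *r)
{
	uint32_t limit = (r->cbm_len >= 32) ? UINT32_MAX
					    : ((UINT32_C(1) << r->cbm_len) - 1);

	return mask != 0 && (mask & ~limit) == 0;
}

static unsigned int demo_popcount(uint32_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)	/* clear the lowest set bit each round */
		n++;
	return n;
}

/* Intel-style rule: the set bits must form one contiguous run. */
static bool demo_cbm_validate_contiguous(uint32_t mask,
					 const struct demo_cbm_res *r)
{
	uint32_t lowest, carry;

	if (!demo_mask_in_range(mask, r))
		return false;

	lowest = mask & (~mask + 1);	/* isolate the lowest set bit */
	carry = mask + lowest;		/* a single run carries out cleanly */
	if ((carry & mask) != 0)
		return false;

	return demo_popcount(mask) >= r->min_cbm_bits;
}

/* AMD-style rule as described above: holes in the mask are allowed. */
static bool demo_cbm_validate_any(uint32_t mask, const struct demo_cbm_res *r)
{
	return demo_mask_in_range(mask, r);
}

/* Caller side: a missing callback means no extra validation. */
static bool demo_check_mask(uint32_t mask, const struct demo_cbm_res *r)
{
	return !r->cbm_validate || r->cbm_validate(mask, r);
}
```

With a 16-bit mask length, 0x0FF0 passes both rules, while 0xF0F0 is
rejected by the contiguous rule and accepted by the permissive one.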

Signed-off-by: Babu Moger 
---
 arch/x86/kernel/cpu/resctrl.c |  9 -
 arch/x86/kernel/cpu/resctrl.h | 11 +++
 arch/x86/kernel/cpu/resctrl_ctrlmondata.c |  4 ++--
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl.c b/arch/x86/kernel/cpu/resctrl.c
index 1592c88228f9..058c9b12a978 100644
--- a/arch/x86/kernel/cpu/resctrl.c
+++ b/arch/x86/kernel/cpu/resctrl.c
@@ -872,7 +872,14 @@ static __init void rdt_init_res_defs_intel(void)
struct rdt_resource *r;
 
for_each_rdt_resource(r) {
-   if (r->rid == RDT_RESOURCE_MBA) {
+   if (r->rid == RDT_RESOURCE_L3 ||
+   r->rid == RDT_RESOURCE_L3DATA ||
+   r->rid == RDT_RESOURCE_L3CODE ||
+   r->rid == RDT_RESOURCE_L2 ||
+   r->rid == RDT_RESOURCE_L2DATA ||
+   r->rid == RDT_RESOURCE_L2CODE)
+   r->cbm_validate = cbm_validate_intel;
+   else if (r->rid == RDT_RESOURCE_MBA) {
r->msr_base = IA32_MBA_THRTL_BASE;
r->msr_update = mba_wrmsr_intel;
r->parse_ctrlval = parse_bw_intel;
diff --git a/arch/x86/kernel/cpu/resctrl.h b/arch/x86/kernel/cpu/resctrl.h
index 8731b7c91c28..102bcffbefd7 100644
--- a/arch/x86/kernel/cpu/resctrl.h
+++ b/arch/x86/kernel/cpu/resctrl.h
@@ -410,10 +410,11 @@ struct rdt_parse_data {
  * @cache: Cache allocation related data
  * @format_str:Per resource format string to show domain value
  * @parse_ctrlval: Per resource function pointer to parse control values
- * @evt_list:  List of monitoring events
- * @num_rmid:  Number of RMIDs available
- * @mon_scale: cqm counter * mon_scale = occupancy in bytes
- * @fflags:flags to choose base and info files
+ * @cbm_validate   Cache bitmask validate function
+ * @evt_list:  List of monitoring events
+ * @num_rmid:  Number of RMIDs available
+ * @mon_scale: cqm counter * mon_scale = occupancy in bytes
+ * @fflags:flags to choose base and info files
  */
 struct rdt_resource {
int rid;
@@ -436,6 +437,7 @@ struct rdt_resource {
int (*parse_ctrlval)(struct rdt_parse_data *data,
 struct rdt_resource *r,
 struct rdt_domain *d);
+   bool (*cbm_validate)(char *buf, u32 *data, struct rdt_resource *r);
struct list_headevt_list;
int num_rmid;
unsigned intmon_scale;
@@ -576,5 +578,6 @@ void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms);
 void cqm_handle_limbo(struct work_struct *work);
 bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
 void __check_limbo(struct rdt_domain *d, bool force_free);
+bool cbm_validate_intel(char *buf, u32 *data, struct rdt_resource *r);
 
 #endif /* _ASM_X86_RESCTRL_H */
diff --git a/arch/x86/kernel/cpu/resctrl_ctrlmondata.c b/arch/x86/kernel/cpu/resctrl_ctrlmondata.c
index 1da343b69f6e..867da06223b5 100644
--- a/arch/x86/kernel/cpu/resctrl_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl_ctrlmondata.c
@@ -88,7 +88,7 @@ int parse_bw_intel(struct rdt_parse_data *data, struct rdt_resource *r,
  * are allowed (e.g. FFFFH, 0FF0H, 003CH, etc.).
  * Additionally Haswell requires at least two bits set.
  */
-static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
+bool cbm_validate_intel(char *buf, u32 *data, struct rdt_resource *r)
 {
unsigned long first_bit, zero_bit, val;
unsigned int cbm_len = r->cache.cbm_len;
@@ -148,7 +148,7 @@ int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
return -EINVAL;
}
 
-   if (!cbm_validate(data->buf, &cbm_val, r))
+   if (r->cbm_validate && !r->cbm_validate(data->buf, &cbm_val, r))
return -EINVAL;
 
if ((rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
-- 
2.17.1



[PATCH v5 03/13] arch/x86: Re-arrange RDT init code

2018-10-18 Thread Moger, Babu
Separate the call sequence for rdt_quirks and MBA feature.
This is in preparation to handle vendor differences in these
call sequences.

Signed-off-by: Babu Moger 
---
 arch/x86/kernel/cpu/resctrl.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl.c b/arch/x86/kernel/cpu/resctrl.c
index 8afc0da6fa59..6c1199f7f28e 100644
--- a/arch/x86/kernel/cpu/resctrl.c
+++ b/arch/x86/kernel/cpu/resctrl.c
@@ -787,6 +787,16 @@ static bool __init rdt_cpu_has(int flag)
return ret;
 }
 
+static __init bool rdt_mba_config(void)
+{
+   if (rdt_cpu_has(X86_FEATURE_MBA)) {
+   if (rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
+   return true;
+   }
+
+   return false;
+}
+
 static __init bool get_rdt_alloc_resources(void)
 {
bool ret = false;
@@ -811,10 +821,9 @@ static __init bool get_rdt_alloc_resources(void)
ret = true;
}
 
-   if (rdt_cpu_has(X86_FEATURE_MBA)) {
-   if (rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
-   ret = true;
-   }
+   if (rdt_mba_config())
+   ret = true;
+
return ret;
 }
 
@@ -833,7 +842,7 @@ static __init bool get_rdt_mon_resources(void)
return !rdt_get_mon_l3_config(&rdt_resources_all[RDT_RESOURCE_L3]);
 }
 
-static __init void rdt_quirks(void)
+static __init void rdt_quirks_intel(void)
 {
switch (boot_cpu_data.x86_model) {
case INTEL_FAM6_HASWELL_X:
@@ -848,9 +857,14 @@ static __init void rdt_quirks(void)
}
 }
 
+static __init void rdt_quirks(void)
+{
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+   rdt_quirks_intel();
+}
+
 static __init bool get_rdt_resources(void)
 {
-   rdt_quirks();
rdt_alloc_capable = get_rdt_alloc_resources();
rdt_mon_capable = get_rdt_mon_resources();
 
@@ -864,6 +878,9 @@ static int __init resctrl_late_init(void)
struct rdt_resource *r;
int state, ret;
 
+   /* Run quirks first */
+   rdt_quirks();
+
if (!get_rdt_resources())
return -ENODEV;
 
-- 
2.17.1



[PATCH v5 11/13] arch/x86: Introduce QOS feature for AMD

2018-10-18 Thread Moger, Babu
Enable the QOS feature on AMD.
The following QoS sub-features are supported on AMD if the underlying
hardware supports them.
 - L3 Cache allocation enforcement
 - L3 Cache occupancy monitoring
 - L3 Code-Data Prioritization support
 - Memory Bandwidth Enforcement(Allocation)

The specification for this feature is available at
https://developer.amd.com/wp-content/resources/56375.pdf

There are differences in the way some of the features are implemented.
Separate those functions and add those as vendor specific functions.
The major difference is in MBA feature.
 - AMD uses CPUID leaf 0x80000020 to initialize the MBA features.
 - AMD uses direct bandwidth value instead of delay based on bandwidth
   values.
 - MSR register base addresses are different for MBA.
 - Also AMD allows non-contiguous L3 cache bit masks.

Add the following functions to take care of the differences.
rdt_get_mem_config_amd : MBA initialization function
parse_bw_amd : Bandwidth parsing
mba_wrmsr_amd: Writes bandwidth value
cbm_validate_amd : L3 cache bitmask validation
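
As a rough illustration of "direct bandwidth value" parsing, the userspace
sketch below accepts a decimal value and range-checks it. It is not the
parse_bw_amd() from this patch: the demo_* names are hypothetical, the
upper bound of 2048 is taken from the "Max value is 2048" comment in
rdt_get_mem_config_amd(), and the lower bound of 0 from min_bw there.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Assumed ceiling, taken from the "Max value is 2048" comment. */
#define DEMO_MAX_MBA_BW_AMD	2048UL

/* Hypothetical, reduced stand-in for the membw part of a resource. */
struct demo_membw {
	unsigned long min_bw;
	unsigned long max_bw;
};

/*
 * Parse a decimal bandwidth value and check it against the limits.
 * The value is used as-is; no bandwidth-to-delay mapping is applied.
 * On failure *data is left unchanged.
 */
static bool demo_bw_validate_amd(const char *buf, unsigned long *data,
				 const struct demo_membw *m)
{
	char *end;
	unsigned long bw;

	errno = 0;
	bw = strtoul(buf, &end, 10);
	if (errno || end == buf || *end != '\0')
		return false;	/* empty, non-numeric or out of range */

	if (bw < m->min_bw || bw > m->max_bw)
		return false;

	*data = bw;
	return true;
}
```

A caller would then store the validated value and write it to the MSR
unchanged, which is what mba_wrmsr_amd() in the diff below does with
d->ctrl_val[i].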

Signed-off-by: Babu Moger 
---
 arch/x86/kernel/cpu/resctrl.c | 69 +-
 arch/x86/kernel/cpu/resctrl.h |  5 ++
 arch/x86/kernel/cpu/resctrl_ctrlmondata.c | 70 +++
 3 files changed, 142 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl.c b/arch/x86/kernel/cpu/resctrl.c
index 058c9b12a978..7a149075ed24 100644
--- a/arch/x86/kernel/cpu/resctrl.c
+++ b/arch/x86/kernel/cpu/resctrl.c
@@ -61,6 +61,9 @@ mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
 static void
 cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r);
+static void
+mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m,
+ struct rdt_resource *r);
 
 #define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].domains)
 
@@ -280,6 +283,31 @@ static bool rdt_get_mem_config(struct rdt_resource *r)
return true;
 }
 
+static bool rdt_get_mem_config_amd(struct rdt_resource *r)
+{
+   union cpuid_0x10_3_eax eax;
+   union cpuid_0x10_x_edx edx;
+   u32 ebx, ecx;
+
+   cpuid_count(0x80000020, 1, &eax.full, &ebx, &ecx, &edx.full);
+   r->num_closid = edx.split.cos_max + 1;
+   r->default_ctrl = MAX_MBA_BW_AMD;
+
+   /* AMD does not use delay. Set delay_linear to false by default */
+   r->membw.delay_linear = false;
+
+   /* FIX ME - May need to be read from MSR */
+   r->membw.min_bw = 0;
+   r->membw.bw_gran = 1;
+   /* Max value is 2048, Data width should be 4 in decimal */
+   r->data_width = 4;
+
+   r->alloc_capable = true;
+   r->alloc_enabled = true;
+
+   return true;
+}
+
 static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
 {
union cpuid_0x10_1_eax eax;
@@ -339,6 +367,16 @@ static int get_cache_id(int cpu, int level)
return -1;
 }
 
+static void
+mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
+{
+   unsigned int i;
+
+   /*  Write the bw values for mba. */
+   for (i = m->low; i < m->high; i++)
+   wrmsrl(r->msr_base + i, d->ctrl_val[i]);
+}
+
 /*
  * Map the memory b/w percentage value to delay values
  * that can be written to QOS_MSRs.
@@ -786,8 +824,13 @@ static bool __init rdt_cpu_has(int flag)
 static __init bool rdt_mba_config(void)
 {
if (rdt_cpu_has(X86_FEATURE_MBA)) {
-   if (rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
-   return true;
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
+   if (rdt_get_mem_config(&rdt_resources_all[RDT_RESOURCE_MBA]))
+   return true;
+   } else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
+   if (rdt_get_mem_config_amd(&rdt_resources_all[RDT_RESOURCE_MBA]))
+   return true;
+   }
}
 
return false;
@@ -887,10 +930,32 @@ static __init void rdt_init_res_defs_intel(void)
}
 }
 
+static __init void rdt_init_res_defs_amd(void)
+{
+   struct rdt_resource *r;
+
+   for_each_rdt_resource(r) {
+   if (r->rid == RDT_RESOURCE_L3 ||
+   r->rid == RDT_RESOURCE_L3DATA ||
+   r->rid == RDT_RESOURCE_L3CODE ||
+   r->rid == RDT_RESOURCE_L2 ||
+   r->rid == RDT_RESOURCE_L2DATA ||
+   r->rid == RDT_RESOURCE_L2CODE)
+   r->cbm_validate = cbm_validate_amd;
+   else if (r->rid == RDT_RESOURCE_MBA) {
+   r->msr_base = IA32_MBA_BW_BASE;
+   r->msr_update = mba_wrmsr_amd;
+   r->parse_ctrlval = parse_bw_amd;
+   }
+   }
+}
+
 static __init void rdt_init_res_defs(void)
 {
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
rdt_init_r

[PATCH v5 04/13] arch/x86: Bring all the macros to resctrl.h

2018-10-18 Thread Moger, Babu
Bring all the macros to resctrl.h and rename for consistency.

Signed-off-by: Babu Moger 
---
 arch/x86/kernel/cpu/resctrl.c | 3 ---
 arch/x86/kernel/cpu/resctrl.h | 5 +
 arch/x86/kernel/cpu/resctrl_monitor.c | 7 ++-
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl.c b/arch/x86/kernel/cpu/resctrl.c
index 6c1199f7f28e..befc4eee0f07 100644
--- a/arch/x86/kernel/cpu/resctrl.c
+++ b/arch/x86/kernel/cpu/resctrl.c
@@ -33,9 +33,6 @@
 #include 
 #include "resctrl.h"
 
-#define MBA_IS_LINEAR  0x4
-#define MBA_MAX_MBPS   U32_MAX
-
 /* Mutex to protect rdtgroup access. */
 DEFINE_MUTEX(rdtgroup_mutex);
 
diff --git a/arch/x86/kernel/cpu/resctrl.h b/arch/x86/kernel/cpu/resctrl.h
index abf5c7e4c625..e5f7bf6a8d09 100644
--- a/arch/x86/kernel/cpu/resctrl.h
+++ b/arch/x86/kernel/cpu/resctrl.h
@@ -12,6 +12,9 @@
 #define IA32_L2_CBM_BASE   0xd10
 #define IA32_MBA_THRTL_BASE0xd50
 
+#define IA32_QM_CTR0x0c8e
+#define IA32_QM_EVTSEL 0x0c8d
+
 #define L3_QOS_CDP_ENABLE  0x01ULL
 
 #define L2_QOS_CDP_ENABLE  0x01ULL
@@ -29,6 +32,8 @@
 #define MBM_CNTR_WIDTH 24
 #define MBM_OVERFLOW_INTERVAL  1000
 #define MAX_MBA_BW 100u
+#define MBA_IS_LINEAR  0x4
+#define MBA_MAX_MBPS   U32_MAX
 
 #define RMID_VAL_ERROR BIT_ULL(63)
 #define RMID_VAL_UNAVAIL   BIT_ULL(62)
diff --git a/arch/x86/kernel/cpu/resctrl_monitor.c b/arch/x86/kernel/cpu/resctrl_monitor.c
index 68dbdbbf47df..ad0107bc16a0 100644
--- a/arch/x86/kernel/cpu/resctrl_monitor.c
+++ b/arch/x86/kernel/cpu/resctrl_monitor.c
@@ -28,9 +28,6 @@
 #include 
 #include "resctrl.h"
 
-#define MSR_IA32_QM_CTR0x0c8e
-#define MSR_IA32_QM_EVTSEL 0x0c8d
-
 struct rmid_entry {
u32 rmid;
int busy;
@@ -97,8 +94,8 @@ static u64 __rmid_read(u32 rmid, u32 eventid)
 * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
 * are error bits.
 */
-   wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
-   rdmsrl(MSR_IA32_QM_CTR, val);
+   wrmsr(IA32_QM_EVTSEL, eventid, rmid);
+   rdmsrl(IA32_QM_CTR, val);
 
return val;
 }
-- 
2.17.1



[PATCH v5 13/13] MAINTAINERS: Update the file and documentation names in arch/x86

2018-10-18 Thread Moger, Babu
Update the MAINTAINERS to reflect the changed file(and documentation)
names in arch/x86/kernel/cpu. The file names have changed from
intel_rdt* to resctrl*.

Signed-off-by: Babu Moger 
---
 MAINTAINERS | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 48a65c3a4189..7643dba289c6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12267,9 +12267,9 @@ M:  Fenghua Yu 
 M: Reinette Chatre 
 L: linux-ker...@vger.kernel.org
 S: Supported
-F: arch/x86/kernel/cpu/intel_rdt*
-F: arch/x86/include/asm/intel_rdt_sched.h
-F: Documentation/x86/intel_rdt*
+F: arch/x86/kernel/cpu/resctrl*
+F: arch/x86/include/asm/resctrl_sched.h
+F: Documentation/x86/resctrl*
 
 READ-COPY UPDATE (RCU)
 M: "Paul E. McKenney" 
-- 
2.17.1



[PATCH v5 12/13] Documentation/x86: Rename and update intel_rdt_ui.txt

2018-10-18 Thread Moger, Babu
Rename intel_rdt_ui.txt to generic resctrl_ui.txt and update the
documentation for AMD.

Signed-off-by: Babu Moger 
---
 Documentation/x86/{intel_rdt_ui.txt => resctrl_ui.txt} | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)
 rename Documentation/x86/{intel_rdt_ui.txt => resctrl_ui.txt} (99%)

diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/resctrl_ui.txt
similarity index 99%
rename from Documentation/x86/intel_rdt_ui.txt
rename to Documentation/x86/resctrl_ui.txt
index 52b10945ff75..c4e2349482b8 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/resctrl_ui.txt
@@ -1,4 +1,7 @@
-User Interface for Resource Allocation in Intel Resource Director Technology
+User Interface for RESCTRL feature
+
+Intel refers to this feature as Intel Resource Director Technology(Intel(R) RDT).
+AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
 
 Copyright (C) 2016 Intel Corporation
 
@@ -6,8 +9,8 @@ Fenghua Yu 
 Tony Luck 
 Vikas Shivappa 
 
-This feature is enabled by the CONFIG_INTEL_RDT Kconfig and the
-X86 /proc/cpuinfo flag bits:
+This feature is enabled by the CONFIG_INTEL_RDT Kconfig(for Intel) or
+CONFIG_AMD_QOS(for AMD) and the X86 /proc/cpuinfo flag bits:
 RDT (Resource Director Technology) Allocation - "rdt_a"
 CAT (Cache Allocation Technology) - "cat_l3", "cat_l2"
 CDP (Code and Data Prioritization ) - "cdp_l3", "cdp_l2"
-- 
2.17.1



[PATCH v5 01/13] arch/x86: Start renaming the rdt files to more generic names

2018-10-18 Thread Moger, Babu
The new generation of AMD processors starts supporting RDT(or QOS) features.
With more than one vendor supporting these features, it seems more
appropriate to rename these files.

Changed intel_rdt to resctrl where applicable.

Signed-off-by: Babu Moger 
---
 arch/x86/include/asm/{intel_rdt_sched.h => resctrl_sched.h} | 0
 arch/x86/kernel/cpu/Makefile| 6 +++---
 arch/x86/kernel/cpu/{intel_rdt.c => resctrl.c}  | 4 ++--
 arch/x86/kernel/cpu/{intel_rdt.h => resctrl.h}  | 6 +++---
 .../cpu/{intel_rdt_ctrlmondata.c => resctrl_ctrlmondata.c}  | 2 +-
 .../kernel/cpu/{intel_rdt_monitor.c => resctrl_monitor.c}   | 2 +-
 .../cpu/{intel_rdt_pseudo_lock.c => resctrl_pseudo_lock.c}  | 6 +++---
 ..._rdt_pseudo_lock_event.h => resctrl_pseudo_lock_event.h} | 2 +-
 .../kernel/cpu/{intel_rdt_rdtgroup.c => resctrl_rdtgroup.c} | 4 ++--
 arch/x86/kernel/process_32.c| 2 +-
 arch/x86/kernel/process_64.c| 2 +-
 11 files changed, 18 insertions(+), 18 deletions(-)
 rename arch/x86/include/asm/{intel_rdt_sched.h => resctrl_sched.h} (100%)
 rename arch/x86/kernel/cpu/{intel_rdt.c => resctrl.c} (99%)
 rename arch/x86/kernel/cpu/{intel_rdt.h => resctrl.h} (99%)
 rename arch/x86/kernel/cpu/{intel_rdt_ctrlmondata.c => resctrl_ctrlmondata.c} (99%)
 rename arch/x86/kernel/cpu/{intel_rdt_monitor.c => resctrl_monitor.c} (99%)
 rename arch/x86/kernel/cpu/{intel_rdt_pseudo_lock.c => resctrl_pseudo_lock.c} (99%)
 rename arch/x86/kernel/cpu/{intel_rdt_pseudo_lock_event.h => resctrl_pseudo_lock_event.h} (95%)
 rename arch/x86/kernel/cpu/{intel_rdt_rdtgroup.c => resctrl_rdtgroup.c} (99%)

diff --git a/arch/x86/include/asm/intel_rdt_sched.h b/arch/x86/include/asm/resctrl_sched.h
similarity index 100%
rename from arch/x86/include/asm/intel_rdt_sched.h
rename to arch/x86/include/asm/resctrl_sched.h
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 347137e80bf5..222cf8cc078d 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -35,9 +35,9 @@ obj-$(CONFIG_CPU_SUP_CENTAUR) += centaur.o
 obj-$(CONFIG_CPU_SUP_TRANSMETA_32) += transmeta.o
 obj-$(CONFIG_CPU_SUP_UMC_32)   += umc.o
 
-obj-$(CONFIG_INTEL_RDT)+= intel_rdt.o intel_rdt_rdtgroup.o intel_rdt_monitor.o
-obj-$(CONFIG_INTEL_RDT)+= intel_rdt_ctrlmondata.o intel_rdt_pseudo_lock.o
-CFLAGS_intel_rdt_pseudo_lock.o = -I$(src)
+obj-$(CONFIG_INTEL_RDT)+= resctrl.o resctrl_rdtgroup.o resctrl_monitor.o
+obj-$(CONFIG_INTEL_RDT)+= resctrl_ctrlmondata.o resctrl_pseudo_lock.o
+CFLAGS_resctrl_pseudo_lock.o = -I$(src)
 
 obj-$(CONFIG_X86_MCE)  += mcheck/
 obj-$(CONFIG_MTRR) += mtrr/
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/resctrl.c
similarity index 99%
rename from arch/x86/kernel/cpu/intel_rdt.c
rename to arch/x86/kernel/cpu/resctrl.c
index 1214f3f7ec6d..3968b54902b1 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/resctrl.c
@@ -30,8 +30,8 @@
 #include 
 
 #include 
-#include 
-#include "intel_rdt.h"
+#include 
+#include "resctrl.h"
 
 #define MBA_IS_LINEAR  0x4
 #define MBA_MAX_MBPS   U32_MAX
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/resctrl.h
similarity index 99%
rename from arch/x86/kernel/cpu/intel_rdt.h
rename to arch/x86/kernel/cpu/resctrl.h
index 3736f6dc9545..a9d906767bb2 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/resctrl.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_X86_INTEL_RDT_H
-#define _ASM_X86_INTEL_RDT_H
+#ifndef _ASM_X86_RESCTRL_H
+#define _ASM_X86_RESCTRL_H
 
 #include 
 #include 
@@ -568,4 +568,4 @@ void cqm_handle_limbo(struct work_struct *work);
 bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
 void __check_limbo(struct rdt_domain *d, bool force_free);
 
-#endif /* _ASM_X86_INTEL_RDT_H */
+#endif /* _ASM_X86_RESCTRL_H */
diff --git a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c b/arch/x86/kernel/cpu/resctrl_ctrlmondata.c
similarity index 99%
rename from arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
rename to arch/x86/kernel/cpu/resctrl_ctrlmondata.c
index 0f53049719cd..0c40a2e0a9b6 100644
--- a/arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl_ctrlmondata.c
@@ -26,7 +26,7 @@
 #include 
 #include 
 #include 
-#include "intel_rdt.h"
+#include "resctrl.h"
 
 /*
  * Check whether MBA bandwidth percentage value is correct. The value is
diff --git a/arch/x86/kernel/cpu/intel_rdt_monitor.c b/arch/x86/kernel/cpu/resctrl_monitor.c
similarity index 99%
rename from arch/x86/kernel/cpu/intel_rdt_monitor.c
rename to arch/x86/kernel/cpu/resctrl_monitor.c
index b0f3aed76b75..211d97bcbde5 100644
--- a/arch/x86/kernel/cpu/intel_rdt_monitor.c
+++ b/arch/x86/kernel/cpu/resctrl_monitor.c
@@ -26,7 +26,7 @@
 #include 
 #include 
 #include 
-#include "intel_rdt.h"
+#include "resctrl.h"
 

Re: [PATCH 3/4] dt-bindings: iommu/arm, smmu: add compatible string for Marvell

2018-10-18 Thread Rob Herring
On Mon, Oct 15, 2018 at 02:11:52PM +0100, Robin Murphy wrote:
> On 15/10/18 13:00, han...@marvell.com wrote:
> > From: Hanna Hawa 
> > 
> > Add a specific compatible string for Marvell usage, due to an erratum
> > in accessing 64-bit registers of the ARM SMMU in the AP806.
> > 
> > The AP806 SoC uses the generic ARM MMU-500, and there is no
> > Marvell-specific implementation; this compatible is used for the
> > erratum only.
> 
> Given that, I think something more specific like:
> 
>   "marvell,ap806-smmu", "arm,mmu-500";
> 
> would be most appropriate. Otherwise, if some future Marvell SoC were to
> ever come out with a *different* MMU-500 integration problem, you'd already
> have painted yourself into a corner.
> 
> Alternatively (or additionally), we could perhaps consider a separate
> property like "marvell,32bit-config-access", to mirror the existing handling
> of the secure integration bug.

The former please. We have learned our lesson there (though for some 
reason, that was the *only* SMMU problem in Calxeda Midway ;) ).

Rob


[PATCH] doc: fix a typo in adding-syscalls.rst

2018-10-18 Thread corwin
From: Guillaume Dore 

There was a typo in adding-syscalls.rst that could mislead developers
to add a C filename in a makefile instead of an object filename.
This error, while not keeping developers from contributing could slow
the development process down by introducing build errors.

Signed-off-by: Guillaume Dore 
---
 Documentation/process/adding-syscalls.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
index 0d4f29bc798b..88a7d5c8bb2f 100644
--- a/Documentation/process/adding-syscalls.rst
+++ b/Documentation/process/adding-syscalls.rst
@@ -232,7 +232,7 @@ normally be optional, so add a ``CONFIG`` option (typically to
by the option.
  - Make the option depend on EXPERT if it should be hidden from normal users.
  - Make any new source files implementing the function dependent on the CONFIG
-   option in the Makefile (e.g. ``obj-$(CONFIG_XYZZY_SYSCALL) += xyzzy.c``).
+   option in the Makefile (e.g. ``obj-$(CONFIG_XYZZY_SYSCALL) += xyzzy.o``).
  - Double check that the kernel still builds with the new CONFIG option turned
off.
 
-- 
2.19.1



Re: dm: add secdel target

2018-10-18 Thread Mike Snitzer
On Sun, Oct 14 2018 at  7:24am -0400,
Vitaly Chikunov  wrote:

> Report to the upper level ability to discard, and translate arriving
> discards to the writes of random or zero data to the underlying level.
> 
> Signed-off-by: Vitaly Chikunov 
> ---
>   This target is the same as the linear target except that it reports
>   the ability to discard to the upper level and translates arriving
>   discards into sector overwrites with random (or zero) data.

There is a fair amount of code duplication between dm-linear.c and this
new target.

Something needs to give, ideally you'd factor out methods that are
shared by both targets, but those methods must _not_ introduce overhead
to dm-linear.

Could be that dm-linear methods just get called by the wrapper
dm-sec-erase target (more on the "dm-sec-erase" name below).
 
>   The target does not try to determine whether the underlying drive
>   reliably supports data overwrites; this decision is solely at the
>   discretion of the user.
> 
>   It may be useful for building a secure deletion setup: when a
>   filesystem unlinks a file and sends discards for its sectors, this
>   target translates them into writes that wipe the deleted data on the
>   underlying drive.
> 
>   Tested on x86.

All of this extra context and explanation needs to be captured in the
actual patch header.  Not as a tangent in that "cut" section of your
patch header.
 
>  Documentation/device-mapper/dm-secdel.txt |  24 ++
>  drivers/md/Kconfig                        |  14 ++
>  drivers/md/Makefile                       |   2 +
>  drivers/md/dm-secdel.c                    | 399 ++
>  4 files changed, 439 insertions(+)
>  create mode 100644 Documentation/device-mapper/dm-secdel.txt
>  create mode 100644 drivers/md/dm-secdel.c



Shouldn't this target be implementing all that is needed for
REQ_OP_SECURE_ERASE support?  And the resulting DM device would
advertise its capability using QUEUE_FLAG_SECERASE?

And this is why I think the target should be named "dm-sec-erase" or
even "dm-secure-erase".

> diff --git a/drivers/md/dm-secdel.c b/drivers/md/dm-secdel.c
> new file mode 100644
> index 000000000000..9aeaf3f243c0
> --- /dev/null
> +++ b/drivers/md/dm-secdel.c



> +/*
> + * Send amount of masking data to the device
> + * @mode: 0 to write zeros, otherwise to write random data
> + */
> +static int issue_erase(struct block_device *bdev, sector_t sector,
> +sector_t nr_sects, gfp_t gfp_mask, enum secdel_mode mode)
> +{
> + int ret = 0;
> +
> + while (nr_sects) {
> + struct bio *bio;
> + unsigned int nrvecs = min(nr_sects,
> +   (sector_t)BIO_MAX_PAGES >> 3);
> +
> + bio = bio_alloc(gfp_mask, nrvecs);

You should probably be using your own bioset to allocate these bios.

> + if (!bio) {
> + DMERR("%s %lu[%lu]: no memory to allocate bio (%u)",
> +   __func__, sector, nr_sects, nrvecs);
> + ret = -ENOMEM;
> + break;
> + }
> +
> + bio->bi_iter.bi_sector = sector;
> + bio_set_dev(bio, bdev);
> + bio->bi_end_io = bio_end_erase;
> +
> + while (nr_sects != 0) {
> + unsigned int sn;
> + struct page *page = NULL;
> +
> + sn = min((sector_t)PAGE_SIZE >> 9, nr_sects);
> + if (mode == SECDEL_MODE_RAND) {
> + page = alloc_page(gfp_mask);
> + if (!page) {
> + DMERR("%s %lu[%lu]: no memory to allocate page for random data",
> +   __func__, sector, nr_sects);
> + /* will fallback to zero filling */

In general, performing memory allocations to service IO is something all
DM core and DM targets must work to avoid.  This smells bad.

...

> +
> +/* convert discards into writes */
> +static int secdel_map_discard(struct dm_target *ti, struct bio *sbio)
> +{
> + struct secdel_c *lc = ti->private;
> + struct block_device *bdev = lc->dev->bdev;
> + sector_t sector = sbio->bi_iter.bi_sector;
> + sector_t nr_sects = bio_sectors(sbio);
> +
> + lc->requests++;
> + if (!bio_sectors(sbio))
> + return 0;
> + if (!op_discard(sbio))
> + return 0;
> + lc->discards++;
> + if (WARN_ON(sbio->bi_vcnt != 0))
> + return -1;
> + DMDEBUG("DISCARD %lu: %u sectors M%d", sbio->bi_iter.bi_sector,
> + bio_sectors(sbio), lc->mode);
> + bio_endio(sbio);
> +
> + if (issue_erase(bdev, sector, nr_sects, GFP_NOFS, lc->mode))

At a minimum this should be GFP_NOIO.  You don't want to recurse into
block (and potentially yourself) in the face of low memory.
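
The two review points above (use a dedicated bioset, and avoid allocating memory while servicing IO) come down to reserving resources up front. The following is only a user-space sketch of that idea, not dm or mempool code: `buf_pool`, `pool_init()`, `pool_get()` and `pool_put()` are hypothetical names, and the pool size is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BUFS 4      /* hypothetical reserve size */
#define BUF_SIZE  4096   /* one page worth of fill data */

/* All buffers are reserved when the pool object is created. */
struct buf_pool {
	unsigned char bufs[POOL_BUFS][BUF_SIZE];
	int in_use[POOL_BUFS];
};

/* Setup time: the only place where running out of memory may be reported. */
void pool_init(struct buf_pool *p)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < POOL_BUFS; i++)
		p->in_use[i] = 0;
}

/*
 * IO path: never allocates. Returns NULL when the reserve is exhausted,
 * in which case the caller has to wait for pool_put() and retry.
 */
unsigned char *pool_get(struct buf_pool *p)
{
	size_t i;

	for (i = 0; i < POOL_BUFS; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = 1;
			return p->bufs[i];
		}
	}
	return NULL;
}

/* Completion path: hand the buffer back to the reserve. */
void pool_put(struct buf_pool *p, unsigned char *buf)
{
	size_t i;

	for (i = 0; i < POOL_BUFS; i++) {
		if (p->bufs[i] == buf)
			p->in_use[i] = 0;
	}
}
```

The point of the design is that the IO path can always make forward progress with what was reserved at constructor time, instead of depending on an allocation that may itself need IO to succeed.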

> +static int secdel_end_io(struct dm_target *ti, struct bio *bio,
> + 

Re: [PATCH v5 03/27] x86/fpu/xstate: Introduce XSAVES system states

2018-10-18 Thread Randy Dunlap
On 10/18/18 2:26 AM, Borislav Petkov wrote:
> On Wed, Oct 17, 2018 at 04:17:01PM -0700, Randy Dunlap wrote:
>> I asked what I really wanted to know.
> 
> Then the answer is a bit better readability, I'd guess.
> 

Thanks for the reply.

-- 
~Randy


Re: [PATCH v2] docs/uio: fix a grammar nitpick

2018-10-18 Thread Jonathan Corbet
On Tue, 16 Oct 2018 08:57:44 +0000 (UTC)
Will Korteland  wrote:

> This patch fixes a minor, incorrect piece of grammar in the UIO howto.
> 
> Signed-off-by: Will Korteland 
> Acked-by: Randy Dunlap 
> ---
> The sole change since v1 is that I re-did the patch against
> linux-next-20181016 instead of linux. Sorry for the extra work Greg, and
> thanks Randy for the acked-by.

Unfortunately, this patch is badly white-space mangled.  I tried fixing
it up but eventually had to give up and move on.

Please fix up your email client so that this doesn't happen; see
Documentation/process/email-clients.rst for some helpful suggestions
to that end.  Once you can email the patch to yourself and apply the
result successfully, please resubmit it.

Thanks,

jon
> 
>   Documentation/driver-api/uio-howto.rst | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/driver-api/uio-howto.rst 
> b/Documentation/driver-api/uio-howto.rst
> index fb2eb73be4a3..25f50eace28b 100644
> --- a/Documentation/driver-api/uio-howto.rst
> +++ b/Documentation/driver-api/uio-howto.rst
> @@ -463,8 +463,8 @@ Getting information about your UIO device
> 
>   Information about all UIO devices is available in sysfs. The first 
> thing
>   you should do in your driver is check ``name`` and ``version`` to make
> -sure your talking to the right device and that its kernel driver has 
> the
> -version you expect.
> +sure you're talking to the right device and that its kernel driver has
> +the version you expect.
> 
>   You should also make sure that the memory mapping you need exists and
>   has the size you expect.


Re: [PATCH v2] docs: Introduce deprecated APIs list

2018-10-18 Thread Jonathan Corbet
On Wed, 17 Oct 2018 16:45:32 -0700
Kees Cook  wrote:

> As discussed in the "API replacement/deprecation" thread[1], this makes
> an effort to document what things shouldn't get (re)added to the kernel,
> by introducing Documentation/process/deprecated.rst.
> 
> [1] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2018-September/005282.html
> 
> Signed-off-by: Kees Cook 

Applied, thanks.

jon


Re: [PATCH] kernel-doc: fix declaration type determination

2018-10-18 Thread Jonathan Corbet
On Wed, 17 Oct 2018 21:07:27 -0700
Randy Dunlap  wrote:

> From: Randy Dunlap 
> 
> Make declaration type determination more robust.
> 
> When scripts/kernel-doc is deciding if some kernel-doc notation
> contains an enum, a struct, a union, a typedef, or a function,
> it does a pattern match on the beginning of the string, looking
> for a match with one of "struct", "union", "enum", or "typedef",
> and otherwise defaults to a function declaration type.
> However, if a function or a function-like macro has a name that
> begins with "struct" (e.g., struct_size()), then kernel-doc
> incorrectly decides that this is a struct declaration.
> 
> Fix this by looking for the declaration type keywords having an
> ending word boundary (\b), so that "struct_size" will not match
> a struct declaration.
> 
> I compared lots of html before/after output from core-api, driver-api,
> and networking.  There were no differences in any of the files that
> I checked.
> 
> Signed-off-by: Randy Dunlap 

Applied, thanks.

jon
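
The fix described in the commit message above can be illustrated outside of the script itself. scripts/kernel-doc is written in Perl; the following is only a C analogue of the matching rule, a sketch in which `classify_decl()` and `enum decl_type` are hypothetical names: a leading keyword counts only if it ends at a word boundary.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Declaration kinds named in the commit message. */
enum decl_type {
	DECL_FUNCTION, DECL_STRUCT, DECL_UNION, DECL_ENUM, DECL_TYPEDEF
};

/* A character that can continue a C identifier, i.e. no word boundary. */
static int is_word_char(char c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/*
 * Classify a declaration by its leading keyword. The keyword must be
 * followed by a non-identifier character (or the end of the string),
 * which is the effect of the trailing \b in the regex.
 */
enum decl_type classify_decl(const char *s)
{
	static const struct {
		const char *kw;
		enum decl_type type;
	} table[] = {
		{ "struct", DECL_STRUCT },
		{ "union", DECL_UNION },
		{ "enum", DECL_ENUM },
		{ "typedef", DECL_TYPEDEF },
	};
	size_t i;

	assert(s != NULL);
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		size_t len = strlen(table[i].kw);

		if (strncmp(s, table[i].kw, len) == 0 && !is_word_char(s[len]))
			return table[i].type;
	}
	/* Anything else defaults to a function declaration. */
	return DECL_FUNCTION;
}
```

With the boundary check, `struct foo {` is still a struct, while `struct_size(p, member, n)` falls through to the function default.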


Re: [PATCH] doc: fix a typo in adding-syscalls.rst

2018-10-18 Thread Jonathan Corbet
On Thu, 18 Oct 2018 17:47:50 +0200
cor...@poussif.eu wrote:

> There was a typo in adding-syscalls.rst that could mislead developers
> to add a C filename in a makefile instead of an object filename.
> This error, while not keeping developers from contributing could slow
> the development process down by introducing build errors.
> 
> Signed-off-by: Guillaume Dore 

Applied, thanks.

jon


Re: [PATCH v7 0/8] arm64: untag user pointers passed to the kernel

2018-10-18 Thread Catalin Marinas
On Wed, Oct 17, 2018 at 01:25:42PM -0700, Evgenii Stepanov wrote:
> On Wed, Oct 17, 2018 at 7:20 AM, Andrey Konovalov wrote:
> > On Wed, Oct 17, 2018 at 4:06 PM, Vincenzo Frascino wrote:
> >> I have been thinking a bit lately on how to address the problem of
> >> user tagged pointers passed to the kernel through syscalls, and
> >> IMHO probably the best way we have to catch them all and make sure
> >> that the approach is maintainable in the long term is to introduce
> >> shims that tag/untag the pointers passed to the kernel.
> >>
> >> In details, what I am proposing can live either in userspace
> >> (preferred solution so that we do not have to relax the ABI) or in
> >> kernel space and can be summarized as follows:
> >>  - A shim is specific to a syscall and is called by the libc when
> >>  it needs to invoke the respective syscall.
> >>  - It is required only if the syscall accepts pointers.
> >>  - It saves the tags of the pointers passed to the syscall in memory
> >>  (same approach if we are passing a struct that contains
> >>  pointers to the kernel, with the difference that all the tags of
> >>  the pointers in the struct need to be saved individually)
> >>  - Untags the pointers
> >>  - Invokes the syscall
> >>  - Retags the pointers with the tags stored in memory
> >>  - Returns
> >>
> >> What do you think?
> >
> > If I correctly understand what you are proposing, I'm not sure if that
> > would work with the countless number of different ioctl calls. For
> > example when an ioctl accepts a struct with a bunch of pointer fields.
> > In this case a shim like the one you propose can't live in userspace,
> > since libc doesn't know about the interface of all ioctls, so it can't
> > know which fields to untag. The kernel knows about those interfaces
> > (since the kernel implements them), but then we would need a custom
> > shim for each ioctl variation, which doesn't seem practical.
> 
> The current patchset handles the majority of pointers in just a few
> common places, like copy_from_user. Userspace shims will need to untag
> & retag all pointer arguments - we are looking at hundreds if not
> thousands of shims. They will also be located in a different code base
> from the syscall / ioctl implementations, which would make them
> impossible to keep up to date.

I think ioctls are a good reason not to attempt such a user-space shim
layer (though it would have been much easier for the kernel ;)).

-- 
Catalin


Re: [PATCH v7 7/8] arm64: update Documentation/arm64/tagged-pointers.txt

2018-10-18 Thread Catalin Marinas
On Wed, Oct 10, 2018 at 04:09:25PM +0200, Andrey Konovalov wrote:
> On Wed, Oct 3, 2018 at 7:32 PM, Catalin Marinas wrote:
> > On Tue, Oct 02, 2018 at 03:12:42PM +0200, Andrey Konovalov wrote:
[...]
> > Also, how is user space supposed to know that it can now pass tagged
> > pointers into the kernel? An ABI change (or relaxation), needs to be
> > advertised by the kernel, usually via a new HWCAP bit (e.g. HWCAP_TBI).
> > Once we have a HWCAP bit in place, we need to be pretty clear about
> > which syscalls can and cannot cope with tagged pointers. The "as of now"
> > implies potential further relaxation which, again, would need to be
> > advertised to user in some (additional) way.
> 
> How exactly should I do that? Something like this [1]? Or is it only
> for hardware specific things and for this patchset I need to do
> something else?
> 
> [1] 
> https://github.com/torvalds/linux/commit/7206dc93a58fb76421c4411eefa3c003337bcb2d

Thinking some more on this, we should probably keep the HWCAP_* bits for
actual hardware features. Maybe someone else has a better idea (the
linux-abi list?). An option would be to make use of AT_FLAGS auxv
(currently 0) in Linux. I've seen some MIPS patches in the past but
nothing upstream.

Yet another option would be for the user to probe on some innocuous
syscall currently returning -EFAULT on tagged pointer arguments but I
don't particularly like this.

> >> - - pointer arguments to system calls, including pointers in structures
> >> -   passed to system calls,
> >> +  - pointer arguments (including pointers in structures), which don't
> >> +describe virtual memory ranges, passed to system calls
> >
> > I think we need to be more precise here...
> 
> In what way?

In the way of being explicit about which syscalls support tagged
pointers, unless we find a good reason to support tagged pointers on all
syscalls and avoid any lists.

-- 
Catalin


Re: [PATCH 2/4] iommu/arm-smmu: Workaround for Marvell Armada-AP806 SoC erratum #582743

2018-10-18 Thread Robin Murphy

On 16/10/18 09:25, Hanna Hawa wrote:
> Hi Robin,
>
> On 10/15/2018 04:00 PM, Robin Murphy wrote:
>> Hi Hanna,
>>
>> On 15/10/18 13:00, han...@marvell.com wrote:
>>> From: Hanna Hawa
>>>
>>> Due to erratum #582743, the Marvell Armada-AP806 can't access 64bit
>>> to ARM SMMUv2 registers.
>>> This patch split the writeq/readq to two accesses of writel/readl.
>>>
>>> Note that separate writes/reads to 2 is not problem regards to
>>> atomicity, because the driver use the readq/writeq while initialize
>>> the SMMU, report for SMMU fault, and use spinlock in one case
>>> (iova_to_phys).
>>
>> In general, this doesn't work. Here's what the SMMU spec says about
>> SMMU_CBn_TLBIVA, but others are similar:
>>
>> "If SMMU_CBA2Rn.VA64 is one, then AArch64 format is selected. The
>> programmer should use 64 bit accesses to this register. If 32-bit
>> accesses are used then writes to the top 32 bits are ignored and writes
>> to the lower 32 bits are zero extended."
>>
>> If your interconnect won't let 64-bit transactions through, then you
>> can't use AArch64 format at stage 1 at all, since there's no way to
>> invalidate entries with the correct ASID, and you'll have to restrict
>> stage 2 formats to at most 44-bit IOVAs in order for TLBIIPAS2{L} not to
>> invalidate the wrong thing.
>
> Thanks for your suggestion.
>
> To restrict the IOVAs i need to add another work-around to the driver to
> limit the va_size, is that acceptable?

Yeah, constraining AArch64 stage 2 to 44 bits should just be a case of 
adjusting smmu->ipa_size at probe time, but you'd still need to add the 
writel()-based TLBI path to take advantage of it.

How big is the physical memory map on these SoCs? If everything fits 
into 40 bits then I think you could get away with simply hiding the 
SMMU_IDR2.PTFSv8 fields to sidestep the AArch64 formats altogether, and 
everything else should fall out in the wash. Otherwise, you'll have to 
just disable stage 1 support in addition to the stage 2 workaround as 
above.

> What the different in the driver between AARCH32_L & AARCH32_S?

AARCH32_L is the 3-level LPAE format, which gives you 32-bit 
input/40-bit output at stage 1 and 40-bit input/40-bit output at stage 
2. AARCH32_S is the legacy 2-level short-descriptor format which only 
supports stage 1 and is limited to 32-bit output addresses - MMU-500 
does support it, but you probably want to avoid it if possible ;)


Robin.


Re: [RESEND PATCH v5 0/9] extend PWM framework to support PWM modes

2018-10-18 Thread Thierry Reding
On Wed, Oct 17, 2018 at 12:41:53PM +0000, claudiu.bez...@microchip.com wrote:
> 
> 
> On 16.10.2018 15:03, Thierry Reding wrote:
> > On Fri, Sep 14, 2018 at 06:20:48PM +0200, Nicolas Ferre wrote:
> >> Thierry,
> >>
> >> On 28/08/2018 at 15:01, Claudiu Beznea wrote:
> >>> Hi,
> >>>
> >>> Please give feedback on these patches which extends the PWM framework in
> >>> order to support multiple PWM modes of operations. This series is a rework
> >>> of [1] and [2].
> >>
> >> This series started with a RFC back on 5 April 2017 "extend PWM framework to
> >> support PWM modes". Continuous work started with v2 of this series on
> >> January 12, 2018.
> >>
> >> Then Claudiu tried to address all comments up to v4 which didn't have any
> >> more reviews. He posted a v5 without comments since May 22, 2018. This
> >> series is basically a resend of the v5 (as said in the $subject).
> >>
> >> We would like to know what is preventing this series from being included in
> >> the PWM sub-system. Note that if some issues still remain with it, we are
> >> ready to help solve them.
> >>
> >> Without feedback from your side, we fear that we would miss a merge window
> >> again for no obvious reason (DT part is Acked by Rob: patch 5/9).
> > 
> > First off, apologies for not getting around to this earlier.
> > 
> > I think this series is mostly fine, but I still have doubts about the DT
> > aspects of this. In particular, Rob raised a concern about this here:
> > 
> > https://lkml.org/lkml/2018/1/22/655
> > 
> > and it seems like that particular question was never fully resolved as
> > the discussion veered off that particular topic.
> 
> 1/ If you are talking about this sentence:
> "Yes, but you have to make "normal" be no bit set to be compatible with
> everything already out there."
> 
> The current implementation considers that if no mode is provided, the
> old approach applies, meaning the normal mode will be used by every
> in-kernel PWM client.
> 
> In of_pwm_xlate_with_flags() the pwm->args.mode is initialized with what
> pwm_mode_get_valid() returns. In the case of controllers which do not
> implement something special for PWM modes, the PWM normal mode will be
> returned (the pwmchip_get_default_caps() function is called in the end).
> Otherwise the pwm->args.mode will be populated with what the user provided
> as input from DT, if what was provided from DT is valid for the PWM channel.
> Please see that pwm_mode_valid() is used to validate user input; otherwise
> PWM normal mode will be used.

No, that part looks fine.

> 
> + pwm->args.mode = pwm_mode_get_valid(pc, pwm);
> 
> - if (args->args_count > 2 && args->args[2] & PWM_POLARITY_INVERTED)
> - pwm->args.polarity = PWM_POLARITY_INVERSED;
> + if (args->args_count > 2) {
> + if (args->args[2] & PWM_POLARITY_INVERTED)
> + pwm->args.polarity = PWM_POLARITY_INVERSED;
> +
> + for (modebit = PWMC_MODE_COMPLEMENTARY_BIT;
> +  modebit < PWMC_MODE_CNT; modebit++) {
> + unsigned long mode = BIT(modebit);
> +
> + if ((args->args[2] & mode) &&
> + pwm_mode_valid(pwm, mode)) {
> + pwm->args.mode = mode;
> + break;
> + }
> + }
> + }
> 
> 
> 2/ If you are talking about this sentence:
> "Thinking about this some more, shouldn't the new modes just be
> implied? A client is going to require one of these modes or it won't
> work right."
> 
> As explained at point 1, if there is no mode requested from DT the default
> mode for the channel will be used, which, in the case of PWM controllers that
> do not implement the new modes, will be PWM normal mode.

I don't think that's an issue. I think what Rob was referring to and
which mirrors my concern is that these modes are a feature that doesn't
extend to typical use-cases. So for all existing use-cases (like LED or
backlight) we always assume a PWM running in normal mode. Now, if you
write a driver for some particular piece of hardware that needs a mode
that is not the normal mode, the question is: wouldn't that driver know
that it wants exactly push-pull or complementary mode? Wouldn't it have
to explicitly check that the PWM supports it and select it (i.e. in the
driver code)?

Say you have a driver that requires push-pull mode. It doesn't really
make sense to require the mode to be encoded in DT, because the driver
will only work with one specific mode anyway. So might as well require
it and have the driver check for support and fail if the PWM is not
compatible. This would likely never happen, because hardware engineers
couldn't have validated the design in that case, but there's no reason
for the mode to be specified in DT because it is fixed by the very use-
case anyway.

Also, leaving it out of DT simplifies things. If you allow the mode to
be specified in DT you could end up with a situa

Re: [RESEND PATCH v5 1/9] pwm: extend PWM framework with PWM modes

2018-10-18 Thread Thierry Reding
On Wed, Oct 17, 2018 at 12:42:00PM +0000, claudiu.bez...@microchip.com wrote:
> On 16.10.2018 15:25, Thierry Reding wrote:
> > On Tue, Aug 28, 2018 at 04:01:18PM +0300, Claudiu Beznea wrote:
[...]
> >> +const char *pwm_mode_desc(struct pwm_device *pwm, unsigned long mode)
> >> +{
> >> +  static const char * const modes[] = {
> >> +  "invalid",
> >> +  "normal",
> >> +  "complementary",
> >> +  };
> >> +
> >> +  if (!pwm_mode_valid(pwm, mode))
> >> +  return modes[0];
> >> +
> >> +  return modes[ffs(mode)];
> >> +}
> > 
> > Do we really need to be able to get the name of the mode in the context
> > of a given PWM channel? Couldn't we drop the pwm parameter and simply
> > return the name (pwm_get_mode_name()?) and at the same time remove the
> > extra "invalid" mode in there? I'm not sure what the use-case here is,
> > but it seems to me like the code should always check for supported modes
> > first before reporting their names in any way.
> 
> Looking back at this code, the main use case for checking PWM mode validity
> in pwm_mode_desc() was only with regards to mode_store(). But there is no
> need for this check since the same thing will be checked in
> pwm_apply_state() and, in case the user provides an invalid mode via sysfs,
> pwm_apply_state() will fail.
> 
> To conclude, I will change this function into something like:
> 
> if (mode == PWM_MODE_NORMAL)
>   return "normal";
> else if (mode == PWM_MODE_COMPLEMENTARY)
>   return "complementary";
> else if (mode == PWM_MODE_PUSH_PULL)
>   return "push-pull";
> else
>   return "invalid";
> 
> Please let me know if it is OK for you.

Do we even have to check here for validity of the mode in the first
place? Shouldn't this already happen at a higher level? I mean we do
need to check for valid input in mode_store(), but whatever mode we
pass into this could already have been validated, so that this would
never return "invalid".

For example, you already define an enum for the PWM modes. I think it'd
be best if we then used that enum to pass the modes around. That way it
becomes easy to check for validity.

So taking one step back, I think we can remove some of the ambiguities
by making sure we only ever specify one mode. When the mode is
explicitly being set, we only ever want one, right? The only point in
time where we can store more than one is for the capabilities. So I
think being more explicit about that would be useful. That way we remove
any uncertainties about what the unsigned long might contain at any
point in time.
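
A minimal sketch of the split suggested above, with one enum value for "the mode" and a bitmask only for capabilities. The mode names are taken from this thread; everything else (`PWM_CAP()`, `pwm_caps_support()`, `pwm_mode_name()`, the enum values) is hypothetical and not the API of the patch series.

```c
#include <assert.h>
#include <stddef.h>

/* Exactly one mode at a time; values are illustrative. */
enum pwm_mode {
	PWM_MODE_NORMAL,
	PWM_MODE_COMPLEMENTARY,
	PWM_MODE_PUSH_PULL,
	PWM_MODE_CNT
};

/* Capabilities are the only place where several modes are stored together. */
#define PWM_CAP(mode) (1UL << (mode))

/* Validity check done once, at the boundary (e.g. where user input arrives). */
int pwm_caps_support(unsigned long caps, enum pwm_mode mode)
{
	return (unsigned int)mode < PWM_MODE_CNT &&
	       (caps & PWM_CAP(mode)) != 0;
}

/* Takes an already validated mode, so there is no "invalid" entry. */
const char *pwm_mode_name(enum pwm_mode mode)
{
	static const char * const names[PWM_MODE_CNT] = {
		"normal", "complementary", "push-pull",
	};

	assert((unsigned int)mode < PWM_MODE_CNT);
	return names[mode];
}
```

Because the type itself says whether a value is a single mode or a set of capabilities, the name lookup does not need a PWM channel argument or its own validity check.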

> >> +/**
> >>   * pwmchip_add_with_polarity() - register a new PWM chip
> >>   * @chip: the PWM chip to add
> >>   * @polarity: initial polarity of PWM channels
> >> @@ -275,6 +382,8 @@ int pwmchip_add_with_polarity(struct pwm_chip *chip,
> >>  
> >>mutex_lock(&pwm_lock);
> >>  
> >> +  chip->get_default_caps = pwmchip_get_default_caps;
> >> +
> >>ret = alloc_pwms(chip->base, chip->npwm);
> >>if (ret < 0)
> >>goto out;
> >> @@ -294,6 +403,7 @@ int pwmchip_add_with_polarity(struct pwm_chip *chip,
> >>pwm->pwm = chip->base + i;
> >>pwm->hwpwm = i;
> >>pwm->state.polarity = polarity;
> >> +  pwm->state.mode = pwm_mode_get_valid(chip, pwm);
> >>  
> >>if (chip->ops->get_state)
> >>chip->ops->get_state(chip, pwm, &pwm->state);
> >> @@ -469,7 +579,8 @@ int pwm_apply_state(struct pwm_device *pwm, struct pwm_state *state)
> >>int err;
> >>  
> >>if (!pwm || !state || !state->period ||
> >> -  state->duty_cycle > state->period)
> >> +  state->duty_cycle > state->period ||
> >> +  !pwm_mode_valid(pwm, state->mode))
> >>return -EINVAL;
> >>  
> >>if (!memcmp(state, &pwm->state, sizeof(*state)))
> >> @@ -530,6 +641,9 @@ int pwm_apply_state(struct pwm_device *pwm, struct pwm_state *state)
> >>  
> >>pwm->state.enabled = state->enabled;
> >>}
> >> +
> >> +  /* No mode support for non-atomic PWM. */
> >> +  pwm->state.mode = state->mode;
> > 
> > That comment seems misplaced. This is actually part of atomic PWM, so
> > maybe just reverse the logic and say "mode support only for atomic PWM"
> > or something. I would personally just leave it away.
> 
> Ok, sure. I will remove the comment. But the code has to be there to avoid
> unassigned mode value for PWM state (normal mode means BIT(0)) and so to
> avoid future PWM applies failure.

Oh yeah, definitely keep the code around. I was only commenting on
the... comment. =)

> > The legacy API has
> > no way of setting the mode, which is indication enough that we don't
> > support it.
> > 
> >> diff --git a/include/linux/pwm.h b/include/linux/pwm.h
> >> index 56518adc31dd..a4ce4ad7edf0 100644
> >> --- a/include/linux/pwm.h
> >> +++ b/include/linux/pwm.h
> >> @@ -26,9 +26,32 @@ enum pwm_polarity {
> >>  };
> >>  
> >>  /**
> >> + * PWM modes capabilities
> >> + * @PW

Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

2018-10-18 Thread Paul E. McKenney
On Wed, Oct 17, 2018 at 07:07:51PM -0700, Joel Fernandes wrote:
> On Wed, Oct 17, 2018 at 01:33:24PM -0700, Paul E. McKenney wrote:
> > On Wed, Oct 17, 2018 at 11:15:05AM -0700, Joel Fernandes wrote:
> > > On Wed, Oct 17, 2018 at 09:11:00AM -0700, Paul E. McKenney wrote:
> > > > On Tue, Oct 16, 2018 at 01:41:22PM -0700, Joel Fernandes wrote:
> > > > > On Tue, Oct 16, 2018 at 04:26:11AM -0700, Paul E. McKenney wrote:
> > > > > > On Mon, Oct 15, 2018 at 02:08:56PM -0700, Paul E. McKenney wrote:
> > > > > > > On Mon, Oct 15, 2018 at 01:15:56PM -0700, Joel Fernandes wrote:
> > > > > > > > On Mon, Oct 15, 2018 at 12:54:26PM -0700, Paul E. McKenney wrote:
> > > > > > > > [...]
> > > > > > > > > > > In any case, please don't spin for milliseconds with preemption disabled.
> > > > > > > > > > > The real-time guys are unlikely to be happy with you if you do this!
> > > > > > > > > > 
> > > > > > > > > > Well just to clarify, I was just running Oleg's test which did this. This
> > > > > > > > > > test was mentioned in the original documentation that I deleted. Of course I
> > > > > > > > > > would not dare do such a thing in production code :-D. I guess to Oleg's
> > > > > > > > > > defense, he did it to verify that synchronize_rcu() was not blocked on
> > > > > > > > > > preempt-disable sections which was a different test.
> > > > > > > > > 
> > > > > > > > > Understood!  Just pointing out that RCU's tolerating a given action does
> > > > > > > > > not necessarily mean that it is a good idea to take that action.  ;-)
> > > > > > > > 
> > > > > > > > Makes sense :-) thanks.
> > > > > > > 
> > > > > > > Don't worry, that won't happen again.  ;-)
> > > > > > > 
> > > > > > > > > > > > > + pr_crit("SPIN done!\n");
> > > > > > > > > > > > > + preempt_enable();
> > > > > > > > > > > > > + break;
> > > > > > > > > > > > > + case 777:
> > > > > > > > > > > > > + pr_crit("SYNC start\n");
> > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > + pr_crit("SYNC done!\n");
> > > > > > > > > > > > 
> > > > > > > > > > > > But you are using the console printing infrastructure which is rather
> > > > > > > > > > > > heavyweight. Try replacing pr_* calls with trace_printk so that you
> > > > > > > > > > > > write to the lock-free ring buffer, this will reduce the noise from the
> > > > > > > > > > > > heavy console printing infrastructure.
> > > > > > > > > > > 
> > > > > > > > > > > And this might be a problem as well.
> > > > > > > > > > 
> > > > > > > > > > This was not the issue (or at least not fully the issue) since I saw the same
> > > > > > > > > > thing with trace_printk. It was exactly what you said - which is the
> > > > > > > > > > excessively long preempt disabled times.
> > > > > > > > > 
> > > > > > > > > One approach would be to apply this patch against (say) 
> > > > > > > > > v4.18, which
> > > > > > > > > does not have consolidated grace periods.  You might then be 
> > > > > > > > > able to
> > > > > > > > > tell if the pr_crit() calls make any difference.
> > > > > > > > 
> > > > > > > > I could do that, yeah. But since the original problem went away 
> > > > > > > > due to
> > > > > > > > disabling preempts for a short while, I will move on and 
> > > > > > > > continue to focus on
> > > > > > > > updating other parts of the documenation. Just to mention I
> > > > > > > > brought this up because I thought its better to do that than 
> > > > > > > > not to, just
> > > > > > > > incase there is any lurking issue with the consolidation. Sorry 
> > > > > > > > if that ended
> > > > > > > > up with me being noisy.
> > > > > > > 
> > > > > > > Not a problem, no need to apologize!
> > > > > > 
> > > > > > Besides, digging through the code did point out a reasonable 
> > > > > > optimization.
> > > > > > In the common case, this would buy 100s of microseconds rather than
> > > > > > milliseconds, but it seems simple enough to be worthwhile.  
> > > > > > Thoughts?
> > > > > 
> > > > > Cool, thanks. One comment below:
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > commit 07921e8720907f58f82b142f2027fc56d5abdbfd
> > > > > > Author: Paul E. McKenney 
> > > > > > Date:   Tue Oct 16 04:12:58 2018 -0700
> > > > > > 
> > > > > > rcu: Speed up expedited GPs when interrupting RCU reader
> > > > > > 
> > > > > > In PREEMPT kernels, an expedited grace period might send an IPI 
> > > > > > to a
> > > > > > CPU that is executing an RCU read-side critical section.  In 
> > > > > > that case,
> > > > > > it would be nice if the rcu_read_unlock() directly in

Re: [PATCH v5 03/27] x86/fpu/xstate: Introduce XSAVES system states

2018-10-18 Thread Borislav Petkov
On Thu, Oct 18, 2018 at 11:31:25AM +0200, Pavel Machek wrote:
> We want readable sources, not neat ascii art everywhere.

And we want pink ponies.

Reverse xmas tree order is and has been the usual variable sorting in
the tip tree for years.
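
For readers unfamiliar with the term, here is a minimal sketch of what "reverse xmas tree" ordering looks like. The function and struct names (demo_checksum, demo_buf) are made up for illustration and do not come from the patch under review; the only point is that the local declarations are sorted from the longest line to the shortest.

```c
#include <stddef.h>

/* Hypothetical buffer type, used only for this illustration. */
struct demo_buf {
	const unsigned char *data;
	size_t len;
};

/* Sums the bytes of a buffer; note the ordering of the declarations. */
unsigned long demo_checksum(const struct demo_buf *buf)
{
	/* Declarations sorted longest line first ("reverse xmas tree"). */
	const unsigned char *cursor = buf->data;
	unsigned long total = 0;
	size_t remaining;

	for (remaining = buf->len; remaining; remaining--)
		total += *cursor++;

	return total;
}
```

The ordering is purely visual: it depends on the length of each declaration line, not on type, use, or initialization order.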

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v9 00/24] ILP32 for ARM64

2018-10-18 Thread Catalin Marinas
On Sun, Oct 14, 2018 at 09:49:01PM +0200, Arnd Bergmann wrote:
> On Sat, Oct 13, 2018 at 9:36 PM Andy Lutomirski  wrote:
> >
> > On Wed, May 16, 2018 at 1:19 AM Yury Norov wrote:
> > >
> > > This series enables AARCH64 with ILP32 mode.
> > >
> > > As supporting work, it introduces ARCH_32BIT_OFF_T configuration
> > > option that is enabled for existing 32-bit architectures but disabled
> > > for new arches (so 64-bit off_t userspace type is used by new userspace).
> > > Also it deprecates getrlimit and setrlimit syscalls prior to prlimit64.
> >
> > Second, ILP32 user code is highly unlikely
> > to end up with the same struct layout as ILP64 code.  The latter seems
> > like it should be solved entirely in userspace by adding a way to
> > annotate a structure as being a kernel ABI structure and getting the
> > toolchain to lay it out as if it were ILP64 even though the target is
> > ILP32.
> 
> The syscall ABI could be almost completely abstracted in glibc, the
> main issue is ioctl and a couple of related interfaces that pass data
> structures (read() on /dev/input/*, mmap on /dev/snd/*
> or raw sockets, fcntl).

There is another case on struct siginfo which has some pointers and it
wouldn't look like an LP64 structure at all (and glibc doesn't normally
intercept the sighandler call to rewrite the structure). We could add
padding around void * members as the kernel zeros them, I don't recall
the kernel reading these pointers from user. Anyway, using something
that resembles compat_siginfo looked the simplest for ILP32.
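
A minimal sketch of the padding idea, under stated assumptions: demo_fault_info and PADDED_PTR are hypothetical names, not kernel definitions, and the real struct siginfo is considerably more involved. Each pointer member is wrapped in a union with a 64-bit integer, so the structure has the same size and member offsets whether pointers are 32 or 64 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: reserve 64 bits for a pointer member regardless of
 * the native pointer width. On a little-endian ILP32 target the pointer
 * occupies the low half, and the upper half must be zeroed by whoever
 * fills in the structure.
 */
#define PADDED_PTR(type, name) \
	union { type name; uint64_t name##_pad; }

/* Hypothetical structure, loosely modelled on a fault record. */
struct demo_fault_info {
	int32_t signo;
	int32_t code;
	PADDED_PTR(void *, addr);
};

/* These hold for both ILP32 and LP64 builds of this sketch. */
static_assert(offsetof(struct demo_fault_info, addr) == 8,
	      "pointer member starts at the same offset");
static_assert(sizeof(struct demo_fault_info) == 16,
	      "structure size does not depend on pointer width");
```

Without the union, an ILP32 build would give the structure a size of 12 bytes instead of 16, which is the mismatch the padding is meant to avoid. The sketch ignores big-endian targets, where the pointer would land in the wrong half of the 64-bit slot.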

-- 
Catalin


Re: [GIT PULL] IDA/IDR fixes for 4.19

2018-10-18 Thread Greg Kroah-Hartman
On Wed, Oct 17, 2018 at 11:26:34PM -0700, Christoph Hellwig wrote:
> On Tue, Oct 16, 2018 at 10:16:08AM -0700, Matthew Wilcox wrote:
> > > >   git://git.infradead.org/users/willy/linux-dax.git ida-fixes-4.19-rc8
> > > 
> > > How about you at least test these in linux-next?  Putting things on top
> > > of the most recent change is a huge tip-off that this branch got no
> > > testing :(
> > 
> > One of the two changes is a comment line in a doc file.
> > The other has been tested by 0day (which originally reported the issue).
> > 
> > I don't see how spending time in linux-next is going to achieve anything.
> 
> Especially if we miss 4.19 final with that.  The documentation patch
> avoids the problem of CC-BY-SA-4.0 including larger amounts of GPL files
> in the output document, so we should have this before the release for
> sure, and the other is just a test that isn't even in a normal kernel
> build.

Ok, both patches are now pulled.

> Also after idr.rst is removed we should include something like this:
> 
> ---
> From c3660257f981e7a7254d18f52af64a2077f7bb49 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig 
> Date: Thu, 18 Oct 2018 08:22:39 +0200
> Subject: LICENSES: Remove CC-BY-SA-4.0 license text



Great idea, I've also merged this now as well after checking linux-next
to ensure that no one else is using this license there either.

thanks,

greg k-h


Re: [PATCH v14 19/19] x86/sgx: Driver documentation

2018-10-18 Thread Pavel Machek
On Thu 2018-10-18 02:45:27, Jarkko Sakkinen wrote:
> On Mon, 15 Oct 2018, Pavel Machek wrote:
> >On Tue 2018-09-25 16:06:56, Jarkko Sakkinen wrote:
> >>+Intel(R) SGX is a set of CPU instructions that can be used by applications 
> >>to
> >>+set aside private regions of code and data. The code outside the enclave is
> >>+disallowed to access the memory inside the enclave by the CPU access 
> >>control.
> >>+In a way you can think that SGX provides inverted sandbox. It protects the
> >>+application from a malicious host.
> >
> >Well, recently hardware had some problems keeping its
> >promises. So... what about rowhammer, meltdown and spectre?
> 
> Doesn't hardware always have this problem over time?

No, not really.

In this case, SGX tries to protect against hardware "attacks" done by the
machine owner. That job is theoretically impossible, so you have a harder
situation than most.

> >Which ones apply, which ones do not, and on what cpu generations?
> 
> Definitely should be refined.
> 
> Meltdown's approach AFAIK does not work because reads outside the enclave
> will always have a predefined value (-1) but only if the page is present,
> which was later exploited in the Foreshadow attack.

What about L1tf and https://github.com/lsds/spectre-attack-sgx ?

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH v5 03/27] x86/fpu/xstate: Introduce XSAVES system states

2018-10-18 Thread Pavel Machek
On Thu 2018-10-18 11:26:03, Borislav Petkov wrote:
> On Wed, Oct 17, 2018 at 04:17:01PM -0700, Randy Dunlap wrote:
> > I asked what I really wanted to know.
> 
> Then the answer is a bit better readability, I'd guess.

Normally, similar local variables are grouped together, with
initialized variables giving additional constraints.

Additionally sorting them by length ... puts too many constraints
on the author. We want readable sources, not neat ascii art everywhere.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH v5 03/27] x86/fpu/xstate: Introduce XSAVES system states

2018-10-18 Thread Borislav Petkov
On Wed, Oct 17, 2018 at 04:17:01PM -0700, Randy Dunlap wrote:
> I asked what I really wanted to know.

Then the answer is a bit better readability, I'd guess.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v5 03/27] x86/fpu/xstate: Introduce XSAVES system states

2018-10-18 Thread Pavel Machek
On Thu 2018-10-18 00:58:29, Borislav Petkov wrote:
> On Wed, Oct 17, 2018 at 03:39:47PM -0700, Randy Dunlap wrote:
> > Would you mind explaining this request? (requirement?)
> > Other than to say that it is the preference of some maintainers,
> > please say Why it is preferred.
> > 
> > and since the s above won't typically be the same length,
> > it's not for variable name alignment, right?
> 
> Searching the net a little, it shows you have asked that question
> before. So what is it you really wanna know?

Why do you think sorting local variables is a good idea (for includes it
reduces collisions, hopefully you don't have that for local variables),
and where is it documented in CodingStyle?

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH] kernel-doc: fix declaration type determination

2018-10-18 Thread Jani Nikula
On Wed, 17 Oct 2018, Randy Dunlap  wrote:
> From: Randy Dunlap 
>
> Make declaration type determination more robust.
>
> When scripts/kernel-doc is deciding if some kernel-doc notation
> contains an enum, a struct, a union, a typedef, or a function,
> it does a pattern match on the beginning of the string, looking
> for a match with one of "struct", "union", "enum", or "typedef",
> and otherwise defaults to a function declaration type.
> However, if a function or a function-like macro has a name that
> begins with "struct" (e.g., struct_size()), then kernel-doc
> incorrectly decides that this is a struct declaration.
>
> Fix this by looking for the declaration type keywords having an
> ending word boundary (\b), so that "struct_size" will not match
> a struct declaration.
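
To illustrate the effect of the trailing \b described above, here is a rough sketch in C rather than Perl. It does not use a regex library; instead it hand-codes the same check, namely that the keyword must start the string and must not be followed by a word character. The function names (is_word_char, starts_with_keyword, decl_type_of) are made up for this example and are not part of scripts/kernel-doc.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if c is a word character: ASCII letter, digit, or underscore. */
int is_word_char(char c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/*
 * Mimics m/^keyword\b/: the keyword must start the string and be followed
 * by a non-word character or the end of the string.
 */
int starts_with_keyword(const char *s, const char *keyword)
{
	size_t n = strlen(keyword);

	/* If the prefix matched, s[n] is valid (possibly the terminator). */
	return strncmp(s, keyword, n) == 0 && !is_word_char(s[n]);
}

/* Returns the declaration type, defaulting to "function". */
const char *decl_type_of(const char *identifier)
{
	static const char *const keywords[] = {
		"struct", "union", "enum", "typedef",
	};
	size_t i;

	for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
		if (starts_with_keyword(identifier, keywords[i]))
			return keywords[i];

	return "function";
}
```

With this check, "struct foo" is classified as a struct, while "struct_size" falls through to the function default because the underscore after "struct" is a word character.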

My perl is all cargo cult, so can't really review, but based on the
description this is what should be done,

Acked-by: Jani Nikula 

> I compared lots of html before/after output from core-api, driver-api,
> and networking.  There were no differences in any of the files that
> I checked.

I used to do diff -r on pre and post change clean documentation builds
to verify this type of stuff.

BR,
Jani.

>
> Signed-off-by: Randy Dunlap 
> Tested-by: Kees Cook 
> Cc: Jani Nikula 
> Cc: Jonathan Corbet 
> Cc: linux-doc@vger.kernel.org
> ---
>  scripts/kernel-doc |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> --- lnx-419-rc8.orig/scripts/kernel-doc
> +++ lnx-419-rc8/scripts/kernel-doc
> @@ -1904,13 +1904,13 @@ sub process_name($$) {
>   ++$warnings;
>   }
>  
> - if ($identifier =~ m/^struct/) {
> + if ($identifier =~ m/^struct\b/) {
>   $decl_type = 'struct';
> - } elsif ($identifier =~ m/^union/) {
> + } elsif ($identifier =~ m/^union\b/) {
>   $decl_type = 'union';
> - } elsif ($identifier =~ m/^enum/) {
> + } elsif ($identifier =~ m/^enum\b/) {
>   $decl_type = 'enum';
> - } elsif ($identifier =~ m/^typedef/) {
> + } elsif ($identifier =~ m/^typedef\b/) {
>   $decl_type = 'typedef';
>   } else {
>   $decl_type = 'function';
>
>

-- 
Jani Nikula, Intel Open Source Graphics Center