date:20170522

Re: [PATCH v2 0/4] crypto: async crypto op fixes

2017-05-22 Thread Herbert Xu

On Thu, May 18, 2017 at 04:29:22PM +0300, Gilad Ben-Yossef wrote:
> This patch set fixes various usage and documentation errors
> in waiting for async crypto op to complete which can result
> in data corruption.
> 
> Note: these were discovered in the process of working on a
> patch set that replaces these call sites and more with a
> generic implementation that will prevent these problems
> going forward. These are just the fix ups for current code.
> 
> Signed-off-by: Gilad Ben-Yossef 
> CC: sta...@vger.kernel.org
> CC: Eric Biggers 

Patches 1-3 applied.  Please fix patch 4 and resubmit.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] input: edt-ft5x06: increase allowed data range for threshold parameter

2017-05-22 Thread Dmitry Torokhov

On Mon, May 08, 2017 at 11:11:46AM -0500, Rob Herring wrote:
> On Tue, May 02, 2017 at 05:00:59PM +0200, Martin Kepplinger wrote:
> > The datasheet and application note does not mention an allowed range for
> > the M09_REGISTER_THRESHOLD parameter. One of our customers needs to set
> > lower values than 20 and they seem to work just fine on EDT EP0xx0M09 with
> > T5x06 touch.
> > 
> > So, lacking a known lower limit, we increase the range for thresholds,
> > and set the lower limit to 0. The documentation is updated accordingly.
> > 
> > Signed-off-by: Schoefegger Stefan 
> > Signed-off-by: Manfred Schlaegl 
> > Signed-off-by: Martin Kepplinger 
> > ---
> >  Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt | 2 +-
> >  Documentation/input/devices/edt-ft5x06.rst | 2 +-
> >  drivers/input/touchscreen/edt-ft5x06.c | 2 +-
> >  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> Acked-by: Rob Herring 

Applied, thank you.


-- 
Dmitry

Re: [kernel-hardening] [PATCH v4 next 0/3] modules: automatic module loading restrictions

2017-05-22 Thread Kees Cook

On Mon, May 22, 2017 at 4:38 PM, Andy Lutomirski  wrote:
> I think that having the un-resettable mode is unnecessary.  We should
> have an option that disables loading modules entirely and cannot be
> unset.  (That means no explicit loads and no implicit loads.)  Maybe
> we already have this.  Otherwise, tightening caps needed for implicit
> loads should just be a normal yes/no setting IMO.

Yup, /proc/sys/kernel/modules_disabled already does this.

-- 
Kees Cook
Pixel Security

Re: [kernel-hardening] [PATCH v4 next 0/3] modules: automatic module loading restrictions

2017-05-22 Thread Andy Lutomirski

On Mon, May 22, 2017 at 4:07 PM, Kees Cook  wrote:
> On Mon, May 22, 2017 at 12:55 PM, Djalal Harouni  wrote:
>> On Mon, May 22, 2017 at 6:43 PM, Solar Designer  wrote:
>>> On Mon, May 22, 2017 at 03:49:15PM +0200, Djalal Harouni wrote:
>>>> On Mon, May 22, 2017 at 2:08 PM, Solar Designer  wrote:
>>>> > On Mon, May 22, 2017 at 01:57:03PM +0200, Djalal Harouni wrote:
>>>> >> *) When modules_autoload_mode is set to (2), automatic module loading is
>>>> >> disabled for all. Once set, this value can not be changed.
>>>> >
>>>> > What purpose does this securelevel-like property ("Once set, this value
>>>> > can not be changed.") serve here?  I think this mode 2 is needed, but
>>>> > without this extra property, which is bypassable by e.g. explicitly
>>>> > loaded kernel modules anyway (and that's OK).
>>>>
>>>> My reasoning about "Once set, this value can not be changed" is mainly for:
>>>>
>>>> If you have some systems where modules are not updated for any given
>>>> reason, then the only one who will be able to load a module is an
>>>> administrator, basically this is a shortcut for:
>>>>
>>>> * Apps/services can run with CAP_NET_ADMIN but they are not allowed to
>>>> auto-load 'netdev' modules.
>>>>
>>>> * Explicitly loading modules can be guarded by seccomp filters *per*
>>>> app, so even if these apps have
>>>>   CAP_SYS_MODULE they won't be able to explicitly load modules, one
>>>> has to remount some sysctl /proc/ entries read-only here and remove
>>>> CAP_SYS_ADMIN for all apps anyway.
>>>>
>>>> This mainly serves the purpose of these systems that do not receive
>>>> updates, if I don't want to expose those kernel interfaces what should
>>>> I do ? then if I want to unload old versions and replace them with new
>>>> ones what operation should be allowed ? and only real root of the
>>>> system can do it. Hence, the "Once set, this value can not be changed"
>>>> is more of a shortcut, also the idea was put in my mind based on how
>>>> "modules_disabled" is disabled forever, and some other interfaces. I
>>>> would say: it is easy to handle a transition from 1) "hey this system
>>>> is still up to date, some features should be exposed" to 2) "this
>>>> system is not up to date anymore, only root should expose some
>>>> features..."
>>>>
>>>> Hmm, I am not sure if this answers your question ? :-)
>>>
>>> This answers my question, but in a way that I summarize as "there's no
>>> good reason to include this securelevel-like property".
>>>
>>
>> Hmm, sorry I did forget to add in my previous comment that with such
>> systems, CAP_SYS_MODULE can be used to reset the
>> "modules_autoload_mode" sysctl back from mode 2 to mode 1, even if we
>> disable it privileged tasks can be triggered to overwrite the sysctl
>> flag and get it back unless /proc is read-only... that's one of the
>> points, it should not be so easy to relax it.
>
> I'm on the fence. For modules_disabled and Yama, it was tied to
> CAP_SYS_ADMIN, basically designed to be an at-boot setting that could
> not later be undone by an attacker gaining that privilege, keeping
> them out of either kernel memory or existing user process memory.
> Here, it's CAP_SYS_MODULE... it's hard to imagine the situation where
> a CAP_SYS_MODULE-capable process could write to this sysctl but NOT
> issue direct modprobe requests, but it's _possible_ via crazy symlink
> games to trick capable processes into writing to sysctls. We've seen
> this multiple times before, and it's a way for attackers to turn a
> single privileged write into a privileged exec.
>
> I might turn the question around, though: why would we want to have it
> changeable at this setting?
>
> I'm fine leaving that piece off, either way.

I think that having the un-resettable mode is unnecessary.  We should
have an option that disables loading modules entirely and cannot be
unset.  (That means no explicit loads and no implicit loads.)  Maybe
we already have this.  Otherwise, tightening caps needed for implicit
loads should just be a normal yes/no setting IMO.

Re: [kernel-hardening] [PATCH v4 next 0/3] modules: automatic module loading restrictions

2017-05-22 Thread Kees Cook

On Mon, May 22, 2017 at 12:55 PM, Djalal Harouni  wrote:
> On Mon, May 22, 2017 at 6:43 PM, Solar Designer  wrote:
>> On Mon, May 22, 2017 at 03:49:15PM +0200, Djalal Harouni wrote:
>>> On Mon, May 22, 2017 at 2:08 PM, Solar Designer  wrote:
>>> > On Mon, May 22, 2017 at 01:57:03PM +0200, Djalal Harouni wrote:
>>> >> *) When modules_autoload_mode is set to (2), automatic module loading is
>>> >> disabled for all. Once set, this value can not be changed.
>>> >
>>> > What purpose does this securelevel-like property ("Once set, this value
>>> > can not be changed.") serve here?  I think this mode 2 is needed, but
>>> > without this extra property, which is bypassable by e.g. explicitly
>>> > loaded kernel modules anyway (and that's OK).
>>>
>>> My reasoning about "Once set, this value can not be changed" is mainly for:
>>>
>>> If you have some systems where modules are not updated for any given
>>> reason, then the only one who will be able to load a module is an
>>> administrator, basically this is a shortcut for:
>>>
>>> * Apps/services can run with CAP_NET_ADMIN but they are not allowed to
>>> auto-load 'netdev' modules.
>>>
>>> * Explicitly loading modules can be guarded by seccomp filters *per*
>>> app, so even if these apps have
>>>   CAP_SYS_MODULE they won't be able to explicitly load modules, one
>>> has to remount some sysctl /proc/ entries read-only here and remove
>>> CAP_SYS_ADMIN for all apps anyway.
>>>
>>> This mainly serves the purpose of these systems that do not receive
>>> updates, if I don't want to expose those kernel interfaces what should
>>> I do ? then if I want to unload old versions and replace them with new
>>> ones what operation should be allowed ? and only real root of the
>>> system can do it. Hence, the "Once set, this value can not be changed"
>>> is more of a shortcut, also the idea was put in my mind based on how
>>> "modules_disabled" is disabled forever, and some other interfaces. I
>>> would say: it is easy to handle a transition from 1) "hey this system
>>> is still up to date, some features should be exposed" to 2) "this
>>> system is not up to date anymore, only root should expose some
>>> features..."
>>>
>>> Hmm, I am not sure if this answers your question ? :-)
>>
>> This answers my question, but in a way that I summarize as "there's no
>> good reason to include this securelevel-like property".
>>
>
> Hmm, sorry I did forget to add in my previous comment that with such
> systems, CAP_SYS_MODULE can be used to reset the
> "modules_autoload_mode" sysctl back from mode 2 to mode 1, even if we
> disable it privileged tasks can be triggered to overwrite the sysctl
> flag and get it back unless /proc is read-only... that's one of the
> points, it should not be so easy to relax it.

I'm on the fence. For modules_disabled and Yama, it was tied to
CAP_SYS_ADMIN, basically designed to be an at-boot setting that could
not later be undone by an attacker gaining that privilege, keeping
them out of either kernel memory or existing user process memory.
Here, it's CAP_SYS_MODULE... it's hard to imagine the situation where
a CAP_SYS_MODULE-capable process could write to this sysctl but NOT
issue direct modprobe requests, but it's _possible_ via crazy symlink
games to trick capable processes into writing to sysctls. We've seen
this multiple times before, and it's a way for attackers to turn a
single privileged write into a privileged exec.

I might turn the question around, though: why would we want to have it
changeable at this setting?

I'm fine leaving that piece off, either way.

-Kees

-- 
Kees Cook
Pixel Security

Re: [PATCH v4 next 2/3] modules:capabilities: automatic module loading restriction

2017-05-22 Thread Kees Cook

On Mon, May 22, 2017 at 4:57 AM, Djalal Harouni  wrote:
> [...]
> diff --git a/kernel/module.c b/kernel/module.c
> index 4a3665f..ce7a146 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -282,6 +282,8 @@ module_param(sig_enforce, bool_enable_only, 0644);
>
>  /* Block module loading/unloading? */
>  int modules_disabled = 0;
> +int modules_autoload_mode = MODULES_AUTOLOAD_ALLOWED;
> +const int modules_autoload_max = MODULES_AUTOLOAD_DISABLED;
>  core_param(nomodule, modules_disabled, bint, 0);
>
>  /* Waiting for a module to finish initializing? */
> @@ -4296,6 +4298,46 @@ struct module *__module_text_address(unsigned long 
> addr)
>  }
>  EXPORT_SYMBOL_GPL(__module_text_address);
>
> +/**
> + * may_autoload_module - Determine whether a module auto-load operation
> + * is permitted
> + * @kmod_name: The module name
> + * @allow_cap: if positive, may allow to auto-load the module if this 
> capability
> + * is set
> + *
> + * Determine whether a module auto-load operation is allowed or not. The 
> check
> + * uses the sysctl "modules_autoload_mode" value.
> + *
> + * This allows to have more control on automatic module loading, and align it
> + * with explicit load/unload module operations. The kernel contains several
> + * modules, some of them are not updated often and may contain bugs and
> + * vulnerabilities.
> + *
> + * The "allow_cap" is passed by callers to explicitly note that the module 
> has
> + * the appropriate alias and that the "allow_cap" capability is set. This is
> + * for backward compatibility, the aim is to have a clear picture where:
> + *
> + * 1) Implicit module loading is allowed
> + * 2) Implicit module loading as with the explicit one requires 
> CAP_SYS_MODULE.
> + * 3) Implicit module loading as with the explicit one can be disabled.
> + *
> + * Returns 0 if the module request is allowed or -EPERM if not.
> + */
> +int may_autoload_module(char *kmod_name, int allow_cap)
> +{
> +   if (modules_autoload_mode == MODULES_AUTOLOAD_ALLOWED)
> +   return 0;
> +   else if (modules_autoload_mode == MODULES_AUTOLOAD_PRIVILEGED) {
> +   /* Check CAP_SYS_MODULE then allow_cap if valid */
> +   if (capable(CAP_SYS_MODULE) ||
> +   (allow_cap > 0 && capable(allow_cap)))

With the allow_cap check already happening in my suggestion for
__request_module(), it's not needed here. (In fact, it's not even
really needed to plumb this into the hook, I don't think?)

Regardless, I remain a fan. :)
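
For illustration only, here is a minimal userspace sketch of what the
mode check could look like once the allow_cap test is dropped from it,
as suggested above. It is not the patch itself: the numeric mode
values (0/1/2) and the mock_has_cap_sys_module flag, which stands in
for capable(CAP_SYS_MODULE), are assumptions made so the example
compiles outside the kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed numeric values; the thread only states that mode 2 means "disabled". */
enum {
	MODULES_AUTOLOAD_ALLOWED    = 0,
	MODULES_AUTOLOAD_PRIVILEGED = 1,
	MODULES_AUTOLOAD_DISABLED   = 2,
};

static int modules_autoload_mode = MODULES_AUTOLOAD_ALLOWED;

/* Hypothetical stand-in for capable(CAP_SYS_MODULE) in this mock. */
static bool mock_has_cap_sys_module;

/* Returns 0 if the auto-load request is allowed, -EPERM otherwise. */
static int may_autoload_module(void)
{
	if (modules_autoload_mode == MODULES_AUTOLOAD_ALLOWED)
		return 0;
	if (modules_autoload_mode == MODULES_AUTOLOAD_PRIVILEGED &&
	    mock_has_cap_sys_module)
		return 0;
	/* Mode 2, or mode 1 without the capability. */
	return -EPERM;
}
```

With the per-call capability (e.g. CAP_NET_ADMIN for "netdev-%s")
checked earlier in the request path, this function only has to look at
the sysctl mode and CAP_SYS_MODULE.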

-Kees

-- 
Kees Cook
Pixel Security

Re: [PATCH v4 next 1/3] modules:capabilities: allow __request_module() to take a capability argument

2017-05-22 Thread Kees Cook

On Mon, May 22, 2017 at 4:57 AM, Djalal Harouni  wrote:
> This is a preparation patch for the module auto-load restriction feature.
>
> In order to restrict module auto-load operations we need to check if the
> caller has CAP_SYS_MODULE capability. This allows to align security
> checks of automatic module loading with the checks of the explicit operations.
>
> However for "netdev-%s" modules, they are allowed to be loaded if
> CAP_NET_ADMIN is set. Therefore, in order to not break this assumption,
> and allow userspace to only load "netdev-%s" modules with CAP_NET_ADMIN
> capability which is considered a privileged operation, we have two
> choices: 1) parse "netdev-%s" alias and check the capability or 2) hand
> the capability form request_module() to security_kernel_module_request()
> hook and let the capability subsystem decide.
>
> After a discussion with Rusty Russell [1], the suggestion was to pass
> the capability from request_module() to security_kernel_module_request()
> for 'netdev-%s' modules that need CAP_NET_ADMIN.
>
> The patch does not update request_module(), it updates the internal
> __request_module() that will take an extra "allow_cap" argument. If
> positive, then automatic module load operation can be allowed.

I find this refactor slightly confusing. I would expect to collapse
the existing caps checks in net/core/dev_ioctl.c and
net/ipv4/tcp_cong.c, and make this a "required cap" argument, and to
add a new non-__ function instead of requiring callers use
__request_module.

request_module_capable(int cap_required, fmt, args);

adjust __request_module() for the new arg, and when cap_required !=
-1, perform a cap check.

Then make request_module pass -1 to __request_module(), and change
dev_ioctl.c (and tcp_cong.c) from:

if (no_module && capable(CAP_NET_ADMIN))
        no_module = request_module("netdev-%s", name);
if (no_module && capable(CAP_SYS_MODULE))
        request_module("%s", name);

to:

if (no_module)
        no_module = request_module_capable(CAP_NET_ADMIN,
                                           "netdev-%s", name);
if (no_module)
        no_module = request_module_capable(CAP_SYS_MODULE, "%s", name);

that'll make the code cleaner, too.
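
To make the shape of that suggestion concrete, here is a rough
userspace mock, not kernel code. The mock_caps bitmask, the printf()
standing in for the modprobe helper, the -EPERM return on a failed
capability check, and the load_netdev() caller are all assumptions
introduced for the example; only the request_module_capable() name,
the cap_required argument and the -1 convention come from the text
above.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Capability numbers as in linux/capability.h, repeated here for the mock. */
#define CAP_NET_ADMIN  12
#define CAP_SYS_MODULE 16

/* Hypothetical: bitmask of capabilities held by the "current task". */
static unsigned long mock_caps;

static bool capable(int cap)
{
	return (mock_caps >> cap) & 1UL;
}

/*
 * Mock of __request_module() with the extra argument. When cap_required
 * is not -1, the capability check happens here instead of in each caller.
 */
static int __request_module(bool wait, int cap_required, const char *fmt, ...)
{
	char name[64];
	va_list args;

	(void)wait;
	if (cap_required != -1 && !capable(cap_required))
		return -EPERM;	/* assumed error code */

	va_start(args, fmt);
	vsnprintf(name, sizeof(name), fmt, args);
	va_end(args);

	printf("modprobe %s\n", name);	/* stand-in for the usermode helper */
	return 0;
}

#define request_module(...) __request_module(true, -1, __VA_ARGS__)
#define request_module_capable(cap, ...) \
	__request_module(true, cap, __VA_ARGS__)

/* Hypothetical caller mirroring the dev_ioctl.c pattern quoted above. */
static int load_netdev(const char *name)
{
	int no_module = 1;	/* pretend the device lookup failed */

	if (no_module)
		no_module = request_module_capable(CAP_NET_ADMIN,
						   "netdev-%s", name);
	if (no_module)
		no_module = request_module_capable(CAP_SYS_MODULE, "%s", name);
	return no_module;
}
```

A task holding only CAP_NET_ADMIN gets the "netdev-%s" alias loaded,
one holding only CAP_SYS_MODULE falls through to the plain name, and a
task with neither gets an error from both attempts.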

> __request_module() will be only called by networking code which is the
> exception to this, so we do not break userspace and CAP_NET_ADMIN can
> continue to load 'netdev-%s' modules. Other kernel code should continue
> to use request_module() which calls security_kernel_module_request() and
> will check for CAP_SYS_MODULE capability in next patch. Allowing more
> control on who can trigger automatic module loading.
>
> This patch updates security_kernel_module_request() to take the
> 'allow_cap' argument and SELinux which is currently the only user of
> security_kernel_module_request() hook.
>
> Based on patch by Rusty Russell:
> https://lkml.org/lkml/2017/4/26/735
>
> Cc: Serge Hallyn 
> Cc: Andy Lutomirski 
> Suggested-by: Rusty Russell 
> Suggested-by: Kees Cook 
> Signed-off-by: Djalal Harouni 
>
> [1] https://lkml.org/lkml/2017/4/24/7
> ---
>  include/linux/kmod.h  | 15 ---
>  include/linux/lsm_hooks.h |  4 +++-
>  include/linux/security.h  |  4 ++--
>  kernel/kmod.c | 15 +--
>  net/core/dev_ioctl.c  | 10 +-
>  security/security.c   |  4 ++--
>  security/selinux/hooks.c  |  2 +-
>  7 files changed, 38 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> index c4e441e..a314432 100644
> --- a/include/linux/kmod.h
> +++ b/include/linux/kmod.h
> @@ -32,18 +32,19 @@
>  extern char modprobe_path[]; /* for sysctl */
>  /* modprobe exit status on success, -ve on error.  Return value
>   * usually useless though. */
> -extern __printf(2, 3)
> -int __request_module(bool wait, const char *name, ...);
> -#define request_module(mod...) __request_module(true, mod)
> -#define request_module_nowait(mod...) __request_module(false, mod)
> +extern __printf(3, 4)
> +int __request_module(bool wait, int allow_cap, const char *name, ...);
>  #define try_then_request_module(x, mod...) \
> -   ((x) ?: (__request_module(true, mod), (x)))
> +   ((x) ?: (__request_module(true, -1, mod), (x)))
>  #else
> -static inline int request_module(const char *name, ...) { return -ENOSYS; }
> -static inline int request_module_nowait(const char *name, ...) { return 
> -ENOSYS; }
> +static inline __printf(3, 4)
> +int __request_module(bool wait, int allow_cap, const char *name, ...)
> +{ return -ENOSYS; }
>  #define try_then_request_module(x, mod...) (x)
>  #endif
>
> +#define request_module(mod...) __request_module(true, -1, mod)
> +#define request_module_nowait(mod...) __request_module(false, -1, mod)
>
>  struct cred;
>  struct file;
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index f7914d9..7688f79 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -578,6 +578,8 @@
>   * Ability to tri

Re: [kernel-hardening] [PATCH v4 next 0/3] modules: automatic module loading restrictions

2017-05-22 Thread Djalal Harouni

On Mon, May 22, 2017 at 6:43 PM, Solar Designer  wrote:
> On Mon, May 22, 2017 at 03:49:15PM +0200, Djalal Harouni wrote:
>> On Mon, May 22, 2017 at 2:08 PM, Solar Designer  wrote:
>> > On Mon, May 22, 2017 at 01:57:03PM +0200, Djalal Harouni wrote:
>> >> *) When modules_autoload_mode is set to (2), automatic module loading is
>> >> disabled for all. Once set, this value can not be changed.
>> >
>> > What purpose does this securelevel-like property ("Once set, this value
>> > can not be changed.") serve here?  I think this mode 2 is needed, but
>> > without this extra property, which is bypassable by e.g. explicitly
>> > loaded kernel modules anyway (and that's OK).
>>
>> My reasoning about "Once set, this value can not be changed" is mainly for:
>>
>> If you have some systems where modules are not updated for any given
>> reason, then the only one who will be able to load a module is an
>> administrator, basically this is a shortcut for:
>>
>> * Apps/services can run with CAP_NET_ADMIN but they are not allowed to
>> auto-load 'netdev' modules.
>>
>> * Explicitly loading modules can be guarded by seccomp filters *per*
>> app, so even if these apps have
>>   CAP_SYS_MODULE they won't be able to explicitly load modules, one
>> has to remount some sysctl /proc/ entries read-only here and remove
>> CAP_SYS_ADMIN for all apps anyway.
>>
>> This mainly serves the purpose of these systems that do not receive
>> updates, if I don't want to expose those kernel interfaces what should
>> I do ? then if I want to unload old versions and replace them with new
>> ones what operation should be allowed ? and only real root of the
>> system can do it. Hence, the "Once set, this value can not be changed"
>> is more of a shortcut, also the idea was put in my mind based on how
>> "modules_disabled" is disabled forever, and some other interfaces. I
>> would say: it is easy to handle a transition from 1) "hey this system
>> is still up to date, some features should be exposed" to 2) "this
>> system is not up to date anymore, only root should expose some
>> features..."
>>
>> Hmm, I am not sure if this answers your question ? :-)
>
> This answers my question, but in a way that I summarize as "there's no
> good reason to include this securelevel-like property".
>

Hmm, sorry, I forgot to add in my previous comment that on such
systems CAP_SYS_MODULE can be used to reset the
"modules_autoload_mode" sysctl from mode 2 back to mode 1. Even if we
disable autoloading, privileged tasks can be made to overwrite the
sysctl flag and re-enable it unless /proc is read-only. That is one of
the points: it should not be so easy to relax it.
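
As a rough illustration of the "once set, cannot be changed"
behaviour under discussion, here is a userspace sketch of a write
handler that refuses to leave the maximum mode. The function name
autoload_mode_write(), its has_cap_sys_module parameter and the
-EPERM/-EINVAL return codes are assumptions for the example, not what
the patch does.

```c
#include <errno.h>
#include <stdbool.h>

static int modules_autoload_mode;		/* starts at 0: allowed */
static const int modules_autoload_max = 2;	/* mode 2: disabled */

/*
 * Hypothetical sysctl write handler. Once the mode reaches the maximum,
 * any attempt to write a different value is rejected.
 */
static int autoload_mode_write(int new_mode, bool has_cap_sys_module)
{
	if (!has_cap_sys_module)
		return -EPERM;
	if (new_mode < 0 || new_mode > modules_autoload_max)
		return -EINVAL;
	/* One-way latch: mode 2 cannot be relaxed back to 1 or 0. */
	if (modules_autoload_mode == modules_autoload_max &&
	    new_mode != modules_autoload_max)
		return -EINVAL;

	modules_autoload_mode = new_mode;
	return 0;
}
```

Without the latch check, a privileged task that is induced to write 1
to the sysctl would silently re-enable autoloading, which is the
scenario described above.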



>> I definitively don't want to fall into "modules_disabled" trap where
>> is it too strict! "Once set, this value can not be changed" means for
>> some users do not set it otherwise the system is unusable...
>>
>> Maybe an extra "4" mode for that ? better get it right.
>
> I think you should simply exclude this property from mode 2.
>

Ok, maybe my comment above answers this ?

What I was referring to here is to have one small window where it is
disabled for privileged tasks, kept separate from the securelevel-like
property that disables it definitively. I don't have a strong opinion
here; having a usable system is important.


> The module autoloading restrictions aren't meant to reduce root's
> powers; they're only meant to protect processes from shooting themselves
> and the system in the foot inadvertently (confused deputy).
>
> modules_disabled may be different in that respect, although with the
> rest of the kernel lacking securelevel-like support the point is moot.
>
> We had working securelevel in 2.0.34 through 2.0.40 inclusive, but
> we've lost it in 2.1+ with cap-bound apparently never becoming as
> complete a replacement for it and having been lost/broken further in
> 2.6.25+.  I regret this, but that's a different story.  Like I say,
> module autoloading doesn't even fit in with those restrictions - it's
> about a totally different threat model.
>

Ok, thanks for the information. So yes, it seems we do not have a
consistent way to do this, but that did not stop the Yama LSM and
other sysctls from implementing their own cases. Maybe it showed that
a generic securelevel mechanism is not that easy to build, and what we
currently have is more practical? I can't tell. But we definitely want
to prevent privileged tasks from reverting the sysctl mode if the
administrator does not want automatic module loading.

Thanks!

-- 
tixxdz

[GIT PULL] Power management updates for v4.12-rc3

2017-05-22 Thread Rafael J. Wysocki

Hi Linus,

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 pm-4.12-rc3

with top-most commit bb47e964175e5fb4c163066e4373fac055fe5da0

 Merge branches 'pm-sleep' and 'powercap'

on top of commit 08332893e37af6ae779367e78e444f8f9571511d

 Linux 4.12-rc2

to receive power management updates for v4.12-rc3.

These fix RTC wakeup from suspend-to-idle broken recently, fix CPU
idleness detection condition in the schedutil cpufreq governor, fix
a cpufreq driver build failure, fix an error code path in the power
capping framework, clean up the hibernate core and update the
intel_pstate documentation.

Specifics:

 - Fix RTC wakeup from suspend-to-idle broken by the recent rework
   of ACPI wakeup handling (Rafael Wysocki).

 - Update intel_pstate driver documentation to reflect the current
   code and explain how it works in more detail (Rafael Wysocki).

   That had dependencies in both the PM and documentation trees
   which all have been merged now.

 - Fix an issue related to CPU idleness detection on systems with
   shared cpufreq policies in the schedutil governor (Juri Lelli).

 - Fix a possible build issue in the dbx500 cpufreq driver (Arnd
   Bergmann).

 - Fix a function in the power capping framework core to return
   an error code instead of 0 when there's an error (Dan Carpenter).

 - Clean up variable definition in the hibernation core (Pushkar
   Jambhlekar).

Thanks!


---

Arnd Bergmann (1):
  cpufreq: dbx500: add a Kconfig symbol

Dan Carpenter (1):
  PowerCap: Fix an error code in powercap_register_zone()

Juri Lelli (1):
  cpufreq: schedutil: use now as reference when aggregating shared
policy requests

Pushkar Jambhlekar (1):
  PM / hibernate: Declare variables as static

Rafael J. Wysocki (3):
  cpufreq: intel_pstate: Document the current behavior and user interface
  PM / wakeup: Fix up wakeup_source_report_event()
  RTC: rtc-cmos: Fix wakeup from suspend-to-idle

---

 Documentation/admin-guide/pm/cpufreq.rst  |  19 +-
 Documentation/admin-guide/pm/index.rst|   1 +
 Documentation/admin-guide/pm/intel_pstate.rst | 755 ++
 Documentation/cpu-freq/intel-pstate.txt   | 281 --
 drivers/base/power/wakeup.c   |  11 +-
 drivers/cpufreq/Kconfig.arm   |   9 +
 drivers/cpufreq/Makefile  |   2 +-
 drivers/powercap/powercap_sys.c   |   1 +
 drivers/rtc/rtc-cmos.c|   2 +-
 kernel/power/snapshot.c   |   2 +-
 kernel/sched/cpufreq_schedutil.c  |   7 +-
 11 files changed, 787 insertions(+), 303 deletions(-)

Re: [PATCH v5 0/3] watchdog: allow setting deadline for opening /dev/watchdogN

2017-05-22 Thread Guenter Roeck

On Mon, May 22, 2017 at 07:07:51PM +0100, Alan Cox wrote:
> On Mon, 22 May 2017 16:06:36 +0200
> Rasmus Villemoes  wrote:
> 
> > If a watchdog driver tells the framework that the device is running,
> > the framework takes care of feeding the watchdog until userspace opens
> > the device. If the userspace application which is supposed to do that
> > never comes up properly, the watchdog is fed indefinitely by the
> > kernel. This can be especially problematic for embedded devices.
> > 
> > These patches allow one to set a maximum time for which the kernel
> > will feed the watchdog, thus ensuring that either userspace has come
> > up, or the board gets reset. This allows fallback logic in the
> > bootloader to attempt some recovery (for example, if an automatic
> > update is in progress, it could roll back to the previous version).
> 
> 
> This makes sense except for being a CONFIG_ option not a boot parameter.
> If it's a boot parameter then the same kernel works for multiple systems
> and is general. If it's compile time then you have to build a custom
> kernel.
> 
> For some embedded stuff that might not matter (although I bet they'd
> prefer it command line/device tree too) but for something like an x86
> platform where you are deploying a standard vendor supplied kernel it's
> bad to do it that way IMHO.
> 
> In other words I think you should drop patch 3 but the rest is good.
> 

Same here. Can we assume a formal Reviewed-by: from you for the first two
patches ?

Thanks,
Guenter

> Alan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-watchdog" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 23/29] vfio-mediated-device.txt: standardize document format

2017-05-22 Thread Kirti Wankhede



On 5/19/2017 6:56 AM, Mauro Carvalho Chehab wrote:
> Each text file under Documentation follows a different
> format. Some doesn't even have titles!
> 
> In this specific document, the title, copyright and authorship
> are added as if it were a C file!
> 
> Change its representation to follow the adopted standard,
> using ReST markups for it to be parseable by Sphinx:
> - convert document preambule to the proper format;
> - mark literal blocks;
> - adjust identation;
> - use numbered lists for references.
> 
> Signed-off-by: Mauro Carvalho Chehab 

Looks good to me.

Reviewed-by: Kirti Wankhede 

Thanks,
Kirti

> ---
>  Documentation/vfio-mediated-device.txt | 252 
> +
>  1 file changed, 130 insertions(+), 122 deletions(-)
> 
> diff --git a/Documentation/vfio-mediated-device.txt 
> b/Documentation/vfio-mediated-device.txt
> index e5e57b40f8af..1b3950346532 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -1,14 +1,17 @@
> -/*
> - * VFIO Mediated devices
> - *
> - * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
> - * Author: Neo Jia 
> - * Kirti Wankhede 
> - *
> - * This program is free software; you can redistribute it and/or modify
> - * it under the terms of the GNU General Public License version 2 as
> - * published by the Free Software Foundation.
> - */
> +.. include:: 
> +
> +=
> +VFIO Mediated devices
> +=
> +
> +:Copyright: |copy| 2016, NVIDIA CORPORATION. All rights reserved.
> +:Author: Neo Jia 
> +:Author: Kirti Wankhede 
> +
> +This program is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License version 2 as
> +published by the Free Software Foundation.
> +
>  
>  Virtual Function I/O (VFIO) Mediated devices[1]
>  ===
> @@ -42,7 +45,7 @@ removes it from a VFIO group.
>  
>  The following high-level block diagram shows the main components and 
> interfaces
>  in the VFIO mediated driver framework. The diagram shows NVIDIA, Intel, and 
> IBM
> -devices as examples, as these devices are the first devices to use this 
> module.
> +devices as examples, as these devices are the first devices to use this 
> module::
>  
>   +---+
>   |   |
> @@ -91,7 +94,7 @@ Registration Interface for a Mediated Bus Driver
>  
>  
>  The registration interface for a mediated bus driver provides the following
> -structure to represent a mediated device's driver:
> +structure to represent a mediated device's driver::
>  
>   /*
>* struct mdev_driver [2] - Mediated device's driver
> @@ -110,14 +113,14 @@ structure to represent a mediated device's driver:
>  A mediated bus driver for mdev should use this structure in the function 
> calls
>  to register and unregister itself with the core driver:
>  
> -* Register:
> +* Register::
>  
> -  extern int  mdev_register_driver(struct mdev_driver *drv,
> +extern int  mdev_register_driver(struct mdev_driver *drv,
>  struct module *owner);
>  
> -* Unregister:
> +* Unregister::
>  
> -  extern void mdev_unregister_driver(struct mdev_driver *drv);
> +extern void mdev_unregister_driver(struct mdev_driver *drv);
>  
>  The mediated bus driver is responsible for adding mediated devices to the 
> VFIO
>  group when devices are bound to the driver and removing mediated devices from
> @@ -152,15 +155,15 @@ The callbacks in the mdev_parent_ops structure are as follows:
>  * mmap: mmap emulation callback
>  
>  A driver should use the mdev_parent_ops structure in the function call to
> -register itself with the mdev core driver:
> +register itself with the mdev core driver::
>  
> -extern int  mdev_register_device(struct device *dev,
> - const struct mdev_parent_ops *ops);
> + extern int  mdev_register_device(struct device *dev,
> +  const struct mdev_parent_ops *ops);
>  
>  However, the mdev_parent_ops structure is not required in the function call
> -that a driver should use to unregister itself with the mdev core driver:
> +that a driver should use to unregister itself with the mdev core driver::
>  
> -extern void mdev_unregister_device(struct device *dev);
> + extern void mdev_unregister_device(struct device *dev);
>  
>  
>  Mediated Device Management Interface Through sysfs
> @@ -183,30 +186,32 @@ with the mdev core driver.
>  Directories and files under the sysfs for Each Physical Device
>  --
>  
> -|- [parent physical device]
> -|--- Vendor-specific-attributes [optional]
> -|--- [mdev_supported_types]
> -| |--- []
> -| |   |--- create
> -| |   |--- name
> -| |   |--- available_instances
> -| |   |--- device_api
> -| |   |--- descr

Re: [PATCH v5 0/3] watchdog: allow setting deadline for opening /dev/watchdogN

2017-05-22 Thread Alan Cox

On Mon, 22 May 2017 16:06:36 +0200
Rasmus Villemoes  wrote:

> If a watchdog driver tells the framework that the device is running,
> the framework takes care of feeding the watchdog until userspace opens
> the device. If the userspace application which is supposed to do that
> never comes up properly, the watchdog is fed indefinitely by the
> kernel. This can be especially problematic for embedded devices.
> 
> These patches allow one to set a maximum time for which the kernel
> will feed the watchdog, thus ensuring that either userspace has come
> up, or the board gets reset. This allows fallback logic in the
> bootloader to attempt some recovery (for example, if an automatic
> update is in progress, it could roll back to the previous version).

This makes sense except for being a CONFIG_ option not a boot parameter.
If it's a boot parameter then the same kernel works for multiple systems
and is general. If it's compile time then you have to build a custom
kernel.
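
For illustration only, here is a minimal userspace sketch (not kernel code) of the distinction being drawn: a compile-time default that a `watchdog.open_timeout=` token on the command line can override, so one binary serves several systems. `DEFAULT_OPEN_TIMEOUT_MS` and `parse_open_timeout()` are hypothetical names invented for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a compile-time default; 0 means "no deadline". */
#ifndef DEFAULT_OPEN_TIMEOUT_MS
#define DEFAULT_OPEN_TIMEOUT_MS 0u
#endif

/*
 * Return the open timeout in milliseconds for the given command line.
 * A "watchdog.open_timeout=<n>" token overrides the built-in default,
 * so the same binary can be deployed with different settings.
 */
static unsigned int parse_open_timeout(const char *cmdline)
{
	static const char key[] = "watchdog.open_timeout=";
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		/* Accept the key only at the start of a token. */
		if (p == cmdline || p[-1] == ' ')
			return (unsigned int)strtoul(p + sizeof(key) - 1, NULL, 10);
		p += sizeof(key) - 1;
	}
	return DEFAULT_OPEN_TIMEOUT_MS;
}
```

With only a compile-time value, changing the timeout means rebuilding; with the override, the value can be changed per boot.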

For some embedded stuff that might not matter (although I bet they'd
prefer it command line/device tree too) but for something like an x86
platform where you are deploying a standard vendor supplied kernel it's
bad to do it that way IMHO.

In other words I think you should drop patch 3 but the rest is good.

Alan

Re: [RFC PATCH v2 11/17] cgroup: Implement new thread mode semantics

2017-05-22 Thread Waiman Long

On 05/22/2017 01:13 PM, Waiman Long wrote:
> On 05/19/2017 04:26 PM, Tejun Heo wrote:
>>> @@ -2982,22 +3010,48 @@ static int cgroup_enable_threaded(struct cgroup *cgrp)
>>> LIST_HEAD(csets);
>>> struct cgrp_cset_link *link;
>>> struct css_set *cset, *cset_next;
>>> +   struct cgroup *child;
>>> int ret;
>>> +   u16 ss_mask;
>>>  
>>> lockdep_assert_held(&cgroup_mutex);
>>>  
>>> /* noop if already threaded */
>>> -   if (cgrp->proc_cgrp)
>>> +   if (cgroup_is_threaded(cgrp))
>>> return 0;
>>>  
>>> -   /* allow only if there are neither children or enabled controllers */
>>> -   if (css_has_online_children(&cgrp->self) || cgrp->subtree_control)
>>> +   /*
>>> +* Allow only if it is not the root and there are:
>>> +* 1) no children,
>>> +* 2) no non-threaded controllers are enabled, and
>>> +* 3) no attached tasks.
>>> +*
>>> +* With no attached tasks, it is assumed that no css_sets will be
>>> +* linked to the current cgroup. This may not be true if some dead
>>> +* css_sets linger around due to task_struct leakage, for example.
>>> +*/
>> It doesn't look like the code is actually making this (incorrect)
>> assumption.  I suppose the comment is from before
>> cgroup_is_populated() was added?
> Yes, it is a bug. I should have checked the tasks_count instead of using
> cgroup_is_populated. Thanks for catching that.

Sorry, I would like to take it back. I think cgroup_is_populated() will
be set if there is any task attached to the cgroup. So I think it is
doing the right thing with regard to (3).

Cheers,
Longman


Re: [RFC PATCH v2 11/17] cgroup: Implement new thread mode semantics

2017-05-22 Thread Waiman Long

On 05/19/2017 04:26 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Mon, May 15, 2017 at 09:34:10AM -0400, Waiman Long wrote:
>> Now we could have something like
>>
>>  R -- A -- B
>>   \
>>T1 -- T2
>>
>> where R is the thread root, A and B are non-threaded cgroups, T1 and
>> T2 are threaded cgroups. The cgroups R, T1, T2 form a threaded subtree
>> where all the non-threaded resources are accounted for in R.  The no
>> internal process constraint does not apply in the threaded subtree.
>> Non-threaded controllers need to properly handle the competition
>> between internal processes and child cgroups at the thread root.
>>
>> This model will be flexible enough to support the need of the threaded
>> controllers.
> Maybe I'm misunderstanding the design, but this seems to push the
> processes which belong to the threaded subtree to the parent which is
> part of the usual resource domain hierarchy thus breaking the no
> internal competition constraint.  I'm not sure this is something we'd
> want.  Given that the limitation of the original threaded mode was the
> required nesting below root and that we treat root special anyway
> (exactly in the way necessary), I wonder whether it'd be better to
> simply allow root to be both domain and thread root.

Yes, root can be both domain and thread root. I haven't placed any
restriction on that.

>
> Specific review points below but we'd probably want to discuss the
> overall design first.
>
>> +static inline bool cgroup_is_threaded(const struct cgroup *cgrp)
>> +{
>> +return cgrp->proc_cgrp && (cgrp->proc_cgrp != cgrp);
>> +}
>> +
>> +static inline bool cgroup_is_thread_root(const struct cgroup *cgrp)
>> +{
>> +return cgrp->proc_cgrp == cgrp;
>> +}
> Maybe add a bit of comments explaining what's going on with
> ->proc_cgrp?

Sure, will do that.
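
As a rough illustration of the kind of comment being asked for, the following userspace sketch spells out the three states the two helpers appear to distinguish. The reduced `struct cgroup` and the reading that a threaded cgroup's `proc_cgrp` points at its thread root are assumptions made for this sketch, not statements about the final patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct cgroup; only the field under discussion. */
struct cgroup {
	/*
	 * NULL        : ordinary, non-threaded cgroup
	 * == this     : thread root of a threaded subtree
	 * other value : threaded cgroup; assumed to point at its thread root
	 */
	struct cgroup *proc_cgrp;
};

/* True for a member of a threaded subtree other than the thread root. */
static inline bool cgroup_is_threaded(const struct cgroup *cgrp)
{
	return cgrp->proc_cgrp && (cgrp->proc_cgrp != cgrp);
}

/* True only for the cgroup that anchors a threaded subtree. */
static inline bool cgroup_is_thread_root(const struct cgroup *cgrp)
{
	return cgrp->proc_cgrp == cgrp;
}
```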

>>  /**
>> + * threaded_children_count - returns # of threaded children
>> + * @cgrp: cgroup to be tested
>> + *
>> + * cgroup_mutex must be held by the caller.
>> + */
>> +static int threaded_children_count(struct cgroup *cgrp)
>> +{
>> +struct cgroup *child;
>> +int count = 0;
>> +
>> +lockdep_assert_held(&cgroup_mutex);
>> +cgroup_for_each_live_child(child, cgrp)
>> +if (cgroup_is_threaded(child))
>> +count++;
>> +return count;
>> +}
> It probably would be a good idea to keep track of the count so that we
> don't have to count them each time.  There are cases where people end
> up creating a very high number of cgroups and we've already been
> bitten a couple times with silly complexity issues.

Thanks for the suggestion. I can keep a count in the cgroup structure to
avoid recounting each time.
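
A minimal userspace sketch of that idea, assuming a cached counter that is adjusted whenever a child's threaded state changes. The field `nr_threaded_children` and the helper `cgroup_set_threaded()` are hypothetical names used only for this illustration.

```c
#include <stdbool.h>

/* Simplified stand-in for struct cgroup; field names are hypothetical. */
struct cgroup {
	struct cgroup *parent;
	bool threaded;
	int nr_threaded_children;	/* cached count, kept in sync below */
};

/* Flip the threaded state of @cgrp and update the parent's cached count. */
static void cgroup_set_threaded(struct cgroup *cgrp, bool threaded)
{
	if (cgrp->threaded == threaded)
		return;		/* no change, so the count must not move */

	cgrp->threaded = threaded;
	if (cgrp->parent)
		cgrp->parent->nr_threaded_children += threaded ? 1 : -1;
}

/* O(1) replacement for walking all live children. */
static int threaded_children_count(const struct cgroup *cgrp)
{
	return cgrp->nr_threaded_children;
}
```

In the kernel the counter would presumably also have to be decremented when a threaded child goes away, and all updates would have to happen under cgroup_mutex; the sketch leaves both out.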

>
>> @@ -2982,22 +3010,48 @@ static int cgroup_enable_threaded(struct cgroup *cgrp)
>>  LIST_HEAD(csets);
>>  struct cgrp_cset_link *link;
>>  struct css_set *cset, *cset_next;
>> +struct cgroup *child;
>>  int ret;
>> +u16 ss_mask;
>>  
>>  lockdep_assert_held(&cgroup_mutex);
>>  
>>  /* noop if already threaded */
>> -if (cgrp->proc_cgrp)
>> +if (cgroup_is_threaded(cgrp))
>>  return 0;
>>  
>> -/* allow only if there are neither children or enabled controllers */
>> -if (css_has_online_children(&cgrp->self) || cgrp->subtree_control)
>> +/*
>> + * Allow only if it is not the root and there are:
>> + * 1) no children,
>> + * 2) no non-threaded controllers are enabled, and
>> + * 3) no attached tasks.
>> + *
>> + * With no attached tasks, it is assumed that no css_sets will be
>> + * linked to the current cgroup. This may not be true if some dead
>> + * css_sets linger around due to task_struct leakage, for example.
>> + */
> It doesn't look like the code is actually making this (incorrect)
> assumption.  I suppose the comment is from before
> cgroup_is_populated() was added?

Yes, it is a bug. I should have checked the tasks_count instead of using
cgroup_is_populated. Thanks for catching that.

>
>>  spin_lock_irq(&css_set_lock);
>>  list_for_each_entry(link, &cgrp->cset_links, cset_link) {
>>  cset = link->cset;
>> +if (cset->dead)
>> +continue;
> Hmm... is this a bug fix which is necessary regardless of whether we
> change the threadroot semantics or not?

That is true. I put it there because the reference counting bug
fixed in patch 6 left a lot of dead csets hanging around before the
fix. I can pull this out as a separate patch.

Cheers,
Longman


Re: [RFC PATCH] mm, oom: cgroup-aware OOM-killer

2017-05-22 Thread Roman Gushchin

On Sat, May 20, 2017 at 09:37:29PM +0300, Vladimir Davydov wrote:
> Hello Roman,

Hi Vladimir!

> 
> On Thu, May 18, 2017 at 05:28:04PM +0100, Roman Gushchin wrote:
> ...
> > +5-2-4. Cgroup-aware OOM Killer
> > +
> > +Cgroup v2 memory controller implements a cgroup-aware OOM killer.
> > +It means that it treats memory cgroups as memory consumers
> > +rather than individual processes. Under the OOM conditions it tries
> > +to find an eligible leaf memory cgroup, and kill all processes
> > +in this cgroup. If it's not possible (e.g. all processes belong
> > +to the root cgroup), it falls back to the traditional per-process
> > +behaviour.
> 
> I agree that the current OOM victim selection algorithm is totally
> unfair in a system using containers and it has been crying for rework
> for the last few years now, so it's great to see this finally coming.
> 
> However, I don't reckon that killing a whole leaf cgroup is always the
> best practice. It does make sense when cgroups are used for
> containerizing services or applications, because a service is unlikely
> to remain operational after one of its processes is gone, but one can
> also use cgroups to containerize processes started by a user. Kicking a
> user out because one of her processes has gone mad doesn't sound right to me.

I agree that it's not always the best practice if you're not allowed
to change the cgroup configuration (e.g. create new cgroups).
IMHO, this case is mostly covered by using the v1 cgroup interface,
which remains unchanged.
If you do have control over cgroups, you can put processes into
separate cgroups, and obtain control over OOM victim selection and killing.

> Another example when the policy you're suggesting fails in my opinion is
> in case a service (cgroup) consists of sub-services (sub-cgroups) that
> run processes. The main service may stop working normally if one of its
> sub-services is killed. So it might make sense to kill not just an
> individual process or a leaf cgroup, but the whole main service with all
> its sub-services.

I agree, although I do not claim to solve all possible
userspace problems caused by an OOM.

How to react to an OOM is definitely a matter of policy, which depends
on the workload. Nothing changes here from how it works now,
except that the kernel will now choose a victim cgroup and kill that cgroup
rather than a process.
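
To make the selection step concrete, here is a small userspace sketch of a "largest eligible leaf cgroup" policy as described in the proposed documentation text. The flattened `struct memcg` and `select_victim()` are hypothetical and are not taken from the patch, whose concrete scoring is explicitly left undefined.

```c
#include <stddef.h>

/* Hypothetical, flattened view of a memory cgroup for illustration. */
struct memcg {
	const char *name;
	int is_root;
	int nr_children;
	unsigned long usage_pages;
};

/*
 * Pick the leaf, non-root group with the largest usage.
 * Returns NULL when no group qualifies, in which case a caller would
 * fall back to per-process selection.
 */
static const struct memcg *select_victim(const struct memcg *groups, size_t n)
{
	const struct memcg *chosen = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct memcg *g = &groups[i];

		/* Skip the root and any group that still has children. */
		if (g->is_root || g->nr_children > 0)
			continue;
		if (!chosen || g->usage_pages > chosen->usage_pages)
			chosen = g;
	}
	return chosen;
}
```

All processes in the returned group would then be killed together, which is the behaviour the objection above is about.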

> And both kinds of workloads (services/applications and individual
> processes run by users) can co-exist on the same host - consider the
> default systemd setup, for instance.
> 
> IMHO it would be better to give users a choice regarding what they
> really want for a particular cgroup in case of OOM - killing the whole
> cgroup or one of its descendants. For example, we could introduce a
> per-cgroup flag that would tell the kernel whether the cgroup can
> tolerate killing a descendant or not. If it can, the kernel will pick
> the fattest sub-cgroup or process and check it. If it cannot, it will
> kill the whole cgroup and all its processes and sub-cgroups.

The last thing we want to do is to compare processes with cgroups.
I agree that we can have some option to disable the cgroup-aware OOM killer entirely,
mostly for backward compatibility. But I don't think it should be a
per-cgroup configuration option, which we will support forever.

> 
> > +
> > +The memory controller tries to make the best choice of a victim cgroup.
> > +In general, it tries to select the largest cgroup, matching given
> > +node/zone requirements, but the concrete algorithm is not defined,
> > +and may be changed later.
> > +
> > +This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
> > +the memory controller considers only cgroups belonging to a sub-tree
> > +of the OOM-ing cgroup, including itself.
> ...
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index c131f7e..8d07481 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2625,6 +2625,75 @@ static inline bool memcg_has_children(struct mem_cgroup *memcg)
> > return ret;
> >  }
> >  
> > +bool mem_cgroup_select_oom_victim(struct oom_control *oc)
> > +{
> > +   struct mem_cgroup *iter;
> > +   unsigned long chosen_memcg_points;
> > +
> > +   oc->chosen_memcg = NULL;
> > +
> > +   if (mem_cgroup_disabled())
> > +   return false;
> > +
> > +   if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +   return false;
> > +
> > +   pr_info("Choosing a victim memcg because of %s",
> > +   oc->memcg ?
> > +   "memory limit reached of cgroup " :
> > +   "out of memory\n");
> > +   if (oc->memcg) {
> > +   pr_cont_cgroup_path(oc->memcg->css.cgroup);
> > +   pr_cont("\n");
> > +   }
> > +
> > +   chosen_memcg_points = 0;
> > +
> > +   for_each_mem_cgroup_tree(iter, oc->memcg) {
> > +   unsigned long points;
> > +   int nid;
> > +
> > +   if (mem_cgroup_is_root(iter))
> > +   continue;
> > +
> > +   if (memcg_has_children(it

Re: [RFC PATCH v2 12/17] cgroup: Remove cgroup v2 no internal process constraint

2017-05-22 Thread Waiman Long

On 05/19/2017 04:38 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Mon, May 15, 2017 at 09:34:11AM -0400, Waiman Long wrote:
>> The rationale behind the cgroup v2 no internal process constraint is
>> to avoid resource competition between internal processes and child
>> cgroups. However, not all controllers have a problem with internal
>> process competition. Enforcing this rule may lead to unnatural process
>> hierarchy and unneeded levels for those controllers.
> This isn't necessarily something we can determine by looking at the
> current state of controllers.  It's true that some controllers - pid
> and perf - inherently only care about membership of each task but at
> the same time neither really suffers from the constraint either.  CPU
> which is the problematic one here and currently only cares about tasks
> actually distributes resources which have parts which are specific to
> domain rather than threads and we don't want to declare that CPU isn't
> domain aware resource because it inherently is.

I agree that it is hard to decide which controller should be regarded as
domain aware and which should not be. That is why I don't attempt to do
that in the v2 patchset.

Unlike my v1 patch where each controller has to be specifically marked
as being a resource domain and hence has special directory for internal
process resource control knobs, the v2 patch leaves the decision up to
the userland. Depending on the context, any controller can now have
special resource control knobs for internal processes in the
cgroup.resource_domain directory by writing the controller name to the
cgroup.resource_control file. So even the CPU controller can be regarded
as domain aware, if necessary. This is all part of my move to give as
much freedom and flexibility as possible to userland.

>> This patch removes the no internal process constraint by enabling those
>> controllers that don't like internal process competition to have a
>> separate set of control knobs just for internal processes in a cgroup.
>>
>> A new control file "cgroup.resource_control" is added. Enabling a
>> controller with a "+" prefix will create a separate set of control
>> knobs for that controller in the special "cgroup.resource_domain"
>> sub-directory for all the internal processes. The existing control
>> knobs in the cgroup will then be used to manage resource distribution
>> between internal processes as a group and other child cgroups.
> We would need to declare all major resource controllers to be needing
> that special sub-directory.  That'd work around the
> no-internal-process constraint but I don't think it is solving any
> real problems.  It's just the kernel doing something that userland can
> do with ease and more context.

All controllers can use the special sub-directory if userland chooses to
do so. The problem that I am trying to address in this patch is to allow
a more natural hierarchy that reflects a certain purpose, like the task
classification done by systemd. Restricting tasks only to leaf nodes
makes the hierarchy unnatural and probably difficult to manage.

Regards,
Longman


Re: [kernel-hardening] [PATCH v4 next 0/3] modules: automatic module loading restrictions

2017-05-22 Thread Solar Designer

On Mon, May 22, 2017 at 03:49:15PM +0200, Djalal Harouni wrote:
> On Mon, May 22, 2017 at 2:08 PM, Solar Designer  wrote:
> > On Mon, May 22, 2017 at 01:57:03PM +0200, Djalal Harouni wrote:
> >> *) When modules_autoload_mode is set to (2), automatic module loading is
> >> disabled for all. Once set, this value can not be changed.
> >
> > What purpose does this securelevel-like property ("Once set, this value
> > can not be changed.") serve here?  I think this mode 2 is needed, but
> > without this extra property, which is bypassable by e.g. explicitly
> > loaded kernel modules anyway (and that's OK).
> 
> My reasoning about "Once set, this value can not be changed" is mainly for:
> 
> If you have some systems where modules are not updated for any given
> reason, then the only one who will be able to load a module is an
> administrator, basically this is a shortcut for:
> 
> * Apps/services can run with CAP_NET_ADMIN but they are not allowed to
> auto-load 'netdev' modules.
> 
> * Explicitly loading modules can be guarded by seccomp filters *per*
> app, so even if these apps have
>   CAP_SYS_MODULE they won't be able to explicitly load modules, one
> has to remount some sysctl /proc/ entries read-only here and remove
> CAP_SYS_ADMIN for all apps anyway.
> 
> This mainly serves the purpose of these systems that do not receive
> updates, if I don't want to expose those kernel interfaces what should
> I do ? then if I want to unload old versions and replace them with new
> ones what operation should be allowed ? and only real root of the
> system can do it. Hence, the "Once set, this value can not be changed"
> is more of a shortcut, also the idea was put in my mind based on how
> "modules_disabled" is disabled forever, and some other interfaces. I
> would say: it is easy to handle a transition from 1) "hey this system
> is still up to date, some features should be exposed" to 2) "this
> system is not up to date anymore, only root should expose some
> features..."
> 
> Hmm, I am not sure if this answers your question ? :-)

This answers my question, but in a way that I summarize as "there's no
good reason to include this securelevel-like property".

> I definitely don't want to fall into the "modules_disabled" trap where
> it is too strict! "Once set, this value can not be changed" means
> some users will not set it, otherwise the system is unusable...
> 
> Maybe an extra "4" mode for that ? better get it right.

I think you should simply exclude this property from mode 2.

The module autoloading restrictions aren't meant to reduce root's
powers; they're only meant to protect processes from shooting themselves
and the system in the foot inadvertently (confused deputy).
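
A small userspace sketch of what mode 2 without the securelevel-like property could look like. Only the meaning of mode 2 comes from this thread; the meanings given to modes 0 and 1, and the names `may_autoload()` and `set_autoload_mode()`, are assumptions made for illustration.

```c
#include <stdbool.h>

/* Mode values modeled on the discussion; the meaning of 0 and 1 is assumed. */
enum autoload_mode {
	AUTOLOAD_ALLOWED    = 0,	/* assumed: any process may trigger autoload */
	AUTOLOAD_PRIVILEGED = 1,	/* assumed: only privileged processes may */
	AUTOLOAD_DISABLED   = 2,	/* from the thread: disabled for all */
};

/* Decide whether a request to autoload a module should proceed. */
static bool may_autoload(enum autoload_mode mode, bool privileged)
{
	switch (mode) {
	case AUTOLOAD_ALLOWED:
		return true;
	case AUTOLOAD_PRIVILEGED:
		return privileged;
	case AUTOLOAD_DISABLED:
	default:
		return false;
	}
}

/*
 * Setter without the securelevel-like property: an administrator may move
 * the mode in either direction, including away from AUTOLOAD_DISABLED.
 */
static bool set_autoload_mode(enum autoload_mode *cur, int val, bool is_admin)
{
	if (!is_admin || val < AUTOLOAD_ALLOWED || val > AUTOLOAD_DISABLED)
		return false;
	*cur = (enum autoload_mode)val;
	return true;
}
```

The check guards unprivileged or confused callers, while the setter leaves root's powers intact.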

modules_disabled may be different in that respect, although with the
rest of the kernel lacking securelevel-like support the point is moot.

We had working securelevel in 2.0.34 through 2.0.40 inclusive, but
we've lost it in 2.1+ with cap-bound apparently never becoming as
complete a replacement for it and having been lost/broken further in
2.6.25+.  I regret this, but that's a different story.  Like I say,
module autoloading doesn't even fit in with those restrictions - it's
about a totally different threat model.

Alexander

Re: [PATCH v4 2/3] usb: gadget: f_uac2: split out audio core

2017-05-22 Thread Jassi Brar

On Thu, May 18, 2017 at 4:07 AM, Ruslan Bilovol
 wrote:
> Abstract the peripheral side ALSA sound card code from
> the f_uac2 function into a component that can be called
> by various functions, so the various flavors can be split
> apart and selectively reused.
>
> Visible changes:
>  - add uac_params structure to pass audio parameters for
>g_audio_setup
>  - make ALSA sound card's name configurable
>  - add [in/out]_ep_maxpsize
>  - allocate snd_uac_chip structure during g_audio_setup
>  - add u_audio_[start/stop]_[capture/playback] functions
>
> Signed-off-by: Ruslan Bilovol 
> ---
>  drivers/usb/gadget/Kconfig|   4 +
>  drivers/usb/gadget/function/Makefile  |   1 +
>  drivers/usb/gadget/function/f_uac2.c  | 721 
> --
>  drivers/usb/gadget/function/u_audio.c | 661 +++
>  drivers/usb/gadget/function/u_audio.h |  95 +
>  drivers/usb/gadget/legacy/Kconfig |   1 +
>  6 files changed, 846 insertions(+), 637 deletions(-)
>  create mode 100644 drivers/usb/gadget/function/u_audio.c
>  create mode 100644 drivers/usb/gadget/function/u_audio.h
>
> diff --git a/drivers/usb/gadget/Kconfig b/drivers/usb/gadget/Kconfig
> index c164d6b..2ba0ace 100644
> --- a/drivers/usb/gadget/Kconfig
> +++ b/drivers/usb/gadget/Kconfig
> @@ -158,6 +158,9 @@ config USB_U_SERIAL
>  config USB_U_ETHER
> tristate
>
> +config USB_U_AUDIO
> +   tristate
> +
>  config USB_F_SERIAL
> tristate
>
> @@ -381,6 +384,7 @@ config USB_CONFIGFS_F_UAC2
> depends on SND
> select USB_LIBCOMPOSITE
> select SND_PCM
> +   select USB_U_AUDIO
> select USB_F_UAC2
> help
>   This Audio function is compatible with USB Audio Class
> diff --git a/drivers/usb/gadget/function/Makefile b/drivers/usb/gadget/function/Makefile
> index cb8c225..b29f2ae 100644
> --- a/drivers/usb/gadget/function/Makefile
> +++ b/drivers/usb/gadget/function/Makefile
> @@ -32,6 +32,7 @@ usb_f_mass_storage-y  := f_mass_storage.o storage_common.o
>  obj-$(CONFIG_USB_F_MASS_STORAGE)+= usb_f_mass_storage.o
>  usb_f_fs-y := f_fs.o
>  obj-$(CONFIG_USB_F_FS) += usb_f_fs.o
> +obj-$(CONFIG_USB_U_AUDIO)  += u_audio.o
>  usb_f_uac1-y   := f_uac1.o u_uac1.o
>  obj-$(CONFIG_USB_F_UAC1)   += usb_f_uac1.o
>  usb_f_uac2-y   := f_uac2.o
> diff --git a/drivers/usb/gadget/function/f_uac2.c b/drivers/usb/gadget/function/f_uac2.c
> index d4565b5..059a14a 100644
> --- a/drivers/usb/gadget/function/f_uac2.c
> +++ b/drivers/usb/gadget/function/f_uac2.c
> @@ -15,10 +15,7 @@
>  #include 
>  #include 
>
> -#include 
> -#include 
> -#include 
> -
> +#include "u_audio.h"
>  #include "u_uac2.h"
>
>  /*
> @@ -50,455 +47,23 @@
>  #define UNFLW_CTRL 8
>  #define OVFLW_CTRL 10
>
> -struct uac2_req {
> -   struct uac2_rtd_params *pp; /* parent param */
> -   struct usb_request *req;
> -};
> -
> -struct uac2_rtd_params {
> -   struct snd_uac2_chip *uac2; /* parent chip */
> -   bool ep_enabled; /* if the ep is enabled */
> -   /* Size of the ring buffer */
> -   size_t dma_bytes;
> -   unsigned char *dma_area;
> -
> -   struct snd_pcm_substream *ss;
> -
> -   /* Ring buffer */
> -   ssize_t hw_ptr;
> -
> -   void *rbuf;
> -
> -   size_t period_size;
> -
> -   unsigned max_psize;
> -   struct uac2_req *ureq;
> -
> -   spinlock_t lock;
> -};
> -
> -struct snd_uac2_chip {
> -   struct uac2_rtd_params p_prm;
> -   struct uac2_rtd_params c_prm;
> -
> -   struct snd_card *card;
> -   struct snd_pcm *pcm;
> -
> -   /* timekeeping for the playback endpoint */
> -   unsigned int p_interval;
> -   unsigned int p_residue;
> -
> -   /* pre-calculated values for playback iso completion */
> -   unsigned int p_pktsize;
> -   unsigned int p_pktsize_residue;
> -   unsigned int p_framesize;
> +struct f_uac2 {
> +   struct g_audio g_audio;
> +   u8 ac_intf, as_in_intf, as_out_intf;
> +   u8 ac_alt, as_in_alt, as_out_alt;   /* needed for get_alt() */
>  };
>
> -#define BUFF_SIZE_MAX  (PAGE_SIZE * 16)
> -#define PRD_SIZE_MAX   PAGE_SIZE
> -#define MIN_PERIODS4
> -
> -static struct snd_pcm_hardware uac2_pcm_hardware = {
> -   .info = SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER
> -| SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID
> -| SNDRV_PCM_INFO_PAUSE | SNDRV_PCM_INFO_RESUME,
> -   .rates = SNDRV_PCM_RATE_CONTINUOUS,
> -   .periods_max = BUFF_SIZE_MAX / PRD_SIZE_MAX,
> -   .buffer_bytes_max = BUFF_SIZE_MAX,
> -   .period_bytes_max = PRD_SIZE_MAX,
> -   .periods_min = MIN_PERIODS,
> -};
> -
> -struct audio_dev {
> -   u8 ac_intf, ac_alt;
> -   u8 as_out_intf, as_out_alt;
> -   u8 as_in_intf, as_in_alt;
> -
> -   struct usb_ep *in_ep, *out_ep;
> -   struct usb_function func;
> -   struc

[PATCH v5 2/3] watchdog: introduce watchdog.open_timeout commandline parameter

2017-05-22 Thread Rasmus Villemoes

The watchdog framework takes care of feeding a hardware watchdog until
userspace opens /dev/watchdogN. If that never happens for some reason
(buggy init script, corrupt root filesystem or whatnot) but the kernel
itself is fine, the machine stays up indefinitely. This patch allows
setting an upper limit for how long the kernel will take care of the
watchdog, thus ensuring that the watchdog will eventually reset the
machine.

This is particularly useful for embedded devices where some fallback
logic is implemented in the bootloader (e.g., use a different root
partition, boot from network, ...).

The open timeout is also used as a maximum time for an application to
re-open /dev/watchdogN after closing it.

A value of 0 (the default) means infinite timeout, preserving the
current behaviour.

Signed-off-by: Rasmus Villemoes 
---
 Documentation/watchdog/watchdog-parameters.txt |  9 +
 drivers/watchdog/watchdog_dev.c| 26 +-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt
index 914518a..4801ec6 100644
--- a/Documentation/watchdog/watchdog-parameters.txt
+++ b/Documentation/watchdog/watchdog-parameters.txt
@@ -8,6 +8,15 @@ See Documentation/admin-guide/kernel-parameters.rst for information on
 providing kernel parameters for builtin drivers versus loadable
 modules.
 
+The watchdog core currently understands one parameter,
+watchdog.open_timeout. This is the maximum time, in milliseconds, for
+which the watchdog framework will take care of pinging a hardware
+watchdog until userspace opens the corresponding /dev/watchdogN
+device. A value of 0 (the default) means an infinite timeout. Setting
+this to a non-zero value can be useful to ensure that either userspace
+comes up properly, or the board gets reset and allows fallback logic
+in the bootloader to try something else.
+
 
 -
 acquirewdt:
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index caa4b90..c807067 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -66,6 +66,7 @@ struct watchdog_core_data {
struct mutex lock;
unsigned long last_keepalive;
unsigned long last_hw_keepalive;
+   unsigned long open_deadline;
struct delayed_work work;
unsigned long status;   /* Internal status bits */
 #define _WDOG_DEV_OPEN 0   /* Opened ? */
@@ -80,6 +81,21 @@ static struct watchdog_core_data *old_wd_data;
 
 static struct workqueue_struct *watchdog_wq;
 
+static unsigned open_timeout;
+module_param(open_timeout, uint, 0644);
+
+static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
+{
+   if (!open_timeout)
+   return false;
+   return time_is_before_jiffies(data->open_deadline);
+}
+
+static void watchdog_set_open_deadline(struct watchdog_core_data *data)
+{
+   data->open_deadline = jiffies + msecs_to_jiffies(open_timeout);
+}
+
 static inline bool watchdog_need_worker(struct watchdog_device *wdd)
 {
/* All variables in milli-seconds */
@@ -196,7 +212,13 @@ static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data)
 {
struct watchdog_device *wdd = wd_data->wdd;
 
-   return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+   if (!wdd)
+   return false;
+
+   if (watchdog_active(wdd))
+   return true;
+
+   return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wd_data);
 }
 
 static void watchdog_ping_work(struct work_struct *work)
@@ -857,6 +879,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
watchdog_ping(wdd);
}
 
+   watchdog_set_open_deadline(wd_data);
watchdog_update_worker(wdd);
 
/* make sure that /dev/watchdog can be re-opened */
@@ -955,6 +978,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
 
/* Record time of most recent heartbeat as 'just before now'. */
wd_data->last_hw_keepalive = jiffies - 1;
+   watchdog_set_open_deadline(wd_data);
 
/*
 * If the watchdog is running, prevent its driver from being unloaded,
-- 
2.7.4


[PATCH v5 0/3] watchdog: allow setting deadline for opening /dev/watchdogN

2017-05-22 Thread Rasmus Villemoes

If a watchdog driver tells the framework that the device is running,
the framework takes care of feeding the watchdog until userspace opens
the device. If the userspace application which is supposed to do that
never comes up properly, the watchdog is fed indefinitely by the
kernel. This can be especially problematic for embedded devices.

These patches allow one to set a maximum time for which the kernel
will feed the watchdog, thus ensuring that either userspace has come
up, or the board gets reset. This allows fallback logic in the
bootloader to attempt some recovery (for example, if an automatic
update is in progress, it could roll back to the previous version).
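
The intended behaviour can be modeled in a few lines of userspace C. This is a sketch under simplifying assumptions, not the kernel implementation: times are plain millisecond counters rather than jiffies, counter wraparound is ignored, and `struct wd_model` and the `wd_*` functions are hypothetical names.

```c
#include <stdbool.h>

/* Userspace model; times are plain millisecond counters, not jiffies. */
struct wd_model {
	bool active;		/* userspace has opened and started the device */
	bool hw_running;	/* hardware watchdog is ticking */
	unsigned long open_deadline_ms;
};

static unsigned int open_timeout_ms;	/* 0 means no deadline */

/* Arm the deadline, e.g. at registration or when userspace closes the device. */
static void wd_set_open_deadline(struct wd_model *wd, unsigned long now_ms)
{
	wd->open_deadline_ms = now_ms + open_timeout_ms;
}

/* Should the kernel-side worker keep feeding the watchdog at @now_ms? */
static bool wd_worker_should_ping(const struct wd_model *wd, unsigned long now_ms)
{
	if (wd->active)
		return true;
	if (!wd->hw_running)
		return false;
	/* Not opened yet: feed only while the deadline has not passed. */
	return open_timeout_ms == 0 || now_ms <= wd->open_deadline_ms;
}
```

Once the worker stops pinging, the hardware watchdog expires on its own and resets the board, which is what gives the bootloader its chance to run fallback logic.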

The patches have been tested on a Raspberry Pi 2 and a Wandboard.

v5 is identical to v4 posted in January, just rebased to current
master (v4.12-rc2).

v4 is mostly identical to v1. The differences are that the ability to
compile out this feature is removed, and the ability to set the
default value for the watchdog.open_timeout command line parameter via
Kconfig is split into a separate patch.

Compared to v2/v3, this drops the ability to set the open_timeout via
a device property; I'll leave implementing that to those who actually
need it.

Rasmus Villemoes (3):
  watchdog: introduce watchdog_worker_should_ping helper
  watchdog: introduce watchdog.open_timeout commandline parameter
  watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT

 Documentation/watchdog/watchdog-parameters.txt | 10 +++
 drivers/watchdog/Kconfig   |  9 +++
 drivers/watchdog/watchdog_dev.c| 37 +++---
 3 files changed, 52 insertions(+), 4 deletions(-)

-- 
2.7.4


[PATCH v5 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT

2017-05-22 Thread Rasmus Villemoes

This allows setting a default value for the watchdog.open_timeout
commandline parameter via Kconfig.

Signed-off-by: Rasmus Villemoes 
---
 Documentation/watchdog/watchdog-parameters.txt | 9 +
 drivers/watchdog/Kconfig   | 9 +
 drivers/watchdog/watchdog_dev.c| 2 +-
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt
index 4801ec6..6a55a3d 100644
--- a/Documentation/watchdog/watchdog-parameters.txt
+++ b/Documentation/watchdog/watchdog-parameters.txt
@@ -12,10 +12,11 @@ The watchdog core currently understands one parameter,
 watchdog.open_timeout. This is the maximum time, in milliseconds, for
 which the watchdog framework will take care of pinging a hardware
 watchdog until userspace opens the corresponding /dev/watchdogN
-device. A value of 0 (the default) means an infinite timeout. Setting
-this to a non-zero value can be useful to ensure that either userspace
-comes up properly, or the board gets reset and allows fallback logic
-in the bootloader to try something else.
+device. The default value is CONFIG_WATCHDOG_OPEN_TIMEOUT. A value of 0
+means an infinite timeout. Setting this to a non-zero value can be
+useful to ensure that either userspace comes up properly, or the board
+gets reset and allows fallback logic in the bootloader to try
+something else.
 
 
 -
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 8b9049d..11946fb 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -52,6 +52,15 @@ config WATCHDOG_SYSFS
  Say Y here if you want to enable watchdog device status read through
  sysfs attributes.
 
+config WATCHDOG_OPEN_TIMEOUT
+   int "Timeout value for opening watchdog device"
+   default 0
+   help
+ The maximum time, in milliseconds, for which the watchdog
+ framework takes care of pinging a hardware watchdog. A value
+ of 0 means infinite. The value set here can be overridden by
+ the commandline parameter "watchdog.open_timeout".
+
 #
 # General Watchdog drivers
 #
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index c807067..098b9cb 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -81,7 +81,7 @@ static struct watchdog_core_data *old_wd_data;
 
 static struct workqueue_struct *watchdog_wq;
 
-static unsigned open_timeout;
+static unsigned open_timeout = CONFIG_WATCHDOG_OPEN_TIMEOUT;
 module_param(open_timeout, uint, 0644);
 
 static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
-- 
2.7.4
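
The interplay between the build-time default and the command line parameter can be sketched in plain userspace C. This is a hedged illustration only: effective_open_timeout() is a hypothetical helper, the #ifndef fallback stands in for the value Kconfig would generate, and the real kernel parses the parameter through module_param() rather than by string search.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the Kconfig-generated value; 0 means no deadline. */
#ifndef CONFIG_WATCHDOG_OPEN_TIMEOUT
#define CONFIG_WATCHDOG_OPEN_TIMEOUT 0
#endif

/*
 * Hypothetical helper: look for "watchdog.open_timeout=<ms>" in a
 * kernel-style command line and fall back to the build-time default.
 * Simplification: the search does not check word boundaries.
 */
unsigned int effective_open_timeout(const char *cmdline)
{
    static const char key[] = "watchdog.open_timeout=";
    const char *p = strstr(cmdline, key);

    if (!p)
        return CONFIG_WATCHDOG_OPEN_TIMEOUT;
    return (unsigned int)strtoul(p + strlen(key), NULL, 10);
}
```

With this model, a board that boots with "watchdog.open_timeout=30000" gets a 30 second deadline regardless of the Kconfig value, while a board that passes nothing gets whatever was chosen at build time.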


Re: [kernel-hardening] [PATCH v4 next 0/3] modules: automatic module loading restrictions

2017-05-22 Thread Djalal Harouni

Hi Alexander,

On Mon, May 22, 2017 at 2:08 PM, Solar Designer  wrote:
> Hi Djalal,
>
> Thank you for your work on this!
>
> On Mon, May 22, 2017 at 01:57:03PM +0200, Djalal Harouni wrote:
>> *) When modules_autoload_mode is set to (2), automatic module loading is
>> disabled for all. Once set, this value can not be changed.
>
> What purpose does this securelevel-like property ("Once set, this value
> can not be changed.") serve here?  I think this mode 2 is needed, but
> without this extra property, which is bypassable by e.g. explicitly
> loaded kernel modules anyway (and that's OK).

My reasoning about "Once set, this value can not be changed" is mainly for:

If you have some systems where modules are not updated for any given
reason, then the only one who will be able to load a module is an
administrator, basically this is a shortcut for:

* Apps/services can run with CAP_NET_ADMIN but they are not allowed to
auto-load 'netdev' modules.

* Explicitly loading modules can be guarded by seccomp filters *per*
app, so even if these apps have
  CAP_SYS_MODULE they won't be able to explicitly load modules, one
has to remount some sysctl /proc/ entries read-only here and remove
CAP_SYS_ADMIN for all apps anyway.

This mainly serves the purpose of these systems that do not receive
updates, if I don't want to expose those kernel interfaces what should
I do ? then if I want to unload old versions and replace them with new
ones what operation should be allowed ? and only real root of the
system can do it. Hence, the "Once set, this value can not be changed"
is more of a shortcut, also the idea was put in my mind based on how
"modules_disabled" is disabled forever, and some other interfaces. I
would say: it is easy to handle a transition from 1) "hey this system
is still up to date, some features should be exposed" to 2) "this
system is not up to date anymore, only root should expose some
features..."

Hmm, I am not sure if this answers your question ? :-)

I definitely don't want to fall into the "modules_disabled" trap where
it is too strict! "Once set, this value can not be changed" means that
some users will not set it, because otherwise the system is unusable...

Maybe an extra "4" mode for that ? better get it right.

Thanks!

-- 
tixxdz

Re: [PATCH 00/41] omap_hsmmc: Add ADMA support and UHS/HS200/DDR support

2017-05-22 Thread Kishon Vijay Abraham I

Hi,

On Saturday 20 May 2017 03:43 AM, Tony Lindgren wrote:
> Hi,
> 
> * Kishon Vijay Abraham I  [170519 01:19]:
>> This series adds UHS, HS200, DDR mode and ADMA support to
>> omap_hsmmc driver used to improve the throughput of MMC/SD in dra7
>> SoCs.
> 
> Certainly seems way less intrusive than earlier before the
> dmaengine changes :)
> 
>> *) tuning ratio of MMC in dra7 is different from sdhci
> 
> Hmm what's the tuning ratio?

For high speed modes like UHS SDR104 and HS200, the sampling clock has to be
adjusted according to the position of the data window, and it varies depending
on the host and the card.
For this we configure the controller with phase delay from 0 to 0x7c (in steps
of 4) and send the tuning command (CMD19 or CMD21) for which the card responds
with a known pattern. We keep track of the phase delays for which we received
the known patterns (without errors) and select one of the phase delays (3/4th
ratio from largest consecutive successful phase delay window).
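
The selection step can be sketched as follows. This is a rough userspace model, not the omap_hsmmc code: select_phase_delay() and the macro names are made up for illustration, and it assumes that "3/4th ratio" means picking the setting three quarters of the way into the largest run of consecutive passing phase delays.

```c
#include <stdbool.h>
#include <stddef.h>

#define PHASE_DELAY_MAX  0x7c
#define PHASE_DELAY_STEP 4
#define NUM_PHASES (PHASE_DELAY_MAX / PHASE_DELAY_STEP + 1) /* 32 settings */

/*
 * ok[i] is true when the tuning command returned the known pattern at
 * phase delay i * PHASE_DELAY_STEP. Returns the selected phase delay,
 * or -1 if no setting passed.
 */
int select_phase_delay(const bool ok[NUM_PHASES])
{
    size_t best_start = 0, best_len = 0;
    size_t run_start = 0, run_len = 0;

    /* Find the longest run of consecutive passing settings. */
    for (size_t i = 0; i < NUM_PHASES; i++) {
        if (!ok[i]) {
            run_len = 0;
            continue;
        }
        if (run_len == 0)
            run_start = i;
        run_len++;
        if (run_len > best_len) {
            best_len = run_len;
            best_start = run_start;
        }
    }
    if (best_len == 0)
        return -1;

    /* Assumption: take the point 3/4 of the way into that window. */
    size_t pick = best_start + (best_len * 3) / 4;
    return (int)(pick * PHASE_DELAY_STEP);
}
```

For example, if settings 8 through 23 pass (phase delays 0x20 to 0x5c), the window is 16 entries long and the model picks entry 20, i.e. phase delay 0x50.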

sdhci driver makes use of the HW feature for tuning (no manual setting of phase
delays etc). Sdhci has an ops for platform specific tuning but that again has
to be implemented in omap_hsmmc driver.
> 
>> This series has been tested on beagleboard, pandaboard, beaglebone-black,
>> beaglebone, am335x-evm, am437x-evm, dra7xx-evm, dra72x-evm, am571x-idk
>> and am572x-idk.
> 
> I gave this a quick try after manually applying next-20170519
> after reverting 67d0687224a9 ("mm: drop HASH_ADAPT"). Looks like
> something is missing as I got:
> 
> arch/arm/mach-omap2/pdata-quirks.c:445:23: error:
> 'struct omap_hsmmc_platform_data' has no member named 'version'
> ...
> 
> It's possible I messed up something while manually applying.

[PATCH 13/41] mmc: host: omap_hsmmc: Add support to set IODELAY values adds
version member to 'struct omap_hsmmc_platform_data'.
> 
>> I can split the series to go into Ulf Hansson's tree and Tony's tree 
>> separately if that is required.
> 
> Yes please. Maybe send the dts parts first that are ready to
> get merged, like the fixes and all the iodelay configuration.
> 
> Then the mmc driver changes as a separate set.
> 
> Then third set to follow-up and enable things once the mmc
> driver changes are merged.

Sure.

Thanks
Kishon

[PATCH v8 4/9] Documentation: perf: hisi: Documentation for HiP05/06/07 PMU event counting.

2017-05-22 Thread Shaokun Zhang

From: Anurup M 

Documentation for perf usage and Hisilicon SoC PMU uncore events.
The Hisilicon SOC has event counters for hardware modules like
L3 cache, Miscellaneous node etc. These events are all uncore.

Signed-off-by: Anurup M 
Signed-off-by: Shaokun Zhang 
---
 Documentation/perf/hisi-pmu.txt | 75 +
 1 file changed, 75 insertions(+)
 create mode 100644 Documentation/perf/hisi-pmu.txt

diff --git a/Documentation/perf/hisi-pmu.txt b/Documentation/perf/hisi-pmu.txt
new file mode 100644
index 000..a21571d
--- /dev/null
+++ b/Documentation/perf/hisi-pmu.txt
@@ -0,0 +1,75 @@
+Hisilicon SoC PMU (Performance Monitoring Unit)
+
+The Hisilicon SoC HiP05/06/07 chips consist of various independent system
+device PMU's such as L3 cache(L3C) and Miscellaneous Nodes(MN).
+These PMU devices are independent and have hardware logic to gather
+statistics and performance information.
+
+HiP0x chips are built from multiple CPU and IO dies. Each CPU die is called
+a Super CPU cluster (SCCL) and contains 16 CPU cores. Every SCCL is further
+divided into CPU clusters (CCL) of 4 CPU cores each. Each SCCL has one L3
+cache and one MN unit.
+
+The L3 cache is shared by all CPU cores in a CPU die. The L3C has four banks
+(or instances). Each bank or instance of L3C has eight 32-bit counter
+registers and also event control registers. The HiP05/06 chip L3 cache has
+22 statistics events. The HiP07 chip has 66 statistics events. These events
+are very useful for debugging.
+
+The MN module is also shared by all CPU cores in a CPU die. It receives
+barriers and DVM (Distributed Virtual Memory) messages from the CPU or SMMU,
+performs the required actions and returns response messages. These events
+are very useful for debugging. The MN has a total of 9 statistics events and
+supports four 32-bit counter registers in HiP05/06/07 chips.
+
+There is no memory mapping for the L3 cache and MN registers. They can be
+accessed through the Hisilicon djtag interface. The djtag in a SCCL is an
+independent module that connects to some modules in the SoC via the Debug Bus.
+
+Hisilicon SoC (HiP05/06/07) PMU driver
+--
+The HiP0x PMU driver shall register perf PMU drivers like L3 cache, MN etc.
+The available events and configuration options shall be described in the sysfs.
+The "perf list" shall list the available events from sysfs.
+
+The L3 cache in a SCCL is divided into 4 banks. Each L3 cache bank has separate
+PMU registers for event counting and control. The L3 cache banks also do not
+have any CPU affinity. So each L3 cache bank is registered with perf as a
+separate PMU.
+The PMU name will appear in event listing as hisi_l3c<bank-id>_<scl-id>,
+where "bank-id" is the bank index (0 to 3) and "scl-id" is the SCCL identifier,
+e.g. hisi_l3c0_2/read_hit is the READ_HIT event of L3 cache bank #0, SCCL ID #2.
+
+The MN in a SCCL is registered as a separate PMU with perf.
+The PMU name will appear in event listing as hisi_mn_<scl-id>,
+e.g. hisi_mn_2/read_req is the READ_REQUEST event of the MN of Super CPU cluster #2.
+
+The event code is represented by 8 bits.
+   i) event 0-7
+   The event code will be represented using the LSB 8 bits.
+
+The driver also provides a "cpumask" sysfs attribute, which shows the CPU core
+ID used to count the uncore PMU event.
+
+Example usage of perf:
+$# perf list
+hisi_l3c0_2/read_hit/ [kernel PMU event]
+--
+hisi_l3c1_2/write_hit/ [kernel PMU event]
+--
+hisi_l3c0_1/read_hit/ [kernel PMU event]
+--
+hisi_l3c0_1/write_hit/ [kernel PMU event]
+--
+hisi_mn_2/read_req/ [kernel PMU event]
+hisi_mn_2/write_req/ [kernel PMU event]
+--
+
+$# perf stat -a -e "hisi_l3c0_2/read_allocate/" sleep 5
+$# perf stat -A -C 0 -e "hisi_l3c0_2/read_allocate/" sleep 5
+
+The current driver does not support sampling, so "perf record" is unsupported.
+Attaching to a task is also unsupported, as the events are all uncore.
+
+Note: Please contact the maintainer for a complete list of events supported for
+the PMU devices in the SoC and its information if needed.
-- 
1.9.1
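
The naming scheme from the documentation in this patch (for example hisi_l3c0_2 for L3 cache bank #0 of SCCL #2, and hisi_mn_2 for the MN of SCCL #2) can be sketched in userspace C. The helper names are hypothetical and are not taken from the driver. Treating the 8-bit event code as the low byte of a perf config value is also an assumption made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Build the L3 cache PMU name, e.g. "hisi_l3c0_2" for bank 0 of SCCL 2. */
int hisi_l3c_pmu_name(char *buf, size_t len, unsigned bank_id, unsigned scl_id)
{
    return snprintf(buf, len, "hisi_l3c%u_%u", bank_id, scl_id);
}

/* Build the MN PMU name, e.g. "hisi_mn_2" for SCCL 2. */
int hisi_mn_pmu_name(char *buf, size_t len, unsigned scl_id)
{
    return snprintf(buf, len, "hisi_mn_%u", scl_id);
}

/* The event code is carried in the 8 least significant bits. */
uint8_t hisi_event_code(uint64_t config)
{
    return (uint8_t)(config & 0xff);
}
```

A name built this way is what would be passed to perf, as in: perf stat -a -e "hisi_l3c0_2/read_allocate/" sleep 5.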


Re: [kernel-hardening] [PATCH v4 next 0/3] modules: automatic module loading restrictions

2017-05-22 Thread Solar Designer

Hi Djalal,

Thank you for your work on this!

On Mon, May 22, 2017 at 01:57:03PM +0200, Djalal Harouni wrote:
> *) When modules_autoload_mode is set to (2), automatic module loading is
> disabled for all. Once set, this value can not be changed.

What purpose does this securelevel-like property ("Once set, this value
can not be changed.") serve here?  I think this mode 2 is needed, but
without this extra property, which is bypassable by e.g. explicitly
loaded kernel modules anyway (and that's OK).

I'm sorry if this has been discussed before.

Alexander

[PATCH v4 next 0/3] modules: automatic module loading restrictions

2017-05-22 Thread Djalal Harouni

Hi List,

This is v4 of the automatic module load restriction series.

v1 and v2 implementation were presented as a stackable LSM [1] [2].
v3 was updated to be integrated with the core kernel inside the
capabilities subsystem as suggested by Kees Cook [3].

This v4 is even better, lot of documentation and code comments.

All suggestions were improved and fixed. Please see changelog for more
details.

These patches are against next-20170522

==

Currently, an explicit call to load or unload kernel modules requires the
CAP_SYS_MODULE capability. However, unprivileged users have always been
able to load some modules using the implicit auto-load operation.
Automatic module loading happens when programs request a kernel feature
from a module that is not loaded. In order to satisfy userspace, the
kernel then automatically loads all the required modules.

Historically, the kernel was always able to automatically load modules
if they are not blacklisted. This is one of the most important and
transparent operations of Linux, it allows to provide numerous other
features as they are needed which is crucial for a better user experience.
However, as Linux is popular now and used for different appliances some
of these may need to control such operations. For such systems, recent
needs showed that in some cases allowing to control automatic module
loading is as important as the operation itself. Restricting unprivileged
programs or attackers that abuse this feature to load unused modules or
modules that contain bugs is a significant security measure.

This allows administrators or some special programs to have the
appropriate time to update and deny module autoloading in advance, then
blacklist the corresponding ones. Not doing so may affect the global state
of the machine, especially containers where some apps are moved from one
context to another and not having such mechanisms may allow to expose
and exploit the vulnerable parts to escape the container sandbox.

Embedded or IoT devices also started to ship as containers using generic
distros, some vendors do not have the appropriate time to make their own
OS, hence, using base images is getting popular. These setups may include
unnecessary modules that the final applications will not need. Untrusted
access may abuse the module auto-load feature to expose vulnerabilities.

As all code contains bugs or vulnerabilities, the following
vulnerabilities that affected some features that are often compiled as
modules could have been completely blocked, by restricting autoloading
modules if the system does not need them.

Past months:
* DCCP use after free CVE-2017-6074 [4]
  Unprivileged to local root.

* XFRM framework CVE-2017-7184 [5]
  As advertised it seems it was used to break Ubuntu on a security
  contest.

* n_hdlc CVE-2017-2636
* L2TPv3 CVE-2016-10200

This is a short list.


To improve the current status, this series introduces a global
"modules_autoload_mode" sysctl flag, and a per-task one.

The sysctl controls modules auto-load feature and complements
"modules_disabled" which apply to all modules operations. This new flag
allows to control only automatic module loading and if it is allowed or
not, aligning in the process the implicit operation with the explicit one
where both now are covered by capabilities checks.

The three modes that "modules_autoload_mode" sysctl support allow to
provide restrictions on automatic module loading without breaking user
experience.

The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"

*) When modules_autoload_mode is set to (0), the default, there are no
restrictions.

*) When modules_autoload_mode is set to (1), processes must have
CAP_SYS_MODULE to be able to trigger a module auto-load operation,
or CAP_NET_ADMIN for modules with a 'netdev-%s' alias.

*) When modules_autoload_mode is set to (2), automatic module loading is
disabled for all. Once set, this value can not be changed.
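
The three modes can be summarised as a small decision function. This is a userspace sketch under stated assumptions, not the code from the series: the enum, the struct and autoload_permitted() are hypothetical names, and capabilities are modelled as plain booleans.

```c
#include <stdbool.h>

/* Hypothetical names for the three values of modules_autoload_mode. */
enum autoload_mode {
    AUTOLOAD_ALLOWED = 0,     /* no restrictions (default) */
    AUTOLOAD_PRIVILEGED = 1,  /* capability required */
    AUTOLOAD_DISABLED = 2,    /* disabled for all */
};

/* Capabilities of the process that triggered the auto-load. */
struct autoload_caller {
    bool cap_sys_module;
    bool cap_net_admin;
};

/* netdev_alias is true when the requested module alias is 'netdev-%s'. */
bool autoload_permitted(enum autoload_mode mode,
                        const struct autoload_caller *c, bool netdev_alias)
{
    switch (mode) {
    case AUTOLOAD_ALLOWED:
        return true;
    case AUTOLOAD_PRIVILEGED:
        if (c->cap_sys_module)
            return true;
        return netdev_alias && c->cap_net_admin;
    case AUTOLOAD_DISABLED:
    default:
        return false;
    }
}
```

In mode 1, a process that only has CAP_NET_ADMIN can still trigger loading of a 'netdev-%s' module, but not of any other module.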

Notes on relation between "modules_disabled=0" and
"modules_autoload_mode=2":
1) Restricting automatic module loading does not interfere with
explicit module load or unload operations.

2) New features provided by modules can be made available without
rebooting the system.

3) A bad version of a module can be unloaded and replaced with a
better one without rebooting the system.


The original idea of module auto-load restriction comes from
'GRKERNSEC_MODHARDEN' config option.

==

The patches also support process trees, containers, and sandboxes by
providing an inherited per-task "modules_autoload_mode" flag that cannot be
re-enabled once disabled. This offers the following advantages:

1) Automatic module loading is still available to the rest of the
system.

2) It is easy to use in containers and sandboxes. DCCP example could
have been used to escape containers. The XFRM framework CVE-2017-7184
needs CA

[PATCH v4 next 2/3] modules:capabilities: automatic module loading restriction

2017-05-22 Thread Djalal Harouni

Currently, an explicit call to load or unload kernel modules requires the
CAP_SYS_MODULE capability. However, unprivileged users have always been
able to load some modules using the implicit auto-load operation.
Automatic module loading happens when programs request a kernel feature
from a module that is not loaded. In order to satisfy userspace, the
kernel then automatically loads all the required modules.

Historically, the kernel was always able to automatically load modules
if they are not blacklisted. This is one of the most important and
transparent operations of Linux, it allows to provide numerous other
features as they are needed which is crucial for a better user experience.
However, as Linux is popular now and used for different appliances some
of these may need to control such operations. For such systems, recent
needs showed that in some cases allowing to control automatic module
loading is as important as the operation itself. Restricting unprivileged
programs or attackers that abuse this feature to load unused modules or
modules that contain bugs is a significant security measure.

This allows administrators or some special programs to have the
appropriate time to update and deny module autoloading in advance, then
blacklist the corresponding ones. Not doing so may affect the global state
of the machine, especially containers where some apps are moved from one
context to another and not having such mechanisms may allow to expose
and exploit the vulnerable parts to escape the container sandbox.

Embedded or IoT devices also started to ship as containers using generic
distros, some vendors do not have the appropriate time to make their own
OS, hence, using base images is getting popular. These setups may include
unnecessary modules that the final applications will not need. Untrusted
access may abuse the module auto-load feature to expose vulnerabilities.

As all code contains bugs or vulnerabilities, the following
vulnerabilities that affected some features that are often compiled as
modules could have been completely blocked, by restricting autoloading
modules if the system does not need them.

Past months:
* DCCP use after free CVE-2017-6074 [1]
  Unprivileged to local root.

* XFRM framework CVE-2017-7184 [2]
  As advertised it seems it was used to break Ubuntu on a security
  contest.

* n_hdlc CVE-2017-2636
* L2TPv3 CVE-2016-10200

This is a short list. Fixing this is a high priority.

To improve the current status, this patch introduces "modules_autoload_mode"
kernel sysctl flag. The flag controls modules auto-load feature and
complements "modules_disabled" which apply to all modules operations.
This new flag allows to control only automatic module loading and if it is
allowed or not, aligning in the process the implicit operation with the
explicit one where both now are covered by capabilities checks.

The three modes that "modules_autoload_mode" support allow to provide
restrictions on automatic module loading without breaking user
experience.

The sysctl flag is available at "/proc/sys/kernel/modules_autoload_mode"

When modules_autoload_mode is set to (0), the default, there are no
restrictions.

When modules_autoload_mode is set to (1), processes must have
CAP_SYS_MODULE to be able to trigger a module auto-load operation,
or CAP_NET_ADMIN for modules with a 'netdev-%s' alias.

When modules_autoload_mode is set to (2), automatic module loading is
disabled for all. Once set, this value can not be changed.
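
The "once set, this value can not be changed" property of mode 2 can be modelled as a write handler that refuses to leave that state. This is an illustrative sketch only: the struct and function names are hypothetical, the error codes are assumptions (the real handler may return something else), and the privilege check needed to write the sysctl at all is left out.

```c
#include <errno.h>

/* Hypothetical model of the global sysctl value. */
struct autoload_sysctl {
    int mode; /* 0, 1 or 2 */
};

/* Returns 0 on success or a negative errno value. */
int autoload_sysctl_write(struct autoload_sysctl *s, int new_mode)
{
    if (new_mode < 0 || new_mode > 2)
        return -EINVAL;
    if (s->mode == 2 && new_mode != 2)
        return -EPERM; /* assumption: the actual error code may differ */
    s->mode = new_mode;
    return 0;
}
```

In this model, moving between modes 0 and 1 stays possible in both directions; only mode 2 is sticky until reboot.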

Notes on relation between "modules_disabled=0" and
"modules_autoload_mode=2":
1) Restricting automatic module loading does not interfere with
explicit module load or unload operations.

2) New features provided by modules can be made available without
rebooting the system.

3) A bad version of a module can be unloaded and replaced with a
better one without rebooting the system.

The original idea of module auto-load restriction comes from
'GRKERNSEC_MODHARDEN' config option.

Testing
---

Example 1)

Before:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip   16384  0
tunnel416384  1 ipip
ip_tunnel  28672  1 ipip
$ cat /proc/sys/kernel/modules_autoload_mode
0

After:
$ lsmod | grep ipip -
# echo 2 > /proc/sys/kernel/modules_autoload_mode
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
$ dmesg
...
[ 1876.378389] module: automatic module loading of netdev-tunl0 by "ip"[1453] was denied
[ 1876.380994] module: automatic module loading of tunl0 by "ip"[1453] was denied
...
$ lsmod | grep ipip -
$

Example 2)

DCCP use after free CVE-2017-6074:
The code path can be triggered by unprivileged, using the trigger.c
program for DCCP use after free [3] and that was fixed by
commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before:
$ lsmod | grep dccp
$ strace ./dccp_t

[PATCH v4 next 3/3] modules:capabilities: add a per-task modules auto-load mode

2017-05-22 Thread Djalal Harouni

Previous patches added the global sysctl "modules_autoload_mode". This patch
makes it possible to support process trees, containers, and sandboxes by
providing an inherited per-task "modules_autoload_mode" flag that cannot be
re-enabled once disabled. This allows to restrict automatic module loading
without affecting the rest of the system.

Why we need this ?

Usually a request to a kernel feature that is implemented by a module
that is not loaded may trigger automatic module loading feature,
allowing to transparently satisfy userspace, and provide numerous
features as they are needed. In this case an implicit kernel module load
operation happens.

In most cases to load or unload a kernel module, an explicit operation
happens where programs are required to have CAP_SYS_MODULE capability to
perform so. However, in general with implicit module loading, no
capabilities are required as automatic module loading is one of the most
important and transparent operations of Linux.

Recent vulnerabilities showed that automatic module loading can be
abused in order to expose more bugs. Some of these vulnerabilities are:

* DCCP use after free CVE-2017-6074 [1]
  Unprivileged to local root PoC.

* XFRM framework CVE-2017-7184 [2]
  As advertised it seems it was used to break Ubuntu at a security
  contest.

* n_hdlc CVE-2017-2636
* L2TPv3 CVE-2016-10200

Currently most of Linux code is in a form of modules, and not all
modules are written or maintained in the same way. In a container or
sandbox world, apps can be moved from one context to another or from
one Linux system to another one, the ability to restrict some of these
apps to load extra kernel modules will prevent exposing some kernel
interfaces that have not been updated within such systems.

The DCCP vulnerability CVE-2017-6074 that can be triggered by
unprivileged, or CVE-2017-7184 in the XFRM framework are some recent
real examples. CVE-2017-7184 was used to break Ubuntu at a security
contest. Ubuntu is more of desktop distro, using a global switch to
disable automatic module loading will harm users. Actually this design
will always end up being ignored by such kind of systems that need to
offer a competitive and interactive solution for their users.

From this and from observing how apps are being run, this patch
introduces a per-task "modules_autoload_mode" to restrict automatic
module loading. This offers the following advantages:

1) Automatic module loading is still available to the rest of the
system.

2) It is easy to use in containers and sandboxes. DCCP example could
have been used to escape containers. The XFRM framework CVE-2017-7184
needs CAP_NET_ADMIN, but attackers may start to target CAP_NET_ADMIN,
a per-task flag will make it harder.

3) Suitable for desktop and more interactive Linux systems.

4) Will allow in future to implement a per user policy.
The user database format is old and not extensible, as discussed maybe
with a modern format we may achieve the following:

User=djalal
NewKernelFeatures=yes

Which means that that interactive user will be allowed to load extra
Linux features. Others, volatile accounts or guests can be easily
blocked from doing so.

5) CAP_NET_ADMIN is useful, it handles lot of operations, at same time it
started to look more like CAP_SYS_ADMIN which is overloaded. We need
CAP_NET_ADMIN, containers need it, but at same time maybe we do not
want programs running with it to load 'netdev-%s' modules. Having an
extra per-task flag allow to discharge a bit CAP_NET_ADMIN and clearly
target automatic module loading operations.

Usage:
--

To set the per-task "modules_autoload_mode":

prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);

When a module auto-load request is triggered by current task, then the
operation has first to satisfy the per-task access mode before attempting
to implicitly load the module. Once set, this setting is inherited across
fork, clone and execve.

Prior to use, the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) or run with
CAP_SYS_ADMIN privileges in its namespace.  If these are not true, -EACCES
will be returned.  This requirement ensures that unprivileged programs cannot
affect the behaviour or surprise privileged children.

The per-task "modules_autoload_mode" supports the following values:
0   There are no restrictions, usually the default unless set
by parent.
1   The task must have CAP_SYS_MODULE to be able to trigger a
module auto-load operation, or CAP_NET_ADMIN for modules with
a 'netdev-%s' alias.
2   Automatic modules loading is disabled for the current task.

The mode may only be increased, never decreased, thus ensuring that once
applied, processes can never relax their setting. This makes it easy for
developers and users to handle.
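
The per-task rules above (precondition, allowed values, increase only) can be sketched as a userspace model. None of this is the patch's code: struct task_model and task_set_autoload_mode() are hypothetical, -EACCES is taken from the description above, and both the -EPERM returned for an attempted decrease and the order of the checks are assumptions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the per-task state relevant to the prctl. */
struct task_model {
    int autoload_mode;  /* 0, 1 or 2; inherited across fork/clone/execve */
    bool no_new_privs;  /* task called prctl(PR_SET_NO_NEW_PRIVS, 1) */
    bool cap_sys_admin; /* CAP_SYS_ADMIN in its namespace */
};

/* Returns 0 on success or a negative errno value. */
int task_set_autoload_mode(struct task_model *t, int mode)
{
    if (mode < 0 || mode > 2)
        return -EINVAL;
    if (!t->no_new_privs && !t->cap_sys_admin)
        return -EACCES; /* precondition from the description above */
    if (mode < t->autoload_mode)
        return -EPERM;  /* assumption: error code for relaxing the mode */
    t->autoload_mode = mode;
    return 0;
}
```

A sandboxing tool would set no_new_privs, raise the mode to 2, and then exec the untrusted program, which inherits the setting and cannot lower it again.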

Note that even if the per-task "modules_autoload_mode" allows to auto-load
the corresponding modules, automatic module loading may still fail due to
the global sysctl "modules_autoload_mode". For more detai

[PATCH v4 next 1/3] modules:capabilities: allow __request_module() to take a capability argument

2017-05-22 Thread Djalal Harouni

This is a preparation patch for the module auto-load restriction feature.

In order to restrict module auto-load operations we need to check if the
caller has CAP_SYS_MODULE capability. This allows to align security
checks of automatic module loading with the checks of the explicit operations.

However for "netdev-%s" modules, they are allowed to be loaded if
CAP_NET_ADMIN is set. Therefore, in order to not break this assumption,
and allow userspace to only load "netdev-%s" modules with CAP_NET_ADMIN
capability which is considered a privileged operation, we have two
choices: 1) parse "netdev-%s" alias and check the capability or 2) hand
the capability from request_module() to security_kernel_module_request()
hook and let the capability subsystem decide.

After a discussion with Rusty Russell [1], the suggestion was to pass
the capability from request_module() to security_kernel_module_request()
for 'netdev-%s' modules that need CAP_NET_ADMIN.

The patch does not update request_module(), it updates the internal
__request_module() that will take an extra "allow_cap" argument. If
positive, then automatic module load operation can be allowed.

__request_module() will be only called by networking code which is the
exception to this, so we do not break userspace and CAP_NET_ADMIN can
continue to load 'netdev-%s' modules. Other kernel code should continue
to use request_module() which calls security_kernel_module_request() and
will check for CAP_SYS_MODULE capability in next patch. Allowing more
control on who can trigger automatic module loading.
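
How the extra "allow_cap" argument could feed into the capability decision is sketched below as a userspace model. It is not the hook implementation from this series: struct cred_model, model_capable() and module_request_allowed() are hypothetical, and the sketch models the restricted case introduced by the next patch, where CAP_SYS_MODULE is otherwise required. The two capability numbers match linux/capability.h.

```c
#include <stdbool.h>

#define CAP_NET_ADMIN  12
#define CAP_SYS_MODULE 16

/* Hypothetical stand-in for a task's effective capability set. */
struct cred_model {
    unsigned long long caps; /* bit N set means capability N is held */
};

bool model_capable(const struct cred_model *c, int cap)
{
    if (cap < 0 || cap >= 64)
        return false;
    return (c->caps >> cap) & 1ULL;
}

/*
 * allow_cap > 0 names a capability that is also sufficient to trigger
 * the auto-load (e.g. CAP_NET_ADMIN for 'netdev-%s'); request_module()
 * passes -1, meaning only CAP_SYS_MODULE is accepted.
 */
bool module_request_allowed(const struct cred_model *c, int allow_cap)
{
    if (model_capable(c, CAP_SYS_MODULE))
        return true;
    return allow_cap > 0 && model_capable(c, allow_cap);
}
```

Networking code that loads 'netdev-%s' modules would pass CAP_NET_ADMIN as allow_cap, while every other caller goes through request_module() and ends up with -1.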

This patch updates security_kernel_module_request() to take the
'allow_cap' argument and SELinux which is currently the only user of
security_kernel_module_request() hook.

Based on patch by Rusty Russell:
https://lkml.org/lkml/2017/4/26/735

Cc: Serge Hallyn 
Cc: Andy Lutomirski 
Suggested-by: Rusty Russell 
Suggested-by: Kees Cook 
Signed-off-by: Djalal Harouni 

[1] https://lkml.org/lkml/2017/4/24/7
---
 include/linux/kmod.h  | 15 ---
 include/linux/lsm_hooks.h |  4 +++-
 include/linux/security.h  |  4 ++--
 kernel/kmod.c | 15 +--
 net/core/dev_ioctl.c  | 10 +-
 security/security.c   |  4 ++--
 security/selinux/hooks.c  |  2 +-
 7 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index c4e441e..a314432 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -32,18 +32,19 @@
 extern char modprobe_path[]; /* for sysctl */
 /* modprobe exit status on success, -ve on error.  Return value
  * usually useless though. */
-extern __printf(2, 3)
-int __request_module(bool wait, const char *name, ...);
-#define request_module(mod...) __request_module(true, mod)
-#define request_module_nowait(mod...) __request_module(false, mod)
+extern __printf(3, 4)
+int __request_module(bool wait, int allow_cap, const char *name, ...);
 #define try_then_request_module(x, mod...) \
-   ((x) ?: (__request_module(true, mod), (x)))
+   ((x) ?: (__request_module(true, -1, mod), (x)))
 #else
-static inline int request_module(const char *name, ...) { return -ENOSYS; }
-static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
+static inline __printf(3, 4)
+int __request_module(bool wait, int allow_cap, const char *name, ...)
+{ return -ENOSYS; }
 #define try_then_request_module(x, mod...) (x)
 #endif
 
+#define request_module(mod...) __request_module(true, -1, mod)
+#define request_module_nowait(mod...) __request_module(false, -1, mod)
 
 struct cred;
 struct file;
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index f7914d9..7688f79 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -578,6 +578,8 @@
  * Ability to trigger the kernel to automatically upcall to userspace for
  * userspace to load a kernel module with the given name.
  * @kmod_name name of the module requested by the kernel
+ * @allow_cap capability that allows to automatically load a kernel
+ * module.
  * Return 0 if successful.
  * @kernel_read_file:
  * Read a file specified by userspace.
@@ -1516,7 +1518,7 @@ union security_list_options {
void (*cred_transfer)(struct cred *new, const struct cred *old);
int (*kernel_act_as)(struct cred *new, u32 secid);
int (*kernel_create_files_as)(struct cred *new, struct inode *inode);
-   int (*kernel_module_request)(char *kmod_name);
+   int (*kernel_module_request)(char *kmod_name, int allow_cap);
int (*kernel_read_file)(struct file *file, enum kernel_read_file_id id);
int (*kernel_post_read_file)(struct file *file, char *buf, loff_t size,
 enum kernel_read_file_id id);
diff --git a/include/linux/security.h b/include/linux/security.h
index 549cb82..2f4c9d3 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -325,7 +325,7 @@ int security_prepare_creds(struct cred *ne
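
For readers following the kmod.h hunk above, here is a minimal userspace sketch of how the reworked macros forward to a single function that takes an extra capability argument. It is an illustration only, not the kernel code: sketch_request_module(), the recording variables, last_request_is() and SKETCH_CAP_NET_ADMIN are hypothetical names, and the stub only formats the module name instead of invoking modprobe. The value -1 is passed by the plain macros, as in the patch.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumption: stands in for CAP_NET_ADMIN from <linux/capability.h>. */
#define SKETCH_CAP_NET_ADMIN 12

/* State recorded by the stub so callers can inspect what was requested. */
static bool last_wait;
static int last_allow_cap;
static char last_name[64];

/*
 * Hypothetical stand-in for __request_module(wait, allow_cap, fmt, ...).
 * It formats the module name and records the arguments; nothing is loaded.
 */
static int sketch_request_module(bool wait, int allow_cap, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(last_name, sizeof(last_name), fmt, args);
	va_end(args);

	if (len < 0 || (size_t)len >= sizeof(last_name))
		return -1;	/* formatted name did not fit in the buffer */

	last_wait = wait;
	last_allow_cap = allow_cap;
	return 0;
}

/* Same shape as the patched macros: existing callers implicitly pass -1. */
#define request_module(...)        sketch_request_module(true, -1, __VA_ARGS__)
#define request_module_nowait(...) sketch_request_module(false, -1, __VA_ARGS__)

/* Helper for checking what the most recent request recorded. */
static bool last_request_is(const char *name, bool wait, int allow_cap)
{
	return strcmp(last_name, name) == 0 &&
	       last_wait == wait && last_allow_cap == allow_cap;
}
```

A caller that wants a specific capability considered would call the function directly, for example sketch_request_module(true, SKETCH_CAP_NET_ADMIN, "netdev-%s", name). That this mirrors what the net/core/dev_ioctl.c change does is an assumption based on the 'netdev-%s' mention and the diffstat, since that hunk is not shown here.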

Re: [PATCH v2] kexec/kdump: Minor Documentation updates for arm64 and Image

2017-05-22 Thread Simon Horman

On Mon, May 22, 2017 at 12:39:59PM +0530, Pratyush Anand wrote:
> 
> 
> On Thursday 18 May 2017 04:23 PM, Bharat Bhushan wrote:
> >This patch have minor updates in Documentation for arm64i as
> arm64
> >relocatable kernel.
> >Also this patch updates documentation for using uncompressed
> >image "Image" which is used for ARM64.
> >
> >Signed-off-by: Bharat Bhushan 
> >---
> >v1->v2
> >  - "a uncompressed" replaced with "an uncompressed"
> 
> Reviewed-by: Pratyush Anand 

Reviewed-by: Simon Horman 

Re: [PATCH v10 02/10] doc: Add documentation for Coresight CPU debug

2017-05-22 Thread Leo Yan

On Mon, May 22, 2017 at 11:16:00AM +0100, Liviu Dudau wrote:
> On Fri, May 19, 2017 at 12:25:49PM +0800, Leo Yan wrote:
> > Add detailed documentation for Coresight CPU debug driver, which
> > contains the info for driver implementation, Mike Leach excellent
> > summary for "clock and power domain". At the end some examples on how
> > to enable the debugging functionality are provided.
> 
> Hi Leo,
> 
> Below are my minor suggestions to improve readability of the documentation.

Thanks, Liviu. I accept all the suggestions and will spin a new version.

[...]

Thanks,
Leo Yan

Re: [PATCH v10 02/10] doc: Add documentation for Coresight CPU debug

2017-05-22 Thread Liviu Dudau

On Fri, May 19, 2017 at 12:25:49PM +0800, Leo Yan wrote:
> Add detailed documentation for Coresight CPU debug driver, which
> contains the info for driver implementation, Mike Leach excellent
> summary for "clock and power domain". At the end some examples on how
> to enable the debugging functionality are provided.

Hi Leo,

Below are my minor suggestions to improve readability of the documentation.

Thanks,
Liviu

> 
> Suggested-by: Mike Leach 
> Reviewed-by: Mathieu Poirier 
> Signed-off-by: Leo Yan 
> ---
>  Documentation/trace/coresight-cpu-debug.txt | 174 
> 
>  1 file changed, 174 insertions(+)
>  create mode 100644 Documentation/trace/coresight-cpu-debug.txt
> 
> diff --git a/Documentation/trace/coresight-cpu-debug.txt b/Documentation/trace/coresight-cpu-debug.txt
> new file mode 100644
> index 000..f0c3f0f
> --- /dev/null
> +++ b/Documentation/trace/coresight-cpu-debug.txt
> @@ -0,0 +1,174 @@
> + Coresight CPU Debug Module
> + ==========================
> +
> +   Author:   Leo Yan 
> +   Date: April 5th, 2017
> +
> +Introduction
> +------------
> +
> +Coresight CPU debug module is defined in ARMv8-a architecture reference manual
> +(ARM DDI 0487A.k) Chapter 'Part H: External debug', the CPU can integrate
> +debug module and it is mainly used for two modes: self-hosted debug and
> +external debug. Usually the external debug mode is well known as the external
> +debugger connects with SoC from JTAG port; on the other hand the program can
> +explore debugging method which rely on self-hosted debug mode, this document
> +is to focus on this part.
> +
> +The debug module provides sample-based profiling extension, which can be used
> +to sample CPU program counter, secure state and exception level, etc; usually
> +every CPU has one dedicated debug module to be connected. Based on self-hosted
> +debug mechanism, Linux kernel can access these related registers from mmio
> +region when the kernel panic happens. The callback notifier for kernel panic
> +will dump related registers for every CPU; finally this is good for assistant
> +analysis for panic.
> +
> +
> +Implementation
> +--------------
> +
> +- During driver registration, use EDDEVID and EDDEVID1 two device ID

During driver registration, it uses EDDEVID and EDDEVID1 - two device ID

> +  registers to decide if sample-based profiling is implemented or not. On some
> +  platforms this hardware feature is fully or partialy implemented; and if
> +  this feature is not supported then registration will fail.
> +
> +- When write this doc, the debug driver mainly relies on three sampling
> +  registers. The kernel panic callback notifier gathers info from EDPCSR
> +  EDVIDSR and EDCIDSR; from EDPCSR we can get program counter, EDVIDSR has

At the time this documentation was written, the debug driver mainly relies on
information gathered by the kernel panic callback notifier from three sampling
registers: EDPCSR, EDVIDSR and EDCIDSR:

> +  information for secure state, exception level, bit width, etc; EDCIDSR is
> +  context ID value which contains the sampled value of CONTEXTIDR_EL1.
> +
> +- The driver supports CPU running mode with either AArch64 or AArch32. The

... supports a CPU running in either AArch64 or AArch32 mode. The

> +  registers naming convention is a bit different between them, AArch64 uses
> +  'ED' for register prefix (ARM DDI 0487A.k, chapter H9.1) and AArch32 uses
> +  'DBG' as prefix (ARM DDI 0487A.k, chapter G5.1). The driver is unified to
> +  use AArch64 naming convention.
> +
> +- ARMv8-a (ARM DDI 0487A.k) and ARMv7-a (ARM DDI 0406C.b) have different
> +  register bits definition. So the driver consolidates two difference:
> +
> +  If PCSROffset=0b0000, on ARMv8-a the feature of EDPCSR is not implemented;
> +  but ARMv7-a defines "PCSR samples are offset by a value that depends on the
> +  instruction set state". For ARMv7-a, the driver checks furthermore if CPU
> +  runs with ARM or thumb instruction set and calibrate PCSR value, the
> +  detailed description for offset is in ARMv7-a ARM (ARM DDI 0406C.b) chapter
> +  C11.11.34 "DBGPCSR, Program Counter Sampling Register".
> +
> +  If PCSROffset=0b0010, ARMv8-a defines "EDPCSR implemented, and samples have
> +  no offset applied and do not sample the instruction set state in AArch32
> +  state". So on ARMv8 if EDDEVID1.PCSROffset is 0b0010 and the CPU operates
> +  in AArch32 state, EDPCSR is not sampled; when the CPU operates in AArch64
> +  state EDPCSR is sampled and no offset are applied.
> +
> +
> +Clock and power domain
> +----------------------
> +
> +Before accessing debug registers, we should ensure the clock and power domain
> +have been enabled properly. In ARMv8-a ARM (ARM DDI 0487A.k) chapter 'H9.1
> +Debug registers', the debug registers are spread into two domains: the debug
> +domain and the CPU domain.
> +
> ++---+
> +
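
The two PCSROffset cases described in the quoted patch can be sketched as a small decision function. This is a hedged illustration, not the driver's code: all names are hypothetical, and the ARMv7-a details (Thumb state signalled by bit 0 of the sample, an offset of 4 bytes for Thumb and 8 bytes for ARM) are my reading of ARM DDI 0406C.b chapter C11.11.34 and should be checked against it. Combinations the text does not describe are reported as invalid.

```c
#include <stdbool.h>
#include <stdint.h>

enum sketch_arch { SKETCH_ARMV7A, SKETCH_ARMV8A };

struct sketch_pc {
	bool valid;	/* false when no usable PC can be derived */
	uint64_t pc;	/* adjusted program counter when valid */
};

/*
 * Derive a program counter from a raw PCSR sample.
 * pcsr_offset is the EDDEVID1.PCSROffset field, aarch32 tells whether the
 * CPU was in AArch32 state (only meaningful for ARMv8-a here).
 */
static struct sketch_pc sketch_adjust_pc(enum sketch_arch arch,
					 unsigned int pcsr_offset,
					 bool aarch32, uint64_t sample)
{
	struct sketch_pc res = { false, 0 };

	if (pcsr_offset == 0x0) {
		/* ARMv8-a: EDPCSR is not implemented for this value. */
		if (arch == SKETCH_ARMV8A)
			return res;

		/* ARMv7-a: offset depends on the instruction set state. */
		if (sample & 0x1) {
			/* Assumed Thumb encoding: clear bit 0, subtract 4. */
			res.pc = (sample & ~UINT64_C(0x1)) - 4;
			res.valid = true;
		} else if (!(sample & 0x2)) {
			/* Assumed ARM encoding: bits [1:0] are 0, subtract 8. */
			res.pc = (sample & ~UINT64_C(0x3)) - 8;
			res.valid = true;
		}
		return res;
	}

	/* ARMv8-a, 0b0010: no offset; nothing is sampled in AArch32 state. */
	if (pcsr_offset == 0x2 && arch == SKETCH_ARMV8A && !aarch32) {
		res.pc = sample;
		res.valid = true;
	}

	return res;
}
```

Returning a validity flag alongside the value keeps the "not implemented" and "not sampled" cases distinct from a legitimate PC of zero.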

Re: [PATCH v2 00/10] Initial Allwinner R40 support

2017-05-22 Thread Linus Walleij

On Thu, May 4, 2017 at 3:49 PM, Icenowy Zheng  wrote:

> This is the first non-RFC version of this patchset, which added basical
> support including I2C, UART and MMC to the mainline Linux.
>
> The pinctrl driver of A20 is also merged into the one of A10 before
> R40 support is added into the A10 driver.

I'd be happy to merge the pinctrl parts as soon as you have fixed the
things pointed out during review. Please include Rob's ACKs on the
DT binding patches.

Yours,
Linus Walleij

Re: [PATCH v2] kexec/kdump: Minor Documentation updates for arm64 and Image

2017-05-22 Thread Pratyush Anand

On Thursday 18 May 2017 04:23 PM, Bharat Bhushan wrote:
> This patch have minor updates in Documentation for arm64i as

arm64

> relocatable kernel.
> Also this patch updates documentation for using uncompressed
> image "Image" which is used for ARM64.
>
> Signed-off-by: Bharat Bhushan
> ---
> v1->v2
>   - "a uncompressed" replaced with "an uncompressed"

Reviewed-by: Pratyush Anand

>  Documentation/kdump/kdump.txt | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
> index 615434d..5181445 100644
> --- a/Documentation/kdump/kdump.txt
> +++ b/Documentation/kdump/kdump.txt
> @@ -112,8 +112,8 @@ There are two possible methods of using Kdump.
>  2) Or use the system kernel binary itself as dump-capture kernel and there is
> no need to build a separate dump-capture kernel. This is possible
> only with the architectures which support a relocatable kernel. As
> -   of today, i386, x86_64, ppc64, ia64 and arm architectures support relocatable
> -   kernel.
> +   of today, i386, x86_64, ppc64, ia64, arm and arm64 architectures support
> +   relocatable kernel.
>
>  Building a relocatable kernel is advantageous from the point of view that
>  one does not have to build a second kernel for capturing the dump. But
> @@ -339,7 +339,7 @@ For arm:
>  For arm64:
> - Use vmlinux or Image
>
> -If you are using a uncompressed vmlinux image then use following command
> +If you are using an uncompressed vmlinux image then use following command
>  to load dump-capture kernel.
>
> kexec -p  \
> @@ -361,6 +361,12 @@ to load dump-capture kernel.
> --dtb= \
> --append="root= "
>
> +If you are using an uncompressed Image, then use following command
> +to load dump-capture kernel.
> +
> +   kexec -p  \
> +   --initrd= \
> +   --append="root= "
>
>  Please note, that --args-linux does not need to be specified for ia64.
>  It is planned to make this a no-op on that architecture, but for now


Re: [PATCH] kexec/kdump: Minor Documentation updates for arm64 and Image

2017-05-22 Thread Pratyush Anand




On Monday 22 May 2017 12:19 PM, Bharat Bhushan wrote:

On Friday 19 May 2017 09:15 AM, AKASHI Takahiro wrote:

+to load dump-capture kernel.
+
+   kexec -p  \
+   --initrd= \
+   --append="root= "

For uncompressed Image, dtb is not necessary?

Just for clarification, dtb is optional for both vmlinux and Image on
arm64. (This means you can specify it if you want.) But this is also
true for initrd and append(command line) to some extent.


Yes, I agree.


Should I mention "-dtb" also for "Image"?


No, I think it is fine.

This documentation represents a typical use case, so the above change is
OK for me. I think your v2 is fine.


~Pratyush



Also, do we need to mention somewhere in this document that it is optional? I do not see
"optional" mentioned for other parameters and architectures.

Does this look OK:

" -dtb= \"

Thanks
-Bharat



More precisely, whether these parameters are optional or not will
depend on architectures, not formats, I believe.


Maybe not the architecture, but rather the distro environment.

For example, we should be able to work without --initrd on any arch if the kernel
has been compiled with CONFIG_INITRAMFS_SOURCE configured.

~Pratyush



