Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-23 Thread David Rientjes
On Tue, 23 Jan 2018, Michal Hocko wrote:

> > It can't, because the current patchset locks the system into a single 
> > selection criteria that is unnecessary and the mount option would become a 
> > no-op after the policy per subtree becomes configurable by the user as 
> > part of the hierarchy itself.
> 
> This is simply not true! OOM victim selection has changed in the
> past and will always be subject to change in the future. The current
> implementation doesn't provide any externally controllable selection
> policy and therefore the default can be assumed, whatever that default
> means now or in the future. The only contract added here is to kill the
> full memcg if selected, and that can be implemented on _any_ selection policy.
> 

The current implementation of memory.oom_group is based on top of a 
selection implementation that is broken in three ways I have listed for 
months:

 - allows users to intentionally/unintentionally evade the oom killer,
   requires not locking the selection implementation for the entire
   system, requires subtree control to prevent, makes a mount option
   obsolete, and breaks existing users who would use the implementation
   based on 4.16 if this were merged,

 - unfairly compares the root mem cgroup vs leaf mem cgroup such that
   users must structure their hierarchy only for 4.16 in such a way
   that _all_ processes are under hierarchical control and have no
   power to create sub cgroups because of the point above and
   completely breaks any user of oom_score_adj in a completely
   undocumented and unspecified way, such that fixing that breakage
   would also break any existing users who would use the implementation
   based on 4.16 if this were merged, and

 - does not allow userspace to protect important cgroups, which can be
   built on top.

I'm focused on fixing the breakage in the first two points since it 
affects the API and we don't want to switch that out from under the user.  I 
have brought these points up repeatedly and everybody else has actively 
disengaged from development, so I'm proposing incremental changes that 
give the cgroup aware oom killer a sustainable API, so that it isn't useful 
only for a highly specialized usecase where everything is containerized, 
nobody can create subcgroups, and nobody uses oom_score_adj to break the 
root mem cgroup accounting.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 6/7] arm64: kvm: Set Virtual SError Exception Syndrome for guest

2018-01-23 Thread James Morse
Hi Dongjiu Geng,

On 06/01/18 16:02, Dongjiu Geng wrote:
> The RAS Extensions add a VSESR_EL2 register which provides
> the syndrome value reported to software on taking a virtual
> SError interrupt exception. This patch adds support for
> specifying this syndrome.
> 
> With the RAS Extensions we cannot use an all-zero syndrome value
> for SError, because it means 'RAS error: Uncategorized' instead of
> 'no valid ISS'. So set it to an IMPLEMENTATION DEFINED syndrome
> by default.
> 
> We also need to let userspace specify a valid syndrome
> value, because in some cases the recovery is driven by userspace.
> This patch allows userspace to specify it.
> 
> In the guest/host world switch, restore this value to VSESR_EL2
> only when HCR_EL2.VSE is set. This value does not need to be saved
> because it is stale when the guest exits.

A version of this patch has been queued by Catalin.

Now that the cpufeature bits are queued, I think this can be split up into two
separate series for v4.16-rc1, one to tackle NOTIFY_SEI and the associated
plumbing. The second for the KVM 'make SError pending' API.


> Signed-off-by: Dongjiu Geng 
> [Set an impdef ESR for Virtual-SError]
> Signed-off-by: James Morse 

I didn't sign-off this patch. If you pick some bits from another version and
want to credit someone else you can 'CC:' them or just mention it in the
commit-message.


> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 47b967d..3b035cc 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -86,6 +86,9 @@
>  #define REG_PSTATE_PAN_IMM   sys_reg(0, 0, 4, 0, 4)
>  #define REG_PSTATE_UAO_IMM   sys_reg(0, 0, 4, 0, 3)
>  
> +/* virtual SError exception syndrome register */
> +#define REG_VSESR_EL2  sys_reg(3, 4, 5, 2, 3)

Irrelevant-Nit: sys-regs usually have a 'SYS_' prefix, and are in instruction
encoding order lower down the file.

(These PSTATE PAN things are a bit odd as they were used to generate an
instruction before the fancy {read,write}_sysreg() helpers were added).


>  #define SET_PSTATE_PAN(x) __emit_inst(0xd500 | REG_PSTATE_PAN_IMM | \
> (!!x)<<8 | 0x1f)
>  #define SET_PSTATE_UAO(x) __emit_inst(0xd500 | REG_PSTATE_UAO_IMM | \
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 738ae90..ffad42b 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -279,7 +279,16 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>  
>  int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)

Bits of this are spread between patches 5 and 6. If you put them in the other
order this wouldn't happen.

(but after a rebase most of this patch should disappear)

>  {
> - return -EINVAL;
> + u64 reg = *syndrome;
> +
> + /* inject virtual system Error or asynchronous abort */
> + kvm_inject_vabt(vcpu);

So this writes an impdef ESR, because it's the existing code-path in KVM.


> + if (reg)
> + /* set vsesr_el2[24:0] with value that user space specified */
> + kvm_vcpu_set_vsesr(vcpu, reg & ESR_ELx_ISS_MASK);

And then you overwrite it. Which is a bit odd as there is a helper to do both in
one go:


> +
> + return 0;
>  }

>  int __attribute_const__ kvm_target_cpu(void)

> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
> index 3556715..fb94b5e 100644
> --- a/arch/arm64/kvm/inject_fault.c
> +++ b/arch/arm64/kvm/inject_fault.c
> @@ -246,14 +246,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>   inject_undef64(vcpu);
>  }
>  
> +static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
> +{
> + kvm_vcpu_set_vsesr(vcpu, esr);
> + vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
> +}

How come you don't use this in kvm_arm_set_sei_esr()?
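
A minimal user-space sketch of that suggestion, assuming the ioctl handler
simply goes through the helper. Everything prefixed mock_/MOCK_ below is a
stand-in invented so the fragment builds outside the kernel (the struct, the
constant values, and the impdef fallback for a zero syndrome are assumptions,
not the kernel's definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Mock stand-ins (assumptions): not the real kernel definitions. */
#define MOCK_HCR_VSE       (UINT64_C(1) << 8)
#define MOCK_ESR_ISS_MASK  UINT64_C(0x1ffffff)   /* syndrome bits [24:0] */
#define MOCK_ESR_IMPDEF    (UINT64_C(1) << 24)   /* hypothetical impdef syndrome */

struct mock_vcpu {
	uint64_t hcr_el2;
	uint64_t vsesr_el2;
};

/* Same shape as the patch's pend_guest_serror(): syndrome and VSE in one go. */
static void mock_pend_guest_serror(struct mock_vcpu *vcpu, uint64_t esr)
{
	assert(vcpu);
	vcpu->vsesr_el2 = esr;
	vcpu->hcr_el2 |= MOCK_HCR_VSE;
}

/* Sketch of the ioctl backend: one call into the helper, no overwrite. */
static int mock_set_sei_esr(struct mock_vcpu *vcpu, const uint32_t *syndrome)
{
	uint64_t esr = *syndrome & MOCK_ESR_ISS_MASK;

	/* A zero syndrome falls back to the impdef value (assumed default). */
	if (!esr)
		esr = MOCK_ESR_IMPDEF;

	mock_pend_guest_serror(vcpu, esr);
	return 0;
}
```

With this shape the syndrome and HCR_EL2.VSE are always written together, so
the impdef value is never written first and then replaced.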



Thanks,

James


Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

2018-01-23 Thread James Morse
Hi Dongjiu Geng,

On 06/01/18 16:02, Dongjiu Geng wrote:
> The ARM64 RAS SError Interrupt (SEI) syndrome value is specific to the
> guest and user space needs a way to tell KVM this value. So we add a
> new ioctl. Before user space specifies the Exception Syndrome Register
> (ESR) value, it first checks whether KVM has the capability to
> set the guest ESR. If it does, user space sets it; otherwise there is
> nothing to do.
> 
> This ESR setting is only supported for AArch64, not AArch32.

After this patch user-space can trigger an SError in the guest. If it wants to
migrate the guest, how does the pending SError get migrated?

I think we need to fix migration first. Andrew Jones suggested using
KVM_GET/SET_VCPU_EVENTS:
https://www.spinics.net/lists/arm-kernel/msg616846.html

Given KVM uses kvm_inject_vabt() on v8.0 hardware too, we should cover systems
without the v8.2 RAS Extensions with the same API. I think this means a bit to
read/write whether SError is pending, and another to indicate the ESR should be
set/read.
CPUs without the v8.2 RAS Extensions can reject pending-SError that had an ESR.

user-space can then use the 'for migration' calls to make a 'new' SError pending.
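
To make the proposed semantics concrete, here is a rough user-space sketch.
The struct layout, the field names and the -EINVAL return are all
hypothetical, chosen only for illustration; this is not an existing uapi
definition:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event payload, for discussion only. */
struct sketch_serror_event {
	uint8_t  pending;   /* an SError is pending for the guest */
	uint8_t  has_esr;   /* 'esr' is valid and should be set/read */
	uint64_t esr;
};

/* Hypothetical per-vcpu state standing in for the real KVM structures. */
struct sketch_vcpu {
	bool     has_ras;          /* CPU has the v8.2 RAS Extensions */
	bool     serror_pending;
	bool     serror_has_esr;
	uint64_t vsesr;
};

/* 'set' side: used for migration and to make a new SError pending. */
static int sketch_set_events(struct sketch_vcpu *vcpu,
			     const struct sketch_serror_event *ev)
{
	assert(vcpu && ev);

	/* Without the RAS Extensions there is nowhere to put an ESR. */
	if (ev->pending && ev->has_esr && !vcpu->has_ras)
		return -EINVAL;

	vcpu->serror_pending = ev->pending != 0;
	vcpu->serror_has_esr = ev->pending && ev->has_esr;
	vcpu->vsesr = vcpu->serror_has_esr ? ev->esr : 0;
	return 0;
}

/* 'get' side: what the migration source would read back. */
static void sketch_get_events(const struct sketch_vcpu *vcpu,
			      struct sketch_serror_event *ev)
{
	assert(vcpu && ev);
	ev->pending = vcpu->serror_pending;
	ev->has_esr = vcpu->serror_has_esr;
	ev->esr = vcpu->serror_has_esr ? vcpu->vsesr : 0;
}
```

The point of the two flags is that a v8.0 system can still round-trip
"SError pending" through the same calls, and only the ESR part is refused.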

Now that the cpufeature bits are queued, I think this can be split up into two
separate series for v4.16-rc1, one to tackle NOTIFY_SEI and the associated
plumbing. The second for the KVM 'make SError pending' API.


> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5c7f657..738ae90 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>   return -EINVAL;
>  }
>  
> +int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
> +{
> + return -EINVAL;
> +}

Does nothing in the patch that adds the support? This is a bit odd.
(oh, it's hiding in patch 6...)


Thanks,

James



Re: [PATCH 3/3] s390/cmf: fix kerneldoc

2018-01-23 Thread Sebastian Ott
On Tue, 23 Jan 2018, Cornelia Huck wrote:
> Make sure we use proper Return sections, and make the output
> for cmf_enable() less odd.
> 
> Signed-off-by: Cornelia Huck 
> ---
> +++ b/drivers/s390/cio/cmf.c
> @@ -1118,9 +1118,10 @@ int ccw_set_cmf(struct ccw_device *cdev, int enable)
>   * enable_cmf() - switch on the channel measurement for a specific device
>   *  @cdev:   The ccw device to be enabled
>   *
> - *  Returns %0 for success or a negative error value.
> - *  Note: If this is called on a device for which channel measurement is already
> - * enabled a reset of the measurement data is triggered.
> + *  Enable channel measurements for @cdev. If this is called on a device
> + *  for which channel measurement is already enabled a reset of the
> + *  measurement data is triggered.
> + *  Returns: %0 for success or a negative error value
^
I took the liberty of re-adding the dot at the end.
Applied. Thanks!
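
For reference, a minimal sketch of the kernel-doc layout this patch moves
towards, with the return value in its own section after the description.
The toy_ names are made up for illustration and are not part of cmf.c; the
sketch uses the 'Return:' spelling, which as far as I know is the one the
kernel-doc guide documents:

```c
#include <errno.h>
#include <stddef.h>

/* Toy device, a stand-in for illustration only. */
struct toy_device {
	int measuring;
	unsigned long samples;
};

/**
 * toy_enable_measurement() - switch on measurement for a specific device
 * @dev: The device to be enabled
 *
 * Enable measurements for @dev. If this is called on a device for which
 * measurement is already enabled, the collected data is reset.
 *
 * Return: %0 for success or a negative error value.
 */
int toy_enable_measurement(struct toy_device *dev)
{
	if (!dev)
		return -EINVAL;

	dev->samples = 0;	/* reset data, also when already enabled */
	dev->measuring = 1;
	return 0;
}
```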



Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-23 Thread Michal Hocko
On Mon 22-01-18 14:34:39, David Rientjes wrote:
> On Sat, 20 Jan 2018, Tejun Heo wrote:
[...]
> > I don't see any blocker here.  The issue you're raising can and should
> > be handled separately.
> > 
> 
> It can't, because the current patchset locks the system into a single 
> selection criteria that is unnecessary and the mount option would become a 
> no-op after the policy per subtree becomes configurable by the user as 
> part of the hierarchy itself.

This is simply not true! OOM victim selection has changed in the
past and will always be subject to change in the future. The current
implementation doesn't provide any externally controllable selection
policy and therefore the default can be assumed, whatever that default
means now or in the future. The only contract added here is to kill the
full memcg if selected, and that can be implemented on _any_ selection policy.

-- 
Michal Hocko
SUSE Labs


[PATCH 6/8] thermal/drivers/cpu_cooling: Add idle cooling device documentation

2018-01-23 Thread Daniel Lezcano
Provide some documentation for the idle injection cooling effect in
order to let people understand the rationale behind the approach for
the idle injection CPU cooling device.

Signed-off-by: Daniel Lezcano 
---
 Documentation/thermal/cpu-idle-cooling.txt | 165 +
 1 file changed, 165 insertions(+)
 create mode 100644 Documentation/thermal/cpu-idle-cooling.txt

diff --git a/Documentation/thermal/cpu-idle-cooling.txt b/Documentation/thermal/cpu-idle-cooling.txt
new file mode 100644
index 000..29fc651
--- /dev/null
+++ b/Documentation/thermal/cpu-idle-cooling.txt
@@ -0,0 +1,165 @@
+
+Situation:
+----------
+
+Under certain circumstances, the SoC reaches a temperature exceeding
+the allocated power budget or the maximum temperature limit. The
+former must be mitigated to stabilize the SoC temperature around the
+temperature control using the defined cooling devices, the latter is a
+catastrophic situation where a radical decision must be taken to
+reduce the temperature under the critical threshold, that can impact
+the performances.
+
+Another situation is when the silicon reaches a certain temperature
+which continues to increase even if the dynamic leakage is reduced to
+its minimum by clock gating the component. The runaway phenomenon will
+continue with the static leakage, and only powering down the component,
+thus dropping both the dynamic and the static leakage, will allow the
+component to cool down. This situation is critical.
+
+Last but not least, the system can ask for a specific power budget but
+because of the OPP density, we can only choose an OPP with a power
+budget lower than the requested one and underuse the CPU, thus losing
+performance. In other words, one OPP underuses the CPU with a power
+lower than the power budget and the next OPP exceeds the power budget;
+an intermediate OPP could have been used if it were present.
+
+Solutions:
+----------
+
+If we can remove the static and the dynamic leakage for a specific
+duration in a controlled period, the SoC temperature will
+decrease. Acting at the idle state duration or the idle cycle
+injection period, we can mitigate the temperature by modulating the
+power budget.
+
+The Operating Performance Point (OPP) density has a great influence on
+the control precision of cpufreq, however different vendors have a
+plethora of OPP density, and some have large power gap between OPPs,
+that will result in loss of performance during thermal control and
+loss of power in other scenes.
+
+At a specific OPP, we can assume injecting idle cycle on all CPUs,
+belonging to the same cluster, with a duration greater than the
+cluster idle state target residency, we drop the static and the
+dynamic leakage for this period (modulo the energy needed to enter
+this state). So the sustainable power with idle cycles has a linear
+relation with the OPP’s sustainable power and can be computed with a
+coefficient similar to:
+
+   Power(IdleCycle) = Coef x Power(OPP)
+
+Idle Injection:
+---------------
+
+The base concept of the idle injection is to force the CPU to go to an
+idle state for a specified time each control cycle, it provides
+another way to control CPU power and heat in addition to
+cpufreq. Ideally, if all CPUs of a cluster inject idle synchronously,
+this cluster can get into the deepest idle state and achieve minimum
+power consumption, but that will also increase system response latency
+if we inject less than cpuidle latency.
+
+ ^
+ |
+ |
+ |---   ---   ---
+ |___|_|___|_|___|___
+
+  <->
+   idle  <>
+  running
+
+With the fixed idle injection duration, we can give a value which is
+an acceptable performance drop off or latency when we reach a specific
+temperature and we begin to mitigate by varying the Idle injection
+period.
+
+The mitigation begins with a maximum period value which decreases when
+more cooling effect is requested. When the period duration is equal to
+the idle duration, we are in a situation where the platform can’t
+dissipate the heat fast enough and the mitigation fails. In this case
+the situation is considered critical and there is nothing to do. The
+idle injection duration must be changed by configuration until we
+reach the cooling effect; otherwise an additional cooling device must
+be used, or ultimately the SoC performance must be decreased by
+dropping the highest OPP of the SoC.
+
+The idle injection duration value must comply with the constraints:
+
+- It is less than or equal to the latency we tolerate when the mitigation
+  begins. It is platform dependent and will depend on the user
+  experience, i.e. the reactivity vs performance trade-off we want. This
+  value should be specified.
+
+- It is greater than the idle state’s target residency we want to go
+  for thermal mitigation, otherwise we end up consuming more energy.
+
+Minimum period
+--------------
+
+The idle 

Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-23 Thread Michal Hocko
On Wed 17-01-18 14:18:33, David Rientjes wrote:
> On Wed, 17 Jan 2018, Michal Hocko wrote:
> 
> > Absolutely agreed! And moreover, there are not all that many things
> > to do as an action. You just kill a logical entity - be it a process or
> > a logical group of processes. But you have way too many policies how
> > to select that entity. Do you want to choose the youngest process/group
> > because all the older ones have been computing real stuff and you would
> > lose days of your cpu time? Or should those who pay more be
> > protected (aka give them static priorities), or you name it...
> > 
> 
> That's an argument for making the interface extensible, yes.

And there is no interface to control the selection yet so we can develop
one on top.
 
> > I am sorry, I still didn't grasp the full semantic of the proposed
> > solution but the mere fact it is starting by conflating selection and the
> > action is a no go and a wrong API. This is why I've said that what you
> > (David) outlined yesterday is probably going to suffer from a much
> > longer discussion and most likely not be acceptable. Your patchset
> > proves me correct...
> 
> I'm very happy to change the API if there are better suggestions.  That 
> may end up just being a memory.oom_policy file, as this implements, and 
> separating out a new memory.oom_action that isn't a boolean value to 
> either do a full group kill or only a single process.  Or it could be what 
> I suggested in my mail to Tejun, such as "hierarchy killall" written to
> memory.oom_policy, which would specify a single policy and then an 
> optional mechanism.  With my proposed patchset, there would then be three 
> policies: "none", "cgroup", and "tree" and one possible optional 
> mechanism: "killall".

You haven't convinced me at all. This all sounds more like "what if"
than a really thought through interface. I've tried to point out that
having a real policy driven victim selection is a _hard_ thing to do
_right_.

On the other hand oom_group makes semantic sense. It controls the
killable entity and there are usecases which want to consider the full
memcg as a single killable entity, no matter what selection policy we
choose on top. It is just a natural API.

Now you keep arguing about the victim selection and different strategies
to implement it. We will not move forward as long as you keep conflating
the two things, I am afraid.
-- 
Michal Hocko
SUSE Labs


[PATCH 1/3] s390/docs: mention subchannel types

2018-01-23 Thread Cornelia Huck
Since the original inception of the s390-drivers document, the
common I/O layer has grown support for more types of subchannels.
Give at least a pointer for the various types.

Signed-off-by: Cornelia Huck 
---
 Documentation/driver-api/s390-drivers.rst | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/driver-api/s390-drivers.rst b/Documentation/driver-api/s390-drivers.rst
index ecf8851d3565..42350f21357d 100644
--- a/Documentation/driver-api/s390-drivers.rst
+++ b/Documentation/driver-api/s390-drivers.rst
@@ -22,9 +22,28 @@ While most I/O devices on a s390 system are typically driven through the
 channel I/O mechanism described here, there are various other methods
 (like the diag interface). These are out of the scope of this document.
 
+The s390 common I/O layer also provides access to some devices that are
+not strictly considered I/O devices. They are considered here as well,
+although they are not the focus of this document.
+
 Some additional information can also be found in the kernel source under
 Documentation/s390/driver-model.txt.
 
+The css bus
+===========
+
+The css bus contains the subchannels available on the system. They fall
+into several categories:
+
+* Standard I/O subchannels, for use by the system. They have a child
+  device on the ccw bus and are described below.
+* I/O subchannels bound to the vfio-ccw driver. See
+  Documentation/s390/vfio-ccw.txt.
+* Message subchannels. No Linux driver currently exists.
+* CHSC subchannels (at most one). The chsc subchannel driver can be used
+  to send asynchronous chsc commands.
+* eADM subchannels. Used for talking to storage class memory.
+
 The ccw bus
 ===========
 
-- 
2.13.6



[PATCH 2/3] s390/docs: reword airq section

2018-01-23 Thread Cornelia Huck
Mention the iv helpers as well.

Signed-off-by: Cornelia Huck 
---
 Documentation/driver-api/s390-drivers.rst | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/Documentation/driver-api/s390-drivers.rst b/Documentation/driver-api/s390-drivers.rst
index 42350f21357d..30e6aa7e160b 100644
--- a/Documentation/driver-api/s390-drivers.rst
+++ b/Documentation/driver-api/s390-drivers.rst
@@ -121,10 +121,15 @@ ccw group devices
 Generic interfaces
 ==================
 
-Some interfaces are available to other drivers that do not necessarily
-have anything to do with the busses described above, but still are
-indirectly using basic infrastructure in the common I/O layer. One
-example is the support for adapter interrupts.
+The following section contains interfaces in use not only by drivers
+dealing with ccw devices, but drivers for various other s390 hardware
+as well.
+
+Adapter interrupts
+------------------
+
+The common I/O layer provides helper functions for dealing with adapter
+interrupts and interrupt vectors.
 
 .. kernel-doc:: drivers/s390/cio/airq.c
:export:
-- 
2.13.6



[PATCH 3/3] s390/cmf: fix kerneldoc

2018-01-23 Thread Cornelia Huck
Make sure we use proper Return sections, and make the output
for cmf_enable() less odd.

Signed-off-by: Cornelia Huck 
---
 drivers/s390/cio/cmf.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/cio/cmf.c b/drivers/s390/cio/cmf.c
index 5e495c62cfa7..6e4f50d5b655 100644
--- a/drivers/s390/cio/cmf.c
+++ b/drivers/s390/cio/cmf.c
@@ -1118,9 +1118,10 @@ int ccw_set_cmf(struct ccw_device *cdev, int enable)
  * enable_cmf() - switch on the channel measurement for a specific device
  *  @cdev: The ccw device to be enabled
  *
- *  Returns %0 for success or a negative error value.
- *  Note: If this is called on a device for which channel measurement is already
- *   enabled a reset of the measurement data is triggered.
+ *  Enable channel measurements for @cdev. If this is called on a device
+ *  for which channel measurement is already enabled a reset of the
+ *  measurement data is triggered.
+ *  Returns: %0 for success or a negative error value
  *  Context:
  *non-atomic
  */
@@ -1160,7 +1161,7 @@ int enable_cmf(struct ccw_device *cdev)
  * __disable_cmf() - switch off the channel measurement for a specific device
  *  @cdev: The ccw device to be disabled
  *
- *  Returns %0 for success or a negative error value.
+ *  Returns: %0 for success or a negative error value.
  *
  *  Context:
  *non-atomic, device_lock() held.
@@ -1184,7 +1185,7 @@ int __disable_cmf(struct ccw_device *cdev)
  * disable_cmf() - switch off the channel measurement for a specific device
  *  @cdev: The ccw device to be disabled
  *
- *  Returns %0 for success or a negative error value.
+ *  Returns: %0 for success or a negative error value.
  *
  *  Context:
  *non-atomic
@@ -1205,7 +1206,7 @@ int disable_cmf(struct ccw_device *cdev)
  * @cdev:  the channel to be read
  * @index: the index of the value to be read
  *
- * Returns the value read or %0 if the value cannot be read.
+ * Returns: The value read or %0 if the value cannot be read.
  *
  *  Context:
  *any
@@ -1220,7 +1221,7 @@ u64 cmf_read(struct ccw_device *cdev, int index)
  * @cdev:  the channel to be read
  * @data:  a pointer to a data block that will be filled
  *
- * Returns %0 on success, a negative error value otherwise.
+ * Returns: %0 on success, a negative error value otherwise.
  *
  *  Context:
  *any
-- 
2.13.6



[PATCH 0/3] s390: documentation update

2018-01-23 Thread Cornelia Huck
The first two are updates that I had around for a long time, but
somehow forgot to send them out.

While looking at the result of make pdfdocs, I found that the output
for the cmf stuff looked a bit suboptimal, so I added a bonus patch
for that.

A branch based on the s390 features branch can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/cohuck/linux.git s390-doc-update

Cornelia Huck (3):
  s390/docs: mention subchannel types
  s390/docs: reword airq section
  s390/cmf: fix kerneldoc

 Documentation/driver-api/s390-drivers.rst | 32 +++
 drivers/s390/cio/cmf.c| 15 ---
 2 files changed, 36 insertions(+), 11 deletions(-)

-- 
2.13.6



Re: [PATCH] acpi, spcr: Make SPCR available to x86

2018-01-23 Thread Prarit Bhargava


On 01/22/2018 04:49 PM, Timur Tabi wrote:
> On 01/18/2018 09:09 AM, Prarit Bhargava wrote:
>>   if (acpi_disabled) {
>> -if (earlycon_init_is_deferred)
>> +if (earlycon_acpi_spcr_enable)
> 
> This patch works for me, so I can ACK it, but first you might want to rename
> earlycon_acpi_spcr_enable, because these two lines don't make much sense.
> 
> "If ACPI is disabled, and ACPI SPCR is enabled, then "
> 
> If ACPI is disabled, then how can a variable called
> "earlycon_acpi_spcr_enable" be true?
> 
> Would it make more sense to rename it to earlycon_spcr_enable?
> 

acpi_disabled is a global runtime flag that can be set via "acpi=off" on the
command line.  It does not disable the tables, only the reading and interpreting
of the data in the tables.

P.


Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

2018-01-23 Thread gengdongjiu
Sorry, fixing a typo.

On 2018/1/23 17:23, gengdongjiu wrote:
>> There are problems with doing this:
>>
>> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
>> | How do SEA and SEI interact?
>> |
>> | As far as I can see they can both interrupt each other, which isn't something
>> | the single in_nmi() path in APEI can handle. I think we should fix this
>> | first.
>>
>> [..]
>>
>> | SEA gets away with a lot of things because it's synchronous. SEI isn't. Xie
>> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
>> | from SEA, but not SEI. (what happens if an SError arrives while we are
>> | queueing memory_failure work from an IRQ).
>> |
>> | The one that scares me is the trace-point reporting stuff. What happens if an
>> | SError arrives while we are enabling a trace point? (these are static-keys
>> | right?)
>> |
>> |  I don't think we can just plumb SEI in like this and be done with it.
>> |  (I'm looking at teasing out the estatus cache code from being x86:NMI only.
>> |  This way we solve the same 'can't do this from NMI context' with the same
>> |  code.)
>>
>>
>> I will post what I've got for this estatus-cache thing as an RFC, it's not ready
>> to be considered yet.

Yes, I know you are doing that. Your patch series will take all of the above
into account, right?
If it does, this patch can be based on your patchset.
Thanks.

> 
>>



[PATCH 3/3] mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors

2018-01-23 Thread Mike Rapoport
Signed-off-by: Mike Rapoport 
---
 mm/pagewalk.c  | 1 +
 mm/process_vm_access.c | 2 ++
 mm/vmscan.c| 1 +
 3 files changed, 4 insertions(+)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 23a3e415ac2c..8d2da5dec1e0 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -265,6 +265,7 @@ static int __walk_page_range(unsigned long start, unsigned long end,
  * pte_entry(), and/or hugetlb_entry(). If you don't set up for some of these
  * callbacks, the associated entries/pages are just ignored.
  * The return values of these callbacks are commonly defined like below:
+ *
  *  - 0  : succeeded to handle the current entry, and if you don't reach the
  * end address yet, continue to walk.
  *  - >0 : succeeded to handle the current entry, and return to the caller
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 011edefd3c92..1a27c837f004 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -147,6 +147,7 @@ static int process_vm_rw_single_vec(unsigned long addr,
  * @riovcnt: size of rvec array
  * @flags: currently unused
  * @vm_write: 0 if reading from other process, 1 if writing to other process
+ *
  * Returns the number of bytes read/written or error code. May
  *  return less bytes than expected if an error occurs during the copying
  *  process.
@@ -253,6 +254,7 @@ static ssize_t process_vm_rw_core(pid_t pid, struct iov_iter *iter,
  * @riovcnt: size of rvec array
  * @flags: currently unused
  * @vm_write: 0 if reading from other process, 1 if writing to other process
+ *
  * Returns the number of bytes read/written or error code. May
  *  return less bytes than expected if an error occurs during the copying
  *  process.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 47d5ced51f2d..8d01b095d97b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1606,6 +1606,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
  * found will be decremented.
  *
  * Restrictions:
+ *
  * (1) Must be called with an elevated refcount on the page. This is a
  * fundamentnal difference from isolate_lru_pages (which is called
  * without a stable reference).
-- 
2.7.4



[PATCH 1/3] mm: docs: fixup punctuation

2018-01-23 Thread Mike Rapoport
so that kernel-doc will properly recognize the parameter and function
descriptions.

Signed-off-by: Mike Rapoport 
---
 mm/ksm.c|  2 +-
 mm/memcontrol.c |  4 ++--
 mm/mlock.c  |  2 +-
 mm/nommu.c  |  2 +-
 mm/sparse-vmemmap.c |  4 ++--
 mm/zpool.c  | 44 ++--
 6 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index be8f4576f842..64207d936659 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2309,7 +2309,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
 
 /**
  * ksm_do_scan  - the ksm scanner main worker function.
- * @scan_npages - number of pages we want to scan before we return.
+ * @scan_npages:  number of pages we want to scan before we return.
  */
 static void ksm_do_scan(unsigned int scan_npages)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ac2ffd5e02b9..cddea3ed8e86 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5885,8 +5885,8 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 
 /**
  * mem_cgroup_uncharge_skmem - uncharge socket memory
- * @memcg - memcg to uncharge
- * @nr_pages - number of pages to uncharge
+ * @memcg: memcg to uncharge
+ * @nr_pages: number of pages to uncharge
  */
 void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
diff --git a/mm/mlock.c b/mm/mlock.c
index 30472d438794..3d2e834a6cb7 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -157,7 +157,7 @@ static void __munlock_isolation_failed(struct page *page)
 
 /**
  * munlock_vma_page - munlock a vma page
- * @page - page to be unlocked, either a normal page or THP page head
+ * @page: page to be unlocked, either a normal page or THP page head
  *
  * returns the size of the page as a page mask (0 for normal page,
  * HPAGE_PMD_NR - 1 for THP head page)
diff --git a/mm/nommu.c b/mm/nommu.c
index 17c00d93de2e..52c14127a861 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1843,7 +1843,7 @@ int __access_remote_vm(struct task_struct *tsk, struct 
mm_struct *mm,
 }
 
 /**
- * @access_remote_vm - access another process' address space
+ * access_remote_vm - access another process' address space
  * @mm:the mm_struct of the target address space
  * @addr:  start address to access
  * @buf:   source or destination buffer
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 17acf01791fa..015ee4eb79bc 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -108,8 +108,8 @@ static unsigned long __meminit vmem_altmap_nr_free(struct 
vmem_altmap *altmap)
 
 /**
  * vmem_altmap_alloc - allocate pages from the vmem_altmap reservation
- * @altmap - reserved page pool for the allocation
- * @nr_pfns - size (in pages) of the allocation
+ * @altmap: reserved page pool for the allocation
+ * @nr_pfns: size (in pages) of the allocation
  *
  * Allocations are aligned to the size of the request
  */
diff --git a/mm/zpool.c b/mm/zpool.c
index fd3ff719c32c..2e6f9c3cebe7 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -100,7 +100,7 @@ static void zpool_put_driver(struct zpool_driver *driver)
 
 /**
  * zpool_has_pool() - Check if the pool driver is available
- * @type   The type of the zpool to check (e.g. zbud, zsmalloc)
+ * @type:  The type of the zpool to check (e.g. zbud, zsmalloc)
  *
  * This checks if the @type pool driver is available.  This will try to load
  * the requested module, if needed, but there is no guarantee the module will
@@ -135,10 +135,10 @@ EXPORT_SYMBOL(zpool_has_pool);
 
 /**
  * zpool_create_pool() - Create a new zpool
- * @type   The type of the zpool to create (e.g. zbud, zsmalloc)
- * @name   The name of the zpool (e.g. zram0, zswap)
- * @gfpThe GFP flags to use when allocating the pool.
- * @opsThe optional ops callback.
+ * @type:  The type of the zpool to create (e.g. zbud, zsmalloc)
+ * @name:  The name of the zpool (e.g. zram0, zswap)
+ * @gfp:   The GFP flags to use when allocating the pool.
+ * @ops:   The optional ops callback.
  *
  * This creates a new zpool of the specified type.  The gfp flags will be
  * used when allocating memory, if the implementation supports it.  If the
@@ -199,7 +199,7 @@ struct zpool *zpool_create_pool(const char *type, const 
char *name, gfp_t gfp,
 
 /**
  * zpool_destroy_pool() - Destroy a zpool
- * @pool   The zpool to destroy.
+ * @pool:  The zpool to destroy.
  *
  * Implementations must guarantee this to be thread-safe,
  * however only when destroying different pools.  The same
@@ -222,7 +222,7 @@ void zpool_destroy_pool(struct zpool *zpool)
 
 /**
  * zpool_get_type() - Get the type of the zpool
- * @pool   The zpool to check
+ * @pool:  The zpool to check
  *
  * This returns the type of the pool.
  *
@@ -237,10 +237,10 @@ const char *zpool_get_type(struct zpool *zpool)
 
 /**
  * zpool_malloc() - Allocate memory
- * @pool   

[PATCH 0/3] mm: docs: trivial fixes

2018-01-23 Thread Mike Rapoport
Hi,

These are some trivial fixes to the kernel-doc descriptions.

Mike Rapoport (3):
  mm: docs: fixup punctuation
  mm: docs: fix parameter names mismatch
  mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors

 mm/bootmem.c   |  2 +-
 mm/ksm.c   |  2 +-
 mm/maccess.c   |  2 +-
 mm/memcontrol.c|  6 +++---
 mm/mlock.c |  2 +-
 mm/nommu.c |  2 +-
 mm/pagewalk.c  |  1 +
 mm/process_vm_access.c |  4 +++-
 mm/sparse-vmemmap.c|  4 ++--
 mm/swap.c  |  4 ++--
 mm/vmscan.c|  1 +
 mm/z3fold.c|  4 ++--
 mm/zbud.c  |  4 ++--
 mm/zpool.c | 46 +++---
 14 files changed, 44 insertions(+), 40 deletions(-)

-- 
2.7.4



[PATCH 2/3] mm: docs: fix parameter names mismatch

2018-01-23 Thread Mike Rapoport
There are several places where parameter descriptions do not match the
actual code.
Fix them.

Signed-off-by: Mike Rapoport 
---
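To illustrate the kind of mismatch fixed here: an "@name" entry is only useful
if "name" is really a parameter of the documented function. The sketch below
is a hypothetical standalone check written for this note, not something
kernel-doc provides. The parameter list used in the demo is typed out by hand
as an assumption, and the helper does not handle function-pointer or array
parameters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

static bool is_ident(int c)
{
	return isalnum(c) || c == '_';
}

/*
 * Hypothetical helper: return true if @name is the parameter name (the last
 * identifier) of one of the comma-separated declarations in @params.
 */
static bool proto_has_param(const char *params, const char *name)
{
	size_t n = strlen(name);
	const char *p = params;

	while (*p) {
		const char *end = strchr(p, ',');
		const char *id_end, *q;

		if (!end)
			end = p + strlen(p);

		/* walk back over trailing blanks, then over the identifier */
		q = end;
		while (q > p && isspace((unsigned char)q[-1]))
			q--;
		id_end = q;
		while (q > p && is_ident((unsigned char)q[-1]))
			q--;

		if (n > 0 && (size_t)(id_end - q) == n && memcmp(q, name, n) == 0)
			return true;

		p = *end ? end + 1 : end;
	}
	return false;
}

/* Usage: the old and new names from the free_bootmem() hunk. */
void proto_param_demo(void)
{
	/* assumed parameter list, written out by hand for the example */
	const char *params = "unsigned long physaddr, unsigned long size";

	assert(!proto_has_param(params, "addr"));
	assert(proto_has_param(params, "physaddr"));
	assert(proto_has_param(params, "size"));
}
```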
 mm/bootmem.c   |  2 +-
 mm/maccess.c   |  2 +-
 mm/memcontrol.c|  2 +-
 mm/process_vm_access.c |  2 +-
 mm/swap.c  |  4 ++--
 mm/z3fold.c|  4 ++--
 mm/zbud.c  |  4 ++--
 mm/zpool.c | 20 ++--
 8 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 6aef64254203..9e197987b67d 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -410,7 +410,7 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned 
long physaddr,
 
 /**
  * free_bootmem - mark a page range as usable
- * @addr: starting physical address of the range
+ * @physaddr: starting physical address of the range
  * @size: size of the range in bytes
  *
  * Partial pages will be considered reserved and left as they are.
diff --git a/mm/maccess.c b/mm/maccess.c
index 78f9274dd49d..ec00be51a24f 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -70,7 +70,7 @@ EXPORT_SYMBOL_GPL(probe_kernel_write);
  * strncpy_from_unsafe: - Copy a NUL terminated string from unsafe address.
  * @dst:   Destination address, in kernel space.  This buffer must be at
  * least @count bytes long.
- * @src:   Unsafe address.
+ * @unsafe_addr: Unsafe address.
  * @count: Maximum number of bytes to copy, including the trailing NUL.
  *
  * Copies a NUL-terminated string from unsafe address to kernel buffer.
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cddea3ed8e86..0975cde3e83b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -946,7 +946,7 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
 /**
  * mem_cgroup_page_lruvec - return lruvec for isolating/putting an LRU page
  * @page: the page
- * @zone: zone of the page
+ * @pgdat: pgdat of the page
  *
  * This function is only safe when following the LRU page isolation
  * and putback protocol: the LRU lock must be held, and the page must
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 8973cd231ece..011edefd3c92 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -25,7 +25,7 @@
 /**
  * process_vm_rw_pages - read/write pages from task specified
  * @pages: array of pointers to pages we want to copy
- * @start_offset: offset in page to start copying from/to
+ * @offset: offset in page to start copying from/to
  * @len: number of bytes to copy
  * @iter: where to copy to/from locally
  * @vm_write: 0 means copy from, 1 means copy to
diff --git a/mm/swap.c b/mm/swap.c
index 38e1b6374a97..2e8dae403474 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -913,11 +913,11 @@ EXPORT_SYMBOL(__pagevec_lru_add);
  * @pvec:  Where the resulting entries are placed
  * @mapping:   The address_space to search
  * @start: The starting entry index
- * @nr_entries:The maximum number of entries
+ * @nr_pages:  The maximum number of pages
  * @indices:   The cache indices corresponding to the entries in @pvec
  *
  * pagevec_lookup_entries() will search for and return a group of up
- * to @nr_entries pages and shadow entries in the mapping.  All
+ * to @nr_pages pages and shadow entries in the mapping.  All
  * entries are placed in @pvec.  pagevec_lookup_entries() takes a
  * reference against actual pages in @pvec.
  *
diff --git a/mm/z3fold.c b/mm/z3fold.c
index 39e19125d6a0..d589d318727f 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -769,7 +769,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned 
long handle)
 /**
  * z3fold_reclaim_page() - evicts allocations from a pool page and frees it
  * @pool:  pool from which a page will attempt to be evicted
- * @retires:   number of pages on the LRU list for which eviction will
+ * @retries:   number of pages on the LRU list for which eviction will
  * be attempted before failing
  *
  * z3fold reclaim is different from normal system reclaim in that it is done
@@ -779,7 +779,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned 
long handle)
  * z3fold and the user, however.
  *
  * To avoid these, this is how z3fold_reclaim_page() should be called:
-
+ *
  * The user detects a page should be reclaimed and calls z3fold_reclaim_page().
  * z3fold_reclaim_page() will remove a z3fold page from the pool LRU list and
  * call the user-defined eviction handler with the pool and handle as
diff --git a/mm/zbud.c b/mm/zbud.c
index b42322e50f63..28458f7d1e84 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -466,7 +466,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 /**
  * zbud_reclaim_page() - evicts allocations from a pool page and frees it
  * @pool:  pool from which a page will attempt to be evicted
- * @retires:   number of pages on the LRU list for which eviction will
+ * @retries:   number of pages on the LRU list for which eviction will
  * be attempted before failing
  *
  * zbud reclaim is different 

Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

2018-01-23 Thread gengdongjiu
Hi James,

On 2018/1/23 3:39, James Morse wrote:
> Hi Dongjiu Geng,
> 
> (versions of patches 1,2 and 4 have been queued by Catalin)
> 
> (Nit 'ACPI / APEI:' is the normal subject prefix for ghes.c, this helps the
> maintainers know which patches they need to pay attention to when you are
> touching multiple trees)
> 
> On 06/01/18 16:02, Dongjiu Geng wrote:
>> ARMv8.2 requires implementation of the RAS extension.
> 
>> In
>> this extension, it adds SEI(SError Interrupt) notification
>> type, this patch adds new GHES error source SEI handling
>> functions. 
> 
> This reads as if this patch is handling SError RAS notifications generated by a
> CPU with the RAS extensions. These are about CPU->Software notifications. APEI
> and GHES are a firmware first mechanism which is Software->Software.
> Reading the v8.2 documents won't help anyone with the APEI/GHES code.
> 
> Please describe this from the ACPI view, "ACPI 6.x adds support for NOTIFY_SEI
> as a GHES notification mechanism... ",  its up to the arch code to spot a v8.2
> RAS Error based on the cpu caps.

Ok, I will modify it.

> 
> 
>> This error source parsing and handling method
>> is similar with the SEA.
> 
> There are problems with doing this:
> 
> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
> | How do SEA and SEI interact?
> |
> | As far as I can see they can both interrupt each other, which isn't something
> | the single in_nmi() path in APEI can handle. I think we should fix this
> | first.
> 
> [..]
> 
> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
> | from SEA, but not SEI. (what happens if an SError arrives while we are
> | queueing memory_failure work from an IRQ).
> |
> | The one that scares me is the trace-point reporting stuff. What happens if an
> | SError arrives while we are enabling a trace point? (these are static-keys
> | right?)
> |
> |  I don't think we can just plumb SEI in like this and be done with it.
> |  (I'm looking at teasing out the estatus cache code from being x86:NMI only.
> |  This way we solve the same 'cant do this from NMI context' with the same
> |  code'.)
> 
> 
> I will post what I've got for this estatus-cache thing as an RFC, its not ready
> to be considered yet.

Yes, I know you are doing that. Your patch series will take all of the above
into account, right? If it does, this patch can be based on your patchset.
Thanks.

> 
> 
>> Expose API ghes_notify_sei() to external users. External
>> modules can call this exposed API to parse APEI table and
>> handle the SEI notification.
> 
> external modules? You mean called by the arch code when it gets this
> NOTIFY_SEI?

Yes, it is called by the kernel arch code, as in the example below. I remember
discussing this with you.

asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
{
	nmi_enter();

	if (!ghes_notify_sei()) {
		/* leave NMI context before the early return */
		nmi_exit();
		return;
	}

	/* non-RAS errors are not containable */
	if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
		arm64_serror_panic(regs, esr);

	nmi_exit();
}
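
As a rough userspace model of this flow, the sketch below checks that every
nmi_enter() is paired with an nmi_exit() on both paths. All helpers here are
stand-in stubs with assumed behaviour, not the kernel implementations. In
particular, it is an assumption that a return of 0 from the stubbed
ghes_notify_sei() means the error record was handled, and serror_is_fatal()
is a made-up replacement for the two arm64 checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins only; none of this is the kernel code. */
static int nmi_depth;		/* models the NMI nesting count */
static bool fw_reported;	/* pretend firmware handed over an error record */
static bool panicked;		/* set instead of calling arm64_serror_panic() */

static void nmi_enter(void)
{
	nmi_depth++;
}

static void nmi_exit(void)
{
	assert(nmi_depth > 0);	/* catch an exit without a matching enter */
	nmi_depth--;
}

/* Stub: 0 is assumed to mean a GHES SEI source handled the error. */
static int ghes_notify_sei(void)
{
	return fw_reported ? 0 : -1;
}

/* Stub standing in for the arm64 RAS/fatal checks on esr. */
static bool serror_is_fatal(unsigned int esr)
{
	return esr != 0;
}

static void do_serror_model(unsigned int esr)
{
	nmi_enter();

	/* only look at esr when GHES did not handle the notification */
	if (ghes_notify_sei() != 0 && serror_is_fatal(esr))
		panicked = true;

	nmi_exit();	/* single exit point keeps enter/exit balanced */
}
```

Using one exit point is just one way to keep the pairing obvious; an explicit
nmi_exit() before an early return would work equally well.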

> 
> 
> Thanks,
> 
> James
> 
> .
> 
