Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable
On Tue, 23 Jan 2018, Michal Hocko wrote:

> > It can't, because the current patchset locks the system into a single
> > selection criteria that is unnecessary and the mount option would become a
> > no-op after the policy per subtree becomes configurable by the user as
> > part of the hierarchy itself.
>
> This is simply not true! OOM victim selection has changed in the
> past and will be always a subject to changes in future. Current
> implementation doesn't provide any externally controlable selection
> policy and therefore the default can be assumed. Whatever that default
> means now or in future. The only contract added here is the kill full
> memcg if selected and that can be implemented on _any_ selection policy.
>

The current implementation of memory.oom_group is based on top of a
selection implementation that is broken in three ways I have listed for
months:

 - allows users to intentionally/unintentionally evade the oom killer,
   requires not locking the selection implementation for the entire
   system, requires subtree control to prevent, makes a mount option
   obsolete, and breaks existing users who would use the implementation
   based on 4.16 if this were merged,

 - unfairly compares the root mem cgroup vs leaf mem cgroup such that
   users must structure their hierarchy only for 4.16 in such a way
   that _all_ processes are under hierarchical control and have no
   power to create sub cgroups because of the point above and
   completely breaks any user of oom_score_adj in a completely
   undocumented and unspecified way, such that fixing that breakage
   would also break any existing users who would use the implementation
   based on 4.16 if this were merged, and

 - does not allow userspace to protect important cgroups, which can be
   built on top.

I'm focused on fixing the breakage in the first two points since it
affects the API and we don't want to switch that out from the user.
I have brought these points up repeatedly and everybody else has actively
disengaged from development, so I'm proposing incremental changes that
give the cgroup aware oom killer a sustainable API and make it useful
beyond a highly specialized usecase where everything is containerized,
nobody can create subcgroups, and nobody uses oom_score_adj to break the
root mem cgroup accounting.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v9 6/7] arm64: kvm: Set Virtual SError Exception Syndrome for guest
Hi Dongjiu Geng, On 06/01/18 16:02, Dongjiu Geng wrote: > RAS Extension add a VSESR_EL2 register which can provide > the syndrome value reported to software on taking a virtual > SError interrupt exception. This patch supports to specify > this Syndrome. > > In the RAS Extensions we can not set all-zero syndrome value > for SError, which means 'RAS error: Uncategorized' instead of > 'no valid ISS'. So set it to IMPLEMENTATION DEFINED syndrome > by default. > > We also need to support userspace to specify a valid syndrome > value, Because in some case, the recovery is driven by userspace. > This patch can support that userspace specify it. > > In the guest/host world switch, restore this value to VSESR_EL2 > only when HCR_EL2.VSE is set. This value no need to be saved > because it is stale vale when guest exit. A version of this patch has been queued by Catalin. Now that the cpufeature bits are queued, I think this can be split up into two separate series for v4.16-rc1, one to tackle NOTIFY_SEI and the associated plumbing. The second for the KVM 'make SError pending' API. > Signed-off-by: Dongjiu Geng> [Set an impdef ESR for Virtual-SError] > Signed-off-by: James Morse I didn't sign-off this patch. If you pick some bits from another version and want to credit someone else you can 'CC:' them or just mention it in the commit-message. > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h > index 47b967d..3b035cc 100644 > --- a/arch/arm64/include/asm/sysreg.h > +++ b/arch/arm64/include/asm/sysreg.h > @@ -86,6 +86,9 @@ > #define REG_PSTATE_PAN_IMM sys_reg(0, 0, 4, 0, 4) > #define REG_PSTATE_UAO_IMM sys_reg(0, 0, 4, 0, 3) > > +/* virtual SError exception syndrome register */ > +#define REG_VSESR_EL2 sys_reg(3, 4, 5, 2, 3) Irrelevant-Nit: sys-regs usually have a 'SYS_' prefix, and are in instruction encoding order lower down the file. 
(These PSTATE PAN things are a bit odd as they were used to generate an
instruction before the fancy {read,write}_sysreg() helpers were added).

>  #define SET_PSTATE_PAN(x) __emit_inst(0xd500 | REG_PSTATE_PAN_IMM | \
>                                        (!!x)<<8 | 0x1f)
>  #define SET_PSTATE_UAO(x) __emit_inst(0xd500 | REG_PSTATE_UAO_IMM | \
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 738ae90..ffad42b 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -279,7 +279,16 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>
>  int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)

Bits of this are spread between patches 5 and 6. If you put them in the
other order this wouldn't happen. (but after a rebase most of this patch
should disappear)

>  {
> -    return -EINVAL;
> +    u64 reg = *syndrome;
> +
> +    /* inject virtual system Error or asynchronous abort */
> +    kvm_inject_vabt(vcpu);

So this writes an impdef ESR, because it's the existing code-path in KVM.

> +    if (reg)
> +        /* set vsesr_el2[24:0] with value that user space specified */
> +        kvm_vcpu_set_vsesr(vcpu, reg & ESR_ELx_ISS_MASK);

And then you overwrite it. Which is a bit odd as there is a helper to do
both in one go:

> +
> +    return 0;
>  }
>  int __attribute_const__ kvm_target_cpu(void)
> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
> index 3556715..fb94b5e 100644
> --- a/arch/arm64/kvm/inject_fault.c
> +++ b/arch/arm64/kvm/inject_fault.c
> @@ -246,14 +246,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>          inject_undef64(vcpu);
>  }
>
> +static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
> +{
> +    kvm_vcpu_set_vsesr(vcpu, esr);
> +    vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
> +}

How come you don't use this in kvm_arm_set_sei_esr()?

Thanks,

James
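For illustration only, the review comment above (do both steps through
pend_guest_serror() rather than calling kvm_inject_vabt() and then
overwriting the syndrome) could be sketched roughly as below. This is a
self-contained userspace mock, not kernel code: the reduced struct
kvm_vcpu, the constant values, and IMPDEF_SERROR_ESR as the default
syndrome are assumptions standing in for the real kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Mock stand-ins for kernel definitions; the values are assumptions. */
#define HCR_VSE           (UINT64_C(1) << 8)
#define ESR_ELx_ISS_MASK  UINT64_C(0x01ffffff)   /* ISS is bits [24:0] */
#define IMPDEF_SERROR_ESR (UINT64_C(1) << 24)    /* hypothetical default syndrome */

/* Hypothetical vcpu, reduced to the two registers this sketch touches. */
struct kvm_vcpu {
    uint64_t hcr_el2;
    uint64_t vsesr_el2;
};

/* Same shape as the helper in the quoted patch: set the syndrome, pend the SError. */
void pend_guest_serror(struct kvm_vcpu *vcpu, uint64_t esr)
{
    vcpu->vsesr_el2 = esr;
    vcpu->hcr_el2 |= HCR_VSE;
}

/*
 * One call to the helper: use the user-supplied ISS bits when a non-zero
 * syndrome is given, otherwise fall back to the impdef default.
 */
int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, const uint32_t *syndrome)
{
    uint64_t esr = *syndrome ? (*syndrome & ESR_ELx_ISS_MASK)
                             : IMPDEF_SERROR_ESR;

    pend_guest_serror(vcpu, esr);
    return 0;
}
```

With this shape VSESR_EL2 is written once per call, so there is no window
in which the impdef value is set and then replaced.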
Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl
Hi Dongjiu Geng,

On 06/01/18 16:02, Dongjiu Geng wrote:
> The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the
> guest and user space needs a way to tell KVM this value. So we add a
> new ioctl. Before user space specifies the Exception Syndrome Register
> ESR(ESR), it firstly checks that whether KVM has the capability to
> set the guest ESR, If has, will set it. Otherwise, nothing to do.
>
> For this ESR specifying, Only support for AArch64, not support AArch32.

After this patch user-space can trigger an SError in the guest. If it
wants to migrate the guest, how does the pending SError get migrated?

I think we need to fix migration first. Andrew Jones suggested using
KVM_GET/SET_VCPU_EVENTS:
https://www.spinics.net/lists/arm-kernel/msg616846.html

Given KVM uses kvm_inject_vabt() on v8.0 hardware too, we should cover
systems without the v8.2 RAS Extensions with the same API. I think this
means a bit to read/write whether SError is pending, and another to
indicate the ESR should be set/read.
CPUs without the v8.2 RAS Extensions can reject a pending SError that has
an ESR.

user-space can then use the 'for migration' calls to make a 'new' SError
pending.

Now that the cpufeature bits are queued, I think this can be split up
into two separate series for v4.16-rc1, one to tackle NOTIFY_SEI and the
associated plumbing. The second for the KVM 'make SError pending' API.

> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 5c7f657..738ae90 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>      return -EINVAL;
>  }
>
> +int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome)
> +{
> +    return -EINVAL;
> +}

Does nothing in the patch that adds the support? This is a bit odd.
(oh, it's hiding in patch 6...)
Thanks,

James
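As a rough sketch of the migration interface described in this review
(one bit for "SError pending", one for "ESR valid", with CPUs lacking the
v8.2 RAS Extensions rejecting a pending SError that carries an ESR),
something like the following might work. Everything here is hypothetical:
struct serror_event, struct vcpu_state, the has_ras flag and both function
names are invented for illustration, and the thread does not define the
actual KVM_GET/SET_VCPU_EVENTS layout.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define ESR_ELx_ISS_MASK UINT64_C(0x01ffffff)   /* ISS is bits [24:0] */

/* Hypothetical event state exchanged with user-space for migration. */
struct serror_event {
    bool     pending;   /* an SError is pending for the guest */
    bool     has_esr;   /* esr is valid and should be set/read */
    uint64_t esr;       /* syndrome; only ISS bits may be set */
};

/* Hypothetical vcpu state, reduced to what the sketch needs. */
struct vcpu_state {
    bool     has_ras;          /* CPU implements the v8.2 RAS Extensions */
    bool     serror_pending;
    uint64_t vsesr;
};

/* "Set" side: validate everything first, then make the SError pending. */
int set_serror_event(struct vcpu_state *vcpu, const struct serror_event *ev)
{
    if (!ev->pending)
        return 0;                       /* nothing to make pending */

    if (ev->has_esr) {
        if (!vcpu->has_ras)
            return -EINVAL;             /* no VSESR_EL2 to hold the ESR */
        if (ev->esr & ~ESR_ELx_ISS_MASK)
            return -EINVAL;             /* bits outside the ISS field */
        vcpu->vsesr = ev->esr;
    }
    /* Without an ESR the syndrome is left to the implementation default. */
    vcpu->serror_pending = true;
    return 0;
}

/* "Get" side: report what the destination needs to recreate the state. */
void get_serror_event(const struct vcpu_state *vcpu, struct serror_event *ev)
{
    ev->pending = vcpu->serror_pending;
    ev->has_esr = vcpu->serror_pending && vcpu->has_ras;
    ev->esr = ev->has_esr ? vcpu->vsesr : 0;
}
```

User-space could then read the event on the source, write it on the
destination, and use the same "set" call to make a new SError pending.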
Re: [PATCH 3/3] s390/cmf: fix kerneldoc
On Tue, 23 Jan 2018, Cornelia Huck wrote:

> Make sure we use proper Return sections, and make the output
> for cmf_enable() less odd.
>
> Signed-off-by: Cornelia Huck
> ---
> +++ b/drivers/s390/cio/cmf.c
> @@ -1118,9 +1118,10 @@ int ccw_set_cmf(struct ccw_device *cdev, int enable)
>   * enable_cmf() - switch on the channel measurement for a specific device
>   * @cdev: The ccw device to be enabled
>   *
> - * Returns %0 for success or a negative error value.
> - * Note: If this is called on a device for which channel measurement is already
> - * enabled a reset of the measurement data is triggered.
> + * Enable channel measurements for @cdev. If this is called on a device
> + * for which channel measurement is already enabled a reset of the
> + * measurement data is triggered.
> + * Returns: %0 for success or a negative error value
                                                      ^
I took the liberty of re-adding the dot at the end.

Applied. Thanks!
Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable
On Mon 22-01-18 14:34:39, David Rientjes wrote:
> On Sat, 20 Jan 2018, Tejun Heo wrote:
[...]
> > I don't see any blocker here. The issue you're raising can and should
> > be handled separately.
> >
>
> It can't, because the current patchset locks the system into a single
> selection criteria that is unnecessary and the mount option would become a
> no-op after the policy per subtree becomes configurable by the user as
> part of the hierarchy itself.

This is simply not true! OOM victim selection has changed in the
past and will always be subject to change in the future. The current
implementation doesn't provide any externally controllable selection
policy and therefore the default can be assumed, whatever that default
means now or in the future. The only contract added here is to kill the
full memcg if selected, and that can be implemented on _any_ selection
policy.
--
Michal Hocko
SUSE Labs
[PATCH 6/8] thermal/drivers/cpu_cooling: Add idle cooling device documentation
Provide some documentation for the idle injection cooling effect in order to let people to understand the rational of the approach for the idle injection CPU cooling device. Signed-off-by: Daniel Lezcano--- Documentation/thermal/cpu-idle-cooling.txt | 165 + 1 file changed, 165 insertions(+) create mode 100644 Documentation/thermal/cpu-idle-cooling.txt diff --git a/Documentation/thermal/cpu-idle-cooling.txt b/Documentation/thermal/cpu-idle-cooling.txt new file mode 100644 index 000..29fc651 --- /dev/null +++ b/Documentation/thermal/cpu-idle-cooling.txt @@ -0,0 +1,165 @@ + +Situation: +-- + +Under certain circumstances, the SoC reaches a temperature exceeding +the allocated power budget or the maximum temperature limit. The +former must be mitigated to stabilize the SoC temperature around the +temperature control using the defined cooling devices, the latter is a +catastrophic situation where a radical decision must be taken to +reduce the temperature under the critical threshold, that can impact +the performances. + +Another situation is when the silicon reaches a certain temperature +which continues to increase even if the dynamic leakage is reduced to +its minimum by clock gating the component. The runaway phenomena will +continue with the static leakage and only powering down the component, +thus dropping the dynamic and static leakage will allow the component +to cool down. This situation is critical. + +Last but not least, the system can ask for a specific power budget but +because of the OPP density, we can only choose an OPP with a power +budget lower than the requested one and underuse the CPU, thus losing +performances. In other words, one OPP under uses the CPU with a power +lesser than the power budget and the next OPP exceed the power budget, +an intermediate OPP could have been used if it were present. 
+ +Solutions: +-- + +If we can remove the static and the dynamic leakage for a specific +duration in a controlled period, the SoC temperature will +decrease. Acting at the idle state duration or the idle cycle +injection period, we can mitigate the temperature by modulating the +power budget. + +The Operating Performance Point (OPP) density has a great influence on +the control precision of cpufreq, however different vendors have a +plethora of OPP density, and some have large power gap between OPPs, +that will result in loss of performance during thermal control and +loss of power in other scenes. + +At a specific OPP, we can assume injecting idle cycle on all CPUs, +belonging to the same cluster, with a duration greater than the +cluster idle state target residency, we drop the static and the +dynamic leakage for this period (modulo the energy needed to enter +this state). So the sustainable power with idle cycles has a linear +relation with the OPP’s sustainable power and can be computed with a +coefficient similar to: + + Power(IdleCycle) = Coef x Power(OPP) + +Idle Injection: +--- + +The base concept of the idle injection is to force the CPU to go to an +idle state for a specified time each control cycle, it provides +another way to control CPU power and heat in addition to +cpufreq. Ideally, if all CPUs of a cluster inject idle synchronously, +this cluster can get into the deepest idle state and achieve minimum +power consumption, but that will also increase system response latency +if we inject less than cpuidle latency. + + ^ + | + | + |--- --- --- + |___|_|___|_|___|___ + + <-> + idle <> + running + +With the fixed idle injection duration, we can give a value which is +an acceptable performance drop off or latency when we reach a specific +temperature and we begin to mitigate by varying the Idle injection +period. + +The mitigation begins with a maximum period value which decrease when +more cooling effect is requested. 
When the period duration is equal to +the idle duration, then we are in a situation the platform can’t +dissipate the heat enough and the mitigation fails. In this case the +situation is considered critical and there is nothing to do. The idle +injection duration must be changed by configuration and until we reach +the cooling effect, otherwise an additionnal cooling device must be +used or ultimately decrease the SoC performance by dropping the +highest OPP point of the SoC. + +The idle injection duration value must comply with the constraints: + +- It is lesser or equal to the latency we tolerate when the mitigation + begins. It is platform dependent and will depend on the user + experience, reactivity vs performance trade off we want. This value + should be specified. + +- It is greater than the idle state’s target residency we want to go + for thermal mitigation, otherwise we end up consuming more energy. + +Minimum period +-- + +The idle
Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable
On Wed 17-01-18 14:18:33, David Rientjes wrote: > On Wed, 17 Jan 2018, Michal Hocko wrote: > > > Absolutely agreed! And moreover, there are not all that many ways what > > to do as an action. You just kill a logical entity - be it a process or > > a logical group of processes. But you have way too many policies how > > to select that entity. Do you want to chose the youngest process/group > > because all the older ones have been computing real stuff and you would > > lose days of your cpu time? Or should those who pay more should be > > protected (aka give them static priorities), or you name it... > > > > That's an argument for making the interface extensible, yes. And there is no interface to control the selection yet so we can develop one on top. > > I am sorry, I still didn't grasp the full semantic of the proposed > > soluton but the mere fact it is starting by conflating selection and the > > action is a no go and a wrong API. This is why I've said that what you > > (David) outlined yesterday is probably going to suffer from a much > > longer discussion and most likely to be not acceptable. Your patchset > > proves me correct... > > I'm very happy to change the API if there are better suggestions. That > may end up just being an memory.oom_policy file, as this implements, and > separating out a new memory.oom_action that isn't a boolean value to > either do a full group kill or only a single process. Or it could be what > I suggested in my mail to Tejun, such as "hierarchy killall" written to > memory.oom_policy, which would specify a single policy and then an > optional mechanism. With my proposed patchset, there would then be three > policies: "none", "cgroup", and "tree" and one possible optional > mechanism: "killall". You haven't convinced me at all. This all sounds more like "what if" than a really thought through interface. I've tried to point out that having a real policy driven victim selection is a _hard_ thing to do _right_. 
On the other hand oom_group makes semantic sense. It controls the
killable entity and there are usecases which want to consider the full
memcg as a single killable entity, no matter what selection policy we
choose on top. It is just a natural API.

Now you keep arguing about the victim selection and different strategies
to implement it. We will not move forward as long as you keep conflating
the two things, I am afraid.
--
Michal Hocko
SUSE Labs
[PATCH 1/3] s390/docs: mention subchannel types
Since the original inception of the s390-drivers document, the common I/O layer has grown support for more types of subchannels. Give at least a pointer for the various types. Signed-off-by: Cornelia Huck--- Documentation/driver-api/s390-drivers.rst | 19 +++ 1 file changed, 19 insertions(+) diff --git a/Documentation/driver-api/s390-drivers.rst b/Documentation/driver-api/s390-drivers.rst index ecf8851d3565..42350f21357d 100644 --- a/Documentation/driver-api/s390-drivers.rst +++ b/Documentation/driver-api/s390-drivers.rst @@ -22,9 +22,28 @@ While most I/O devices on a s390 system are typically driven through the channel I/O mechanism described here, there are various other methods (like the diag interface). These are out of the scope of this document. +The s390 common I/O layer also provides access to some devices that are +not strictly considered I/O devices. They are considered here as well, +although they are not the focus of this document. + Some additional information can also be found in the kernel source under Documentation/s390/driver-model.txt. +The css bus +=== + +The css bus contains the subchannels available on the system. They fall +into several categories: + +* Standard I/O subchannels, for use by the system. They have a child + device on the ccw bus and are described below. +* I/O subchannels bound to the vfio-ccw driver. See + Documentation/s390/vfio-ccw.txt. +* Message subchannels. No Linux driver currently exists. +* CHSC subchannels (at most one). The chsc subchannel driver can be used + to send asynchronous chsc commands. +* eADM subchannels. Used for talking to storage class memory. + The ccw bus === -- 2.13.6
[PATCH 2/3] s390/docs: reword airq section
Also mention the iv helpers as well. Signed-off-by: Cornelia Huck--- Documentation/driver-api/s390-drivers.rst | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/Documentation/driver-api/s390-drivers.rst b/Documentation/driver-api/s390-drivers.rst index 42350f21357d..30e6aa7e160b 100644 --- a/Documentation/driver-api/s390-drivers.rst +++ b/Documentation/driver-api/s390-drivers.rst @@ -121,10 +121,15 @@ ccw group devices Generic interfaces == -Some interfaces are available to other drivers that do not necessarily -have anything to do with the busses described above, but still are -indirectly using basic infrastructure in the common I/O layer. One -example is the support for adapter interrupts. +The following section contains interfaces in use not only by drivers +dealing with ccw devices, but drivers for various other s390 hardware +as well. + +Adapter interrupts +-- + +The common I/O layer provides helper functions for dealing with adapter +interrupts and interrupt vectors. .. kernel-doc:: drivers/s390/cio/airq.c :export: -- 2.13.6
[PATCH 3/3] s390/cmf: fix kerneldoc
Make sure we use proper Return sections, and make the output for cmf_enable() less odd. Signed-off-by: Cornelia Huck--- drivers/s390/cio/cmf.c | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/s390/cio/cmf.c b/drivers/s390/cio/cmf.c index 5e495c62cfa7..6e4f50d5b655 100644 --- a/drivers/s390/cio/cmf.c +++ b/drivers/s390/cio/cmf.c @@ -1118,9 +1118,10 @@ int ccw_set_cmf(struct ccw_device *cdev, int enable) * enable_cmf() - switch on the channel measurement for a specific device * @cdev: The ccw device to be enabled * - * Returns %0 for success or a negative error value. - * Note: If this is called on a device for which channel measurement is already - * enabled a reset of the measurement data is triggered. + * Enable channel measurements for @cdev. If this is called on a device + * for which channel measurement is already enabled a reset of the + * measurement data is triggered. + * Returns: %0 for success or a negative error value * Context: *non-atomic */ @@ -1160,7 +1161,7 @@ int enable_cmf(struct ccw_device *cdev) * __disable_cmf() - switch off the channel measurement for a specific device * @cdev: The ccw device to be disabled * - * Returns %0 for success or a negative error value. + * Returns: %0 for success or a negative error value. * * Context: *non-atomic, device_lock() held. @@ -1184,7 +1185,7 @@ int __disable_cmf(struct ccw_device *cdev) * disable_cmf() - switch off the channel measurement for a specific device * @cdev: The ccw device to be disabled * - * Returns %0 for success or a negative error value. + * Returns: %0 for success or a negative error value. * * Context: *non-atomic @@ -1205,7 +1206,7 @@ int disable_cmf(struct ccw_device *cdev) * @cdev: the channel to be read * @index: the index of the value to be read * - * Returns the value read or %0 if the value cannot be read. + * Returns: The value read or %0 if the value cannot be read. 
* * Context: *any @@ -1220,7 +1221,7 @@ u64 cmf_read(struct ccw_device *cdev, int index) * @cdev: the channel to be read * @data: a pointer to a data block that will be filled * - * Returns %0 on success, a negative error value otherwise. + * Returns: %0 on success, a negative error value otherwise. * * Context: *any -- 2.13.6
[PATCH 0/3] s390: documentation update
The first two are updates that I had around for a long time, but somehow
forgot to send them out. While looking at the result of make pdfdocs, I
found that the output for the cmf stuff looked a bit suboptimal, so I
added a bonus patch for that.

A branch based on the s390 features branch can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/cohuck/linux.git s390-doc-update

Cornelia Huck (3):
  s390/docs: mention subchannel types
  s390/docs: reword airq section
  s390/cmf: fix kerneldoc

 Documentation/driver-api/s390-drivers.rst | 32 +++
 drivers/s390/cio/cmf.c                    | 15 ---
 2 files changed, 36 insertions(+), 11 deletions(-)

--
2.13.6
Re: [PATCH] acpi, spcr: Make SPCR available to x86
On 01/22/2018 04:49 PM, Timur Tabi wrote:
> On 01/18/2018 09:09 AM, Prarit Bhargava wrote:
>>       if (acpi_disabled) {
>> -        if (earlycon_init_is_deferred)
>> +        if (earlycon_acpi_spcr_enable)
>
> This patch works for me, so I can ACK it, but first you might want to rename
> earlycon_acpi_spcr_enable, because these two lines don't make much sense.
>
> "If ACPI is disabled, and ACPI SPCR is enabled, then "
>
> If ACPI is disabled, then how can a variable called "earlycon_acpi_spcr_enable"
> be true?
>
> Would it make more sense to rename it to earlycon_spcr_enable?
>

acpi_disabled is a global runtime flag that can be set via "acpi=off" on
the command line. It does not disable the tables, only the reading and
interpreting of the data in the tables.

P.
Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8
Sorry, fixing a typo.

On 2018/1/23 17:23, gengdongjiu wrote:
>> There are problems with doing this:
>>
>> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
>> | How do SEA and SEI interact?
>> |
>> | As far as I can see they can both interrupt each other, which isn't something
>> | the single in_nmi() path in APEI can handle. I thinks we should fix this
>> | first.
>>
>> [..]
>>
>> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
>> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
>> | from SEA, but not SEI. (what happens if an SError arrives while we are
>> | queueing memory_failure work from an IRQ).
>> |
>> | The one that scares me is the trace-point reporting stuff. What happens if an
>> | SError arrives while we are enabling a trace point? (these are static-keys
>> | right?)
>> |
>> | I don't think we can just plumb SEI in like this and be done with it.
>> | (I'm looking at teasing out the estatus cache code from being x86:NMI only.
>> | This way we solve the same 'cant do this from NMI context' with the same
>> | code'.)
>>
>>
>> I will post what I've got for this estatus-cache thing as an RFC, its not ready
>> to be considered yet.

Yes, I know you are doing that. Your series will take all of the above
into account, right? If so, this patch can be based on your patchset.
Thanks.
[PATCH 3/3] mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors
Signed-off-by: Mike Rapoport--- mm/pagewalk.c | 1 + mm/process_vm_access.c | 2 ++ mm/vmscan.c| 1 + 3 files changed, 4 insertions(+) diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 23a3e415ac2c..8d2da5dec1e0 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -265,6 +265,7 @@ static int __walk_page_range(unsigned long start, unsigned long end, * pte_entry(), and/or hugetlb_entry(). If you don't set up for some of these * callbacks, the associated entries/pages are just ignored. * The return values of these callbacks are commonly defined like below: + * * - 0 : succeeded to handle the current entry, and if you don't reach the * end address yet, continue to walk. * - >0 : succeeded to handle the current entry, and return to the caller diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c index 011edefd3c92..1a27c837f004 100644 --- a/mm/process_vm_access.c +++ b/mm/process_vm_access.c @@ -147,6 +147,7 @@ static int process_vm_rw_single_vec(unsigned long addr, * @riovcnt: size of rvec array * @flags: currently unused * @vm_write: 0 if reading from other process, 1 if writing to other process + * * Returns the number of bytes read/written or error code. May * return less bytes than expected if an error occurs during the copying * process. @@ -253,6 +254,7 @@ static ssize_t process_vm_rw_core(pid_t pid, struct iov_iter *iter, * @riovcnt: size of rvec array * @flags: currently unused * @vm_write: 0 if reading from other process, 1 if writing to other process + * * Returns the number of bytes read/written or error code. May * return less bytes than expected if an error occurs during the copying * process. diff --git a/mm/vmscan.c b/mm/vmscan.c index 47d5ced51f2d..8d01b095d97b 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1606,6 +1606,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, * found will be decremented. * * Restrictions: + * * (1) Must be called with an elevated refcount on the page. 
This is a * fundamentnal difference from isolate_lru_pages (which is called * without a stable reference). -- 2.7.4
[PATCH 1/3] mm: docs: fixup punctuation
so that kernel-doc will properly recognize the parameter and function
descriptions.

Signed-off-by: Mike Rapoport
---
 mm/ksm.c            |  2 +-
 mm/memcontrol.c     |  4 ++--
 mm/mlock.c          |  2 +-
 mm/nommu.c          |  2 +-
 mm/sparse-vmemmap.c |  4 ++--
 mm/zpool.c          | 44 ++--
 6 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index be8f4576f842..64207d936659 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2309,7 +2309,7 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)

 /**
  * ksm_do_scan - the ksm scanner main worker function.
- * @scan_npages - number of pages we want to scan before we return.
+ * @scan_npages: number of pages we want to scan before we return.
  */
 static void ksm_do_scan(unsigned int scan_npages)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ac2ffd5e02b9..cddea3ed8e86 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5885,8 +5885,8 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)

 /**
  * mem_cgroup_uncharge_skmem - uncharge socket memory
- * @memcg - memcg to uncharge
- * @nr_pages - number of pages to uncharge
+ * @memcg: memcg to uncharge
+ * @nr_pages: number of pages to uncharge
  */
 void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
diff --git a/mm/mlock.c b/mm/mlock.c
index 30472d438794..3d2e834a6cb7 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -157,7 +157,7 @@ static void __munlock_isolation_failed(struct page *page)

 /**
  * munlock_vma_page - munlock a vma page
- * @page - page to be unlocked, either a normal page or THP page head
+ * @page: page to be unlocked, either a normal page or THP page head
  *
  * returns the size of the page as a page mask (0 for normal page,
  * HPAGE_PMD_NR - 1 for THP head page)
diff --git a/mm/nommu.c b/mm/nommu.c
index 17c00d93de2e..52c14127a861 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1843,7 +1843,7 @@ int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 }

 /**
- * @access_remote_vm - access another process' address space
+ * access_remote_vm - access another process' address space
  * @mm:	the mm_struct of the target address space
  * @addr: start address to access
  * @buf: source or destination buffer
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 17acf01791fa..015ee4eb79bc 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -108,8 +108,8 @@ static unsigned long __meminit vmem_altmap_nr_free(struct vmem_altmap *altmap)

 /**
  * vmem_altmap_alloc - allocate pages from the vmem_altmap reservation
- * @altmap - reserved page pool for the allocation
- * @nr_pfns - size (in pages) of the allocation
+ * @altmap: reserved page pool for the allocation
+ * @nr_pfns: size (in pages) of the allocation
  *
  * Allocations are aligned to the size of the request
  */
diff --git a/mm/zpool.c b/mm/zpool.c
index fd3ff719c32c..2e6f9c3cebe7 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -100,7 +100,7 @@ static void zpool_put_driver(struct zpool_driver *driver)

 /**
  * zpool_has_pool() - Check if the pool driver is available
- * @type The type of the zpool to check (e.g. zbud, zsmalloc)
+ * @type: The type of the zpool to check (e.g. zbud, zsmalloc)
  *
  * This checks if the @type pool driver is available. This will try to load
  * the requested module, if needed, but there is no guarantee the module will
@@ -135,10 +135,10 @@ EXPORT_SYMBOL(zpool_has_pool);

 /**
  * zpool_create_pool() - Create a new zpool
- * @type The type of the zpool to create (e.g. zbud, zsmalloc)
- * @name The name of the zpool (e.g. zram0, zswap)
- * @gfp	The GFP flags to use when allocating the pool.
- * @ops	The optional ops callback.
+ * @type: The type of the zpool to create (e.g. zbud, zsmalloc)
+ * @name: The name of the zpool (e.g. zram0, zswap)
+ * @gfp: The GFP flags to use when allocating the pool.
+ * @ops: The optional ops callback.
  *
  * This creates a new zpool of the specified type. The gfp flags will be
  * used when allocating memory, if the implementation supports it. If the
@@ -199,7 +199,7 @@ struct zpool *zpool_create_pool(const char *type, const char *name, gfp_t gfp,

 /**
  * zpool_destroy_pool() - Destroy a zpool
- * @pool The zpool to destroy.
+ * @pool: The zpool to destroy.
  *
  * Implementations must guarantee this to be thread-safe,
  * however only when destroying different pools. The same
@@ -222,7 +222,7 @@ void zpool_destroy_pool(struct zpool *zpool)

 /**
  * zpool_get_type() - Get the type of the zpool
- * @pool The zpool to check
+ * @pool: The zpool to check
  *
  * This returns the type of the pool.
  *
@@ -237,10 +237,10 @@ const char *zpool_get_type(struct zpool *zpool)

 /**
  * zpool_malloc() - Allocate memory
- * @pool
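
For reference, the comment layout that the hunks above converge on can be sketched with a made-up function. sum_pages() and its parameters are hypothetical and do not exist anywhere in mm/; only the shape of the comment is the point: the function name followed by " - " and a short description, and each parameter written as "@name:" followed by its description.

```c
#include <stddef.h>

/**
 * sum_pages() - Add up an array of page counts
 * @counts: array of page counts to add up
 * @nr: number of entries in @counts
 *
 * Hypothetical helper, shown only for the comment layout. Note the
 * colon after every parameter name and the absence of an "@" in front
 * of the function name on the first line.
 *
 * Return: the sum of the first @nr entries of @counts.
 */
static unsigned long sum_pages(const unsigned long *counts, size_t nr)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		total += counts[i];

	return total;
}
```

A line such as " * @counts - array of page counts" or " * @counts array of page counts" is the kind of punctuation this patch replaces.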
[PATCH 0/3] mm: docs: trivial fixes
Hi,

These are some trivial fixes to the kernel-doc descriptions.

Mike Rapoport (3):
  mm: docs: fixup punctuation
  mm: docs: fix parameter names mismatch
  mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors

 mm/bootmem.c           |  2 +-
 mm/ksm.c               |  2 +-
 mm/maccess.c           |  2 +-
 mm/memcontrol.c        |  6 +++---
 mm/mlock.c             |  2 +-
 mm/nommu.c             |  2 +-
 mm/pagewalk.c          |  1 +
 mm/process_vm_access.c |  4 +++-
 mm/sparse-vmemmap.c    |  4 ++--
 mm/swap.c              |  4 ++--
 mm/vmscan.c            |  1 +
 mm/z3fold.c            |  4 ++--
 mm/zbud.c              |  4 ++--
 mm/zpool.c             | 46 +++--
 14 files changed, 44 insertions(+), 40 deletions(-)

-- 
2.7.4
[PATCH 2/3] mm: docs: fix parameter names mismatch
There are several places where parameter descriptions do not match the
actual code. Fix it.

Signed-off-by: Mike Rapoport
---
 mm/bootmem.c           |  2 +-
 mm/maccess.c           |  2 +-
 mm/memcontrol.c        |  2 +-
 mm/process_vm_access.c |  2 +-
 mm/swap.c              |  4 ++--
 mm/z3fold.c            |  4 ++--
 mm/zbud.c              |  4 ++--
 mm/zpool.c             | 20 ++--
 8 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 6aef64254203..9e197987b67d 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -410,7 +410,7 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,

 /**
  * free_bootmem - mark a page range as usable
- * @addr: starting physical address of the range
+ * @physaddr: starting physical address of the range
  * @size: size of the range in bytes
  *
  * Partial pages will be considered reserved and left as they are.
diff --git a/mm/maccess.c b/mm/maccess.c
index 78f9274dd49d..ec00be51a24f 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -70,7 +70,7 @@ EXPORT_SYMBOL_GPL(probe_kernel_write);
  * strncpy_from_unsafe: - Copy a NUL terminated string from unsafe address.
  * @dst: Destination address, in kernel space. This buffer must be at
  * least @count bytes long.
- * @src: Unsafe address.
+ * @unsafe_addr: Unsafe address.
  * @count: Maximum number of bytes to copy, including the trailing NUL.
  *
  * Copies a NUL-terminated string from unsafe address to kernel buffer.
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cddea3ed8e86..0975cde3e83b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -946,7 +946,7 @@ int mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
 /**
  * mem_cgroup_page_lruvec - return lruvec for isolating/putting an LRU page
  * @page: the page
- * @zone: zone of the page
+ * @pgdat: pgdat of the page
  *
  * This function is only safe when following the LRU page isolation
  * and putback protocol: the LRU lock must be held, and the page must
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 8973cd231ece..011edefd3c92 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -25,7 +25,7 @@
 /**
  * process_vm_rw_pages - read/write pages from task specified
  * @pages: array of pointers to pages we want to copy
- * @start_offset: offset in page to start copying from/to
+ * @offset: offset in page to start copying from/to
  * @len: number of bytes to copy
  * @iter: where to copy to/from locally
  * @vm_write: 0 means copy from, 1 means copy to
diff --git a/mm/swap.c b/mm/swap.c
index 38e1b6374a97..2e8dae403474 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -913,11 +913,11 @@ EXPORT_SYMBOL(__pagevec_lru_add);
  * @pvec: Where the resulting entries are placed
  * @mapping: The address_space to search
  * @start: The starting entry index
- * @nr_entries:	The maximum number of entries
+ * @nr_pages: The maximum number of pages
  * @indices: The cache indices corresponding to the entries in @pvec
  *
  * pagevec_lookup_entries() will search for and return a group of up
- * to @nr_entries pages and shadow entries in the mapping. All
+ * to @nr_pages pages and shadow entries in the mapping. All
  * entries are placed in @pvec. pagevec_lookup_entries() takes a
  * reference against actual pages in @pvec.
  *
diff --git a/mm/z3fold.c b/mm/z3fold.c
index 39e19125d6a0..d589d318727f 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -769,7 +769,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 /**
  * z3fold_reclaim_page() - evicts allocations from a pool page and frees it
  * @pool: pool from which a page will attempt to be evicted
- * @retires: number of pages on the LRU list for which eviction will
+ * @retries: number of pages on the LRU list for which eviction will
  * be attempted before failing
  *
  * z3fold reclaim is different from normal system reclaim in that it is done
@@ -779,7 +779,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
  * z3fold and the user, however.
  *
  * To avoid these, this is how z3fold_reclaim_page() should be called:
-
+ *
  * The user detects a page should be reclaimed and calls z3fold_reclaim_page().
  * z3fold_reclaim_page() will remove a z3fold page from the pool LRU list and
  * call the user-defined eviction handler with the pool and handle as
diff --git a/mm/zbud.c b/mm/zbud.c
index b42322e50f63..28458f7d1e84 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -466,7 +466,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 /**
  * zbud_reclaim_page() - evicts allocations from a pool page and frees it
  * @pool: pool from which a page will attempt to be evicted
- * @retires: number of pages on the LRU list for which eviction will
+ * @retries: number of pages on the LRU list for which eviction will
  * be attempted before failing
  *
  * zbud reclaim is different
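
Mismatches like the ones fixed above are easy to miss by eye. A minimal sketch of how such a check could be automated follows; doc_param_name() and doc_param_known() are hypothetical helpers written for this illustration, not part of scripts/kernel-doc, and they only understand single-line " * @name: description" entries.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the parameter name from a kernel-doc line of
 * the form " * @name: description" into @out. Returns false if the line
 * is not a parameter line or if @out is too small for the name.
 */
static bool doc_param_name(const char *line, char *out, size_t outlen)
{
	size_t len = 0;

	/* Skip indentation, then require the leading '*' of the comment. */
	while (*line == ' ' || *line == '\t')
		line++;
	if (*line++ != '*')
		return false;

	/* Skip the gap after '*', then require the '@' marker. */
	while (*line == ' ' || *line == '\t')
		line++;
	if (*line++ != '@')
		return false;

	/* The name is a run of identifier characters ended by ':'. */
	while (isalnum((unsigned char)line[len]) || line[len] == '_')
		len++;
	if (len == 0 || len >= outlen || line[len] != ':')
		return false;

	memcpy(out, line, len);
	out[len] = '\0';
	return true;
}

/* Hypothetical helper: true if the documented name is a real parameter. */
static bool doc_param_known(const char *line,
			    const char *const *params, size_t nr)
{
	char name[64];
	size_t i;

	if (!doc_param_name(line, name, sizeof(name)))
		return false;

	for (i = 0; i < nr; i++)
		if (strcmp(name, params[i]) == 0)
			return true;

	return false;
}
```

With the parameter names { "physaddr", "size" } taken from the free_bootmem hunk, the old " * @addr: ..." line is rejected and the new " * @physaddr: ..." line is accepted. A line that uses " - " instead of ":" after the name is rejected as well, since it is not recognized as a parameter line at all.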
Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8
Hi James,

On 2018/1/23 3:39, James Morse wrote:
> Hi Dongjiu Geng,
>
> (versions of patches 1,2 and 4 have been queued by Catalin)
>
> (Nit 'ACPI / APEI:' is the normal subject prefix for ghes.c, this helps the
> maintainers know which patches they need to pay attention to when you are
> touching multiple trees)
>
> On 06/01/18 16:02, Dongjiu Geng wrote:
>> ARMv8.2 requires implementation of the RAS extension.
>
>> In
>> this extension, it adds SEI(SError Interrupt) notification
>> type, this patch adds new GHES error source SEI handling
>> functions.
>
> This reads as if this patch is handling SError RAS notifications generated by a
> CPU with the RAS extensions. These are about CPU->Software notifications. APEI
> and GHES are a firmware first mechanism which is Software->Software.
> Reading the v8.2 documents won't help anyone with the APEI/GHES code.
>
> Please describe this from the ACPI view, "ACPI 6.x adds support for NOTIFY_SEI
> as a GHES notification mechanism... ", its up to the arch code to spot a v8.2
> RAS Error based on the cpu caps.

OK, I will modify it.

>
>
>> This error source parsing and handling method
>> is similar with the SEA.
>
> There are problems with doing this:
>
> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
> | How do SEA and SEI interact?
> |
> | As far as I can see they can both interrupt each other, which isn't something
> | the single in_nmi() path in APEI can handle. I thinks we should fix this
> | first.
>
> [..]
>
> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
> | from SEA, but not SEI. (what happens if an SError arrives while we are
> | queueing memory_failure work from an IRQ).
> |
> | The one that scares me is the trace-point reporting stuff. What happens if an
> | SError arrives while we are enabling a trace point? (these are static-keys
> | right?)
> |
> | I don't think we can just plumb SEI in like this and be done with it.
> | (I'm looking at teasing out the estatus cache code from being x86:NMI only.
> | This way we solve the same 'cant do this from NMI context' with the same
> | code'.)
>
>
> I will post what I've got for this estatus-cache thing as an RFC, its not ready
> to be considered yet.

Yes, I know you are doing that. Your patch series will take all of the above
into account, right? If so, this patch can be based on your patchset. Thanks.

>
>
>> Expose API ghes_notify_sei() to external users. External
>> modules can call this exposed API to parse APEI table and
>> handle the SEI notification.
>
> external modules? You mean called by the arch code when it gets this
> NOTIFY_SEI?

Yes, it is called by the kernel arch code, as shown below. I remember
discussing this with you.

asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
{
	nmi_enter();

	if (!ghes_notify_sei())
		return;

	/* non-RAS errors are not containable */
	if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
		arm64_serror_panic(regs, esr);

	nmi_exit();
}

>
>
> Thanks,
>
> James
>
> .
>
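
One detail in the do_serror() snippet quoted in this reply: the early return taken when ghes_notify_sei() returns 0 leaves the function without calling nmi_exit(). A minimal user-space sketch of the same flow with enter and exit paired on every path is given below. Everything in it is a stand-in: the *_stub functions, the nmi_depth counter and the boolean arguments are hypothetical and only model the control flow, not the real arm64 or APEI code. It assumes, as the snippet does, that a return value of 0 means the error was handled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is kernel API. */
static int nmi_depth;			/* models NMI nesting depth */
static bool firmware_claims_error;	/* what the GHES stub reports */
static bool panicked;			/* set instead of really panicking */

static void nmi_enter_stub(void)
{
	nmi_depth++;
}

static void nmi_exit_stub(void)
{
	assert(nmi_depth > 0);	/* an exit must follow a matching enter */
	nmi_depth--;
}

/* Same convention as the posted snippet: 0 means the error was handled. */
static int ghes_notify_sei_stub(void)
{
	return firmware_claims_error ? 0 : -1;
}

static void serror_panic_stub(void)
{
	panicked = true;
}

/* Sketch of the entry point with enter/exit balanced on all paths. */
static void do_serror_sketch(bool is_ras, bool is_fatal)
{
	nmi_enter_stub();

	/* Only fall back to the arch handling if firmware did not claim it. */
	if (ghes_notify_sei_stub() != 0) {
		/* non-RAS errors are not containable */
		if (!is_ras || is_fatal)
			serror_panic_stub();
	}

	nmi_exit_stub();
}
```

Nesting the fallback under the check, rather than returning early, keeps a single exit point, so the pairing holds whichever way the notification goes.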