date:20171103

Re: [RFC v2] prctl: prctl(PR_SET_IDLE, PR_IDLE_MODE_KILLME), for stateless idle loops

2017-11-03 Thread Michal Hocko

On Thu 02-11-17 23:35:44, Shawn Landden wrote:
> It is common for services to be stateless around their main event loop.
> If a process sets PR_SET_IDLE to PR_IDLE_MODE_KILLME then it
> signals to the kernel that epoll_wait() and friends may not complete,
> and the kernel may send SIGKILL if resources get tight.
> 
> See my systemd patch: https://github.com/shawnl/systemd/tree/prctl
> 
> Android uses this memory model for all programs, and having it in the
> kernel will enable integration with the page cache (not in this
> series).
> 
> 16 bytes per process is kinda spendy, but I want to keep
> lru behavior, which mem_score_adj does not allow. When a supervisor,
> like Android's user input, is keeping track, this can be done in user-space.
> It could be pulled out of task_struct if an additional cross-indexing
> red-black tree is added to support pid-based lookup.

This is still an abuse and the patch is wrong. We really do have an API
to use; I fail to see why you do not use it.

[...]
> @@ -1018,6 +1060,24 @@ bool out_of_memory(struct oom_control *oc)
>   return true;
>   }
>  
> + /*
> +  * Check death row for current memcg or global.
> +  */
> + l = oom_target_get_queue(current);
> + if (!list_empty(l)) {
> + struct task_struct *ts = list_first_entry(l,
> + struct task_struct, se.oom_target_queue);
> +
> + pr_debug("Killing pid %u from EPOLL_KILLME death row.",
> +  ts->pid);
> +
> + /* We use SIGKILL instead of the oom killer
> +  * so as to cleanly interrupt ep_poll()
> +  */
> + send_sig(SIGKILL, ts, 1);
> + return true;
> + }

Still not NUMA aware and completely backwards. If this is a memcg OOM
then it is the _memcg_ that should be evaluated, not current. The OOM
might happen up the hierarchy due to a hard limit.

But still, you should be very clear about _why_ the existing oom tuning is
not appropriate, and we can think of a way to handle it better, but cramming
the oom selection in this way is simply not acceptable.
-- 
Michal Hocko
SUSE Labs
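
For context on the "API to use" mentioned in the reply above: the message does not name it, so the following assumes it refers to the per-process oom_score_adj knob in procfs. Under that assumption, a service could mark itself as a preferred OOM victim from user space. This is only a minimal sketch; the helper names are hypothetical and error handling is reduced to a return code.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write an OOM score adjustment to a procfs-style file.
 * The kernel accepts values from -1000 (never kill) to 1000 (kill first).
 * Returns 0 on success, -1 on failure.
 */
int sketch_write_oom_score_adj(const char *path, int adj)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	if (adj < -1000 || adj > 1000)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", adj) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/* Hypothetical helper: ask to be killed first while sitting in an idle loop. */
int sketch_mark_self_killable(void)
{
	return sketch_write_oom_score_adj("/proc/self/oom_score_adj", 1000);
}
```

A stateless service would call sketch_mark_self_killable() before blocking in its event loop and write a lower value again once it starts handling a request.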

Re: [PATCH v4 3/3] KVM: MMU: consider host cache mode in MMIO page check

2017-11-03 Thread Xiao Guangrong




On 11/03/2017 04:51 PM, Haozhong Zhang wrote:

On 11/03/17 14:54 +0800, Xiao Guangrong wrote:



On 11/03/2017 01:53 PM, Haozhong Zhang wrote:

Some reserved pages, such as those from NVDIMM DAX devices, are
not for MMIO, and can be mapped with cached memory type for better
performance. However, the above check misconceives those pages as
MMIO.  Because KVM maps MMIO pages with UC memory type, the
performance of guest accesses to those pages would be harmed.
Therefore, we check the host memory type by lookup_memtype() in
addition and only treat UC/UC- pages as MMIO.

Signed-off-by: Haozhong Zhang 
Reported-by: Cuevas Escareno, Ivan D 
Reported-by: Kumar, Karthik 
---
   arch/x86/kvm/mmu.c | 19 ++-
   1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 0b481cc9c725..e9ed0e666a83 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2708,7 +2708,24 @@ static bool mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
   static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
   {
if (pfn_valid(pfn))
-   return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
+   return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn)) &&
+   /*
+* Some reserved pages, such as those from
+* NVDIMM DAX devices, are not for MMIO, and
+* can be mapped with cached memory type for
+* better performance. However, the above
+* check misconceives those pages as MMIO.
+* Because KVM maps MMIO pages with UC memory
+* type, the performance of guest accesses to
+* those pages would be harmed. Therefore, we
+* check the host memory type in addition and
+* only treat UC/UC- pages as MMIO.
+*
+* pat_pfn_is_uc() works only when PAT is enabled,
+* so check pat_enabled() as well.
+*/
+   (!pat_enabled() ||
+pat_pfn_is_uc(kvm_pfn_t_to_pfn_t(pfn)));


Can it be compiled if !CONFIG_PAT?


Yes.

What I check via pat_enabled() is not only whether PAT support is
compiled, but also whether PAT is enabled at runtime.


The issue is about pat_pfn_is_uc(), which is implemented only if CONFIG_PAT is
enabled, but you use it here unconditionally.

I am not sure whether gcc is smart enough to omit the pat_pfn_is_uc() call
completely in this case. If you have actually tested compiling the kernel and
the KVM module with CONFIG_PAT disabled, it is fine.
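
The build concern raised here is usually handled with stub definitions, so that callers compile and link whether or not the option is set. The following is a user-space sketch of that pattern only, not the kernel's actual headers: SKETCH_CONFIG_PAT, the sketch_* functions and the enum are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the Kconfig symbol; define it to build the "real" path. */
/* #define SKETCH_CONFIG_PAT 1 */

enum sketch_cache_mode { SKETCH_WB, SKETCH_UC, SKETCH_UC_MINUS };

#ifdef SKETCH_CONFIG_PAT
static bool sketch_pat_enabled(void)
{
	return true;
}

static bool sketch_pat_pfn_is_uc(enum sketch_cache_mode cm)
{
	return cm == SKETCH_UC || cm == SKETCH_UC_MINUS;
}
#else
/* Stubs: callers do not depend on the optimizer dropping the call. */
static inline bool sketch_pat_enabled(void)
{
	return false;
}

static inline bool sketch_pat_pfn_is_uc(enum sketch_cache_mode cm)
{
	(void)cm;
	return false;
}
#endif

/* Same shape as the check in the patch: without PAT, reserved means MMIO. */
bool sketch_is_mmio(bool reserved, enum sketch_cache_mode cm)
{
	return reserved && (!sketch_pat_enabled() || sketch_pat_pfn_is_uc(cm));
}

/* Usage with the option off: the memory type is ignored. */
void sketch_is_mmio_demo(void)
{
	assert(sketch_is_mmio(true, SKETCH_WB));
	assert(!sketch_is_mmio(false, SKETCH_UC));
}
```

With the stubs in place, the question of whether the compiler eliminates the dead call no longer matters for correctness of the build.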

Re: [PATCH] s390/mm: fix pud table accounting

2017-11-03 Thread Michal Hocko

On Fri 03-11-17 10:05:51, Heiko Carstens wrote:
> With "mm: account pud page tables" and "mm: consolidate page table
> accounting" pud page table accounting was introduced which now results
> in tons of warnings like this one on s390:
> 
> BUG: non-zero pgtables_bytes on freeing mm: -16384
> 
> Reason for this are our run-time folded page tables: by default new
> processes start with three page table levels where the allocated pgd
> is the same as the first pud. In this case there won't ever be a pud
> allocated and therefore mm_inc_nr_puds() will also never be called.
> 
> However when freeing the address space free_pud_range() will call
> exactly once mm_dec_nr_puds() which leads to misaccounting.
> 
> Therefore call mm_inc_nr_puds() within init_new_context() to fix
> this. This is the same as what we already do for processes that run
> with two page table levels (aka compat processes).
> 
> While at it also adjust the comment, since there is no "mm->nr_pmds"
> anymore.

Subtle...

Thanks for the fix, I didn't have any idea about this when reviewing.

> Cc: Kirill A. Shutemov 
> Cc: Michal Hocko 
> Cc: Gerald Schaefer 
> Cc: Martin Schwidefsky 
> Signed-off-by: Heiko Carstens 
> ---
>  arch/s390/include/asm/mmu_context.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/include/asm/mmu_context.h 
> b/arch/s390/include/asm/mmu_context.h
> index 3c9abedc323c..4f943d58cbac 100644
> --- a/arch/s390/include/asm/mmu_context.h
> +++ b/arch/s390/include/asm/mmu_context.h
> @@ -43,6 +43,8 @@ static inline int init_new_context(struct task_struct *tsk,
>   mm->context.asce_limit = STACK_TOP_MAX;
>   mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
>  _ASCE_USER_BITS | _ASCE_TYPE_REGION3;
> + /* pgd_alloc() did not account this pud */
> + mm_inc_nr_puds(mm);
>   break;
>   case -PAGE_SIZE:
>   /* forked 5-level task, set new asce with new_mm->pgd */
> @@ -58,7 +60,7 @@ static inline int init_new_context(struct task_struct *tsk,
>   /* forked 2-level compat task, set new asce with new mm->pgd */
>   mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
>  _ASCE_USER_BITS | _ASCE_TYPE_SEGMENT;
> - /* pgd_alloc() did not increase mm->nr_pmds */
> + /* pgd_alloc() did not account this pmd */
>   mm_inc_nr_pmds(mm);
>   }
>   crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
> -- 
> 2.13.5
> 

-- 
Michal Hocko
SUSE Labs

Re: [PATCH v3 14/17] PCI: dwc: artpec6: Add support for endpoint mode

2017-11-03 Thread Niklas Cassel

On 11/02/2017 10:13 AM, Arnd Bergmann wrote:
> On Tue, Oct 31, 2017 at 11:39 PM, Niklas Cassel  
> wrote:
>> Signed-off-by: Niklas Cassel 
> 
> It seems like you are missing a changelog text. Please explain what
> your work is good for
> in any patch you send.

You are correct, this patch is missing a changelog text.
I will send a V4 of the patch series for this.

> 
>> V3:
>> * Removed ifdefs around match table and match table data.
>> * Removed ifdefs in probe, use dummy implementations instead.
> 
> I think there is room for more of the same ;-)
> 
>>
>> +#ifdef CONFIG_PCIE_ARTPEC6_HOST
>>  static void artpec6_pcie_enable_interrupts(struct artpec6_pcie 
>> *artpec6_pcie)
>>  {
>> struct dw_pcie *pci = artpec6_pcie->pci;
>> @@ -231,11 +257,92 @@ static int artpec6_add_pcie_port(struct artpec6_pcie 
>> *artpec6_pcie,
>>
>> return 0;
>>  }
>> +#else
>> +static inline int artpec6_add_pcie_port(struct artpec6_pcie *artpec6_pcie,
>> +   struct platform_device *pdev)
>> +{
>> +   return -ENODEV;
>> +}
>> +#endif
> 
> 
> Can you try replacing the #ifdef with
> 
> 
> if (!IS_ENABLED(CONFIG_PCIE_ARTPEC6_HOST))
>  return -ENODEV;
> 
> at the start of artpec6_pcie_enable_interrupts? I think that would improve
> readability here.
> 

artpec6_pcie_enable_interrupts is a void function, so
I guess that you meant at the start of artpec6_add_pcie_port.
That would not really help since artpec6_add_pcie_port
calls artpec6_pcie_msi_handler, and uses artpec6_pcie_host_ops,
which is still inside the CONFIG_PCIE_ARTPEC6_HOST ifdef block.

Please note that there are several functions, as well as
artpec6_pcie_host_ops inside the 
CONFIG_PCIE_ARTPEC6_HOST ifdef block.

The reason for this is because Bjorn was surprised that
this driver at V1 didn't have any ifdefs, even though
it supports two different modes: HOST and EP.
I suspected that his reasoning was that if you compile
the driver with only one of the modes, it is wasteful
to compile and include the functions that belong to the
mode that we are not using in the vmlinux.

>> +static int artpec6_add_pcie_ep(struct artpec6_pcie *artpec6_pcie,
>> +  struct platform_device *pdev)
>> +{
>> +   int ret;
>> +   struct dw_pcie_ep *ep;
>> +   struct resource *res;
>> +   struct device *dev = &pdev->dev;
>> +   struct dw_pcie *pci = artpec6_pcie->pci;
> 
> The same trick should work here with the other symbol.

While artpec6_add_pcie_ep doesn't call any other
function in this file, it does use pcie_ep_ops,
which does reference other functions in this file
(which are inside the ifdef block).

Regards,
Niklas

Re: [PATCH 11/17] coresight etr: Handle driver mode specific ETR buffers

2017-11-03 Thread Suzuki K Poulose


On 02/11/17 20:26, Mathieu Poirier wrote:

On Thu, Oct 19, 2017 at 06:15:47PM +0100, Suzuki K Poulose wrote:

Since the ETR could be driven either by SYSFS or by perf, it
becomes complicated how we deal with the buffers used for each
of these modes. The ETR driver cannot simply free the current
attached buffer without knowing the provider (i.e, sysfs vs perf).

To solve this issue, we provide:
1) the driver-mode specific etr buffer to be retained in the drvdata
2) the etr_buf for a session should be passed on when enabling the
hardware, which will be stored in drvdata->etr_buf. This will be
replaced (not free'd) as soon as the hardware is disabled, after
necessary sync operation.


If I get you right the problem you're trying to solve is what to do with a sysFS
buffer that hasn't been read (and freed) when a perf session is requested.  In
my opinion it should simply be freed.  Indeed the user probably doesn't care
much about that sysFS buffer, if it did the data would have been harvested.


Not only that. If we simply use drvdata->etr_buf, we cannot track the mode
which uses it. If we keep the etr_buf around, how does the new mode's user
decide how to free the existing one? (e.g., the perf etr_buf could be
associated with other perf data structures). This change would allow us to
leave the handling of the etr_buf to its respective modes.

And whether to keep the sysfs etr_buf around is a separate decision from the
above.


Cheers
Suzuki

Re: next-20171102: ARM64 dies on boot

2017-11-03 Thread Sudeep Holla



On 03/11/17 04:18, Yury Norov wrote:
> Hi all,
> 
> I reproduce it with qemu. The exact reason of panic is the NULL-dereference
> in memory_present:
> (gdb) bt
> #0  0x08dd8c6c in sparse_index_init (nid=<optimized out>, section_nr=<optimized out>)
> at mm/sparse.c:80
> #1  memory_present (nid=0, start=18446462598881083392, end=0) at mm/sparse.c:215
> #2  0x08dc518c in arm64_memory_present () at arch/arm64/mm/init.c:307
> #3  bootmem_init () at arch/arm64/mm/init.c:500
> #4  0x08dc28fc in setup_arch (cmdline_p=<optimized out>) at arch/arm64/kernel/setup.c:287
> #5  0x08dc083c in start_kernel () at init/main.c:530
> #6  0x in ?? ()
> 

[...]

> This is very early stage, so there's no messages in console.
> Config is attached. If no ideas, I can bisect it later.
> 

Reported and fixed[1], maybe not in -next yet.

-- 
Regards,
Sudeep

[1] https://marc.info/?l=linux-kernel&m=150962592016250&w=2

Re: [PATCH v2 0/2] pinctrl: Allow indicating loss of state across suspend/resume

2017-11-03 Thread Charles Keepax

On Thu, Nov 02, 2017 at 04:15:49PM -0700, Florian Fainelli wrote:
> Hello Linus,
> 
> It's me again, so I have been thinking about the problem originally
> reported in: [PATCH fixes v3] pinctrl: Really force states during 
> suspend/resume
> 
> and other similar patches a while ago, and this new version allows a platform
> using pinctrl-single to specify whether its pins are going to lose their state
> during a system deep sleep.
> 
> Note that this is still checked at the pinctrl_select_state() because 
> consumers
> of the pinctrl API might be calling this from their suspend/resume functions
> and should not have to know whether the provider does lose its pin states.
> 

Still feels to me like it should be the provider's job to
restore the state rather than expecting the consumer to
re-request any state it had. But let's wait and see what Linus
thinks.

Also not sure if you have seen this chain, but probably worth a
look:

https://www.spinics.net/lists/devicetree/msg200649.html

It is adding support to the GPIO code for controllers that can
have options to retain state across reset, not the same but
probably at least slightly related to this series.

Thanks,
Charles

Re: [PATCH 1/2] staging: greybus: remove unused kfifo_ts

2017-11-03 Thread Viresh Kumar

On 02-11-17, 15:32, Arnd Bergmann wrote:
> As of commit 8e1d6c336d74 ("greybus: loopback: drop bus aggregate
> calculation"), nothing ever reads from kfifo_ts, so there is no
> reason to write to it or even allocate it any more.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/staging/greybus/loopback.c | 27 +++
>  1 file changed, 3 insertions(+), 24 deletions(-)

Reviewed-by: Viresh Kumar 

-- 
viresh

Re: [Patch v11 1/4] irqchip/qeic: merge qeic init code from platforms to a common function

2017-11-03 Thread Marc Zyngier

On 01/11/17 01:36, Zhao Qiang wrote:
> The codes of qe_ic init from a variety of platforms are redundant,
> merge them to a common function and put it to irqchip/irq-qeic.c
> 
> For non-p1021_mds mpc85xx_mds boards, use "qe_ic_init(np, 0,
> qe_ic_cascade_low_mpic, qe_ic_cascade_high_mpic);" instead of
> "qe_ic_init(np, 0, qe_ic_cascade_muxed_mpic, NULL);".
> 
> qe_ic_cascade_muxed_mpic was used for boards that have the same interrupt
> number for the low and high interrupts; qe_ic_init already checks
> whether "low interrupt == high interrupt".
> 
> Signed-off-by: Zhao Qiang 
> ---
>  arch/powerpc/platforms/83xx/misc.c| 15 ---
>  arch/powerpc/platforms/85xx/corenet_generic.c |  9 -
>  arch/powerpc/platforms/85xx/mpc85xx_mds.c | 14 --
>  arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 16 
>  arch/powerpc/platforms/85xx/twr_p102x.c   | 14 --
>  drivers/irqchip/irq-qeic.c| 13 +
>  6 files changed, 13 insertions(+), 68 deletions(-)

[...]

> diff --git a/drivers/irqchip/irq-qeic.c b/drivers/irqchip/irq-qeic.c
> index 9b4660cf9267..8287c22d954a 100644
> --- a/drivers/irqchip/irq-qeic.c
> +++ b/drivers/irqchip/irq-qeic.c
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -598,4 +599,16 @@ static int __init init_qe_ic_sysfs(void)
>   return 0;
>  }
>  
> +static int __init qeic_of_init(struct device_node *node,
> +struct device_node *parent)
> +{
> + if (!node)
> + return -ENODEV;
> + qe_ic_init(node, 0, qe_ic_cascade_low_mpic,
> +qe_ic_cascade_high_mpic);
> + of_node_put(node);
> + return 0;
> +}
> +
> +IRQCHIP_DECLARE(qeic, "fsl,qe-ic", qeic_of_init);
>  subsys_initcall(init_qe_ic_sysfs);
> 

Where is that file? It doesn't seem to be in mainline. I'm afraid this
doesn't help reviewing this series...

M.
-- 
Jazz is not dead. It just smells funny...

Re: [PATCH v2] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt

On Fri, 3 Nov 2017 11:19:53 +0100
Jan Kara  wrote:

> Hi,
> 
> On Thu 02-11-17 13:06:05, Steven Rostedt wrote:
> > +   if (spin) {
> > +   /* We spin waiting for the owner to release us */
> > +   spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> > +   /* Owner will clear console_waiter on hand off */
> > +   while (!READ_ONCE(console_waiter))
> > +   cpu_relax();  
> 
> Hum, what prevents us from rescheduling here? And what if the process
> stored in console_owner is scheduled out? Both seem to be possible with
> CONFIG_PREEMPT kernel? Unless I'm missing something you will need to
> disable preemption in some places...

Yes you are missing something ;-)

> 
> Other than that I like the simplicity of your approach.
> 
>   Honza
> 
> > +
> > +   spin_release(&console_owner_dep_map, 1, _THIS_IP_);
> > +   printk_safe_exit_irqrestore(flags);

The above line re-enables interrupts. And is done for both the
console_owner and the console_waiter. These are only held with
interrupts disabled. Nothing will preempt it. In fact, if it could,
lockdep would complain (it did when I screwed it up at first ;-)


-- Steve

Re: [PATCH v2 1/3] Sony-laptop: Fix exception handling in sony_nc_setup_rfkill()

2017-11-03 Thread Andy Shevchenko

On Wed, Nov 1, 2017 at 8:46 PM, SF Markus Elfring
 wrote:
> From: Markus Elfring 
> Date: Wed, 1 Nov 2017 18:42:45 +0100
>
> Source code review for a specific software refactoring showed the need
> for another correction because the error code "-1" was returned so far
> if a call of the function "sony_call_snc_handle" failed here.
> Thus assign the return value from these two function calls also to
> the variable "err" and provide it in case of a failure.
>

Applied to my review and testing queue, thanks!

> Fixes: d6f15ed876b83a1a0eba1d0473eef58acc95444a ("sony-laptop: use soft 
> rfkill status stored in hw")
> Suggested-by: Andy Shevchenko 
> Link: https://lkml.org/lkml/2017/10/31/463
> Link: 
> https://lkml.kernel.org/r/
> Signed-off-by: Markus Elfring 
> ---
>  drivers/platform/x86/sony-laptop.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/platform/x86/sony-laptop.c 
> b/drivers/platform/x86/sony-laptop.c
> index a16cea2be9c3..4332cc982ce0 100644
> --- a/drivers/platform/x86/sony-laptop.c
> +++ b/drivers/platform/x86/sony-laptop.c
> @@ -1660,17 +1660,19 @@ static int sony_nc_setup_rfkill(struct acpi_device 
> *device,
> if (!rfk)
> return -ENOMEM;
>
> -   if (sony_call_snc_handle(sony_rfkill_handle, 0x200, &result) < 0) {
> +   err = sony_call_snc_handle(sony_rfkill_handle, 0x200, &result);
> +   if (err < 0) {
> rfkill_destroy(rfk);
> -   return -1;
> +   return err;
> }
> hwblock = !(result & 0x1);
>
> -   if (sony_call_snc_handle(sony_rfkill_handle,
> -   sony_rfkill_address[nc_type],
> -   &result) < 0) {
> +   err = sony_call_snc_handle(sony_rfkill_handle,
> +  sony_rfkill_address[nc_type],
> +  &result);
> +   if (err < 0) {
> rfkill_destroy(rfk);
> -   return -1;
> +   return err;
> }
> swblock = !(result & 0x2);
>
> --
> 2.14.3
>



-- 
With Best Regards,
Andy Shevchenko
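
The error-handling change in the patch above can be illustrated outside the driver. This is a sketch only: sketch_call_handle() is a hypothetical stand-in for sony_call_snc_handle(), and the surrounding rfkill setup is omitted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical callee: returns a negative errno on failure, 0 on success. */
static int sketch_call_handle(int fail_with, int *result)
{
	if (fail_with)
		return fail_with;
	*result = 0x3;
	return 0;
}

/* Before: every failure collapses to -1, which reads as -EPERM on Linux. */
int sketch_setup_old(int fail_with)
{
	int result;

	if (sketch_call_handle(fail_with, &result) < 0)
		return -1;
	return 0;
}

/* After: the callee's error code reaches the caller unchanged. */
int sketch_setup_new(int fail_with)
{
	int result;
	int err;

	err = sketch_call_handle(fail_with, &result);
	if (err < 0)
		return err;
	return 0;
}

/* Usage: an I/O failure is only distinguishable with the new form. */
void sketch_setup_demo(void)
{
	assert(sketch_setup_old(-EIO) == -1);
	assert(sketch_setup_new(-EIO) == -EIO);
	assert(sketch_setup_new(0) == 0);
}
```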

Re: [PATCH 1/2] perf: arm_spe: Prevent module unload while the PMU is in use

2017-11-03 Thread Mark Rutland

On Fri, Nov 03, 2017 at 11:45:17AM +, Suzuki K Poulose wrote:
> When the PMU driver is built as a module, the perf expects the
> pmu->module to be valid, so that the driver is prevented from
> being unloaded while it is in use. Fix the SPE pmu driver to
> fill in this field.
> 
> Cc: Will Deacon 
> Cc: Mark Rutland 
> Signed-off-by: Suzuki K Poulose 

Acked-by: Mark Rutland 

Mark.

> ---
>  drivers/perf/arm_spe_pmu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index 50511b13fd35..8ce262fc2561 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -889,6 +889,7 @@ static int arm_spe_pmu_perf_init(struct arm_spe_pmu *spe_pmu)
>   struct device *dev = &spe_pmu->pdev->dev;
>  
>   spe_pmu->pmu = (struct pmu) {
> + .module = THIS_MODULE,
>   .capabilities   = PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE,
>   .attr_groups= arm_spe_pmu_attr_groups,
>   /*
> -- 
> 2.13.6
>

Re: [PATCH 2/2] arm-ccn: perf: Prevent module unload while PMU is in use

2017-11-03 Thread Mark Rutland

On Fri, Nov 03, 2017 at 11:45:18AM +, Suzuki K Poulose wrote:
> When the PMU driver is built as a module, the perf expects the
> pmu->module to be valid, so that the driver is prevented from
> being unloaded while it is in use. Fix the CCN pmu driver to
> fill in this field.
> 
> Fixes: commit a33b0daab73a0 ("bus: ARM CCN PMU driver")
> Cc: Pawel Moll 
> Cc: Will Deacon 
> Cc: Mark Rutland 
> Signed-off-by: Suzuki K Poulose 

With Will's nit on the Fixes tag cleaned up:

Acked-by: Mark Rutland 

Mark.

> ---
>  drivers/bus/arm-ccn.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/bus/arm-ccn.c b/drivers/bus/arm-ccn.c
> index e8c6946fed9d..3063f5312397 100644
> --- a/drivers/bus/arm-ccn.c
> +++ b/drivers/bus/arm-ccn.c
> @@ -1276,6 +1276,7 @@ static int arm_ccn_pmu_init(struct arm_ccn *ccn)
>  
>   /* Perf driver registration */
>   ccn->dt.pmu = (struct pmu) {
> + .module = THIS_MODULE,
>   .attr_groups = arm_ccn_pmu_attr_groups,
>   .task_ctx_nr = perf_invalid_context,
>   .event_init = arm_ccn_pmu_event_init,
> -- 
> 2.13.6
>

Re: [PATCH] platform/x86: hp-wmi: Fix tablet mode detection for convertibles

2017-11-03 Thread Andy Shevchenko

On Fri, Nov 3, 2017 at 4:01 AM, Stefan Brüns
 wrote:
> Commit f9cf3b2880cc ("platform/x86: hp-wmi: Refactor dock and tablet
> state fetchers") consolidated the methods for docking and laptop mode
> detection, but omitted to apply the correct mask for the laptop mode
> (it always uses the constant for docking).
>

Looks like a good catch!

Applied to my review and testing queue.

> Signed-off-by: Stefan Brüns 
>
> ---
>
> This change is untested, but restores the previous behaviour.
> ---
>  drivers/platform/x86/hp-wmi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/platform/x86/hp-wmi.c b/drivers/platform/x86/hp-wmi.c
> index b4ed3dc983d5..b4224389febe 100644
> --- a/drivers/platform/x86/hp-wmi.c
> +++ b/drivers/platform/x86/hp-wmi.c
> @@ -297,7 +297,7 @@ static int hp_wmi_hw_state(int mask)
> if (state < 0)
> return state;
>
> -   return state & 0x1;
> +   return !!(state & mask);
>  }
>
>  static int __init hp_wmi_bios_2008_later(void)
> --
> 2.14.3
>



-- 
With Best Regards,
Andy Shevchenko
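
The effect of the one-line fix above is easy to check in isolation. In this sketch the two mask values are hypothetical and chosen for illustration only; the driver's real constants are not shown in this thread.

```c
#include <assert.h>

/* Hypothetical bit assignments, for illustration only. */
#define SKETCH_DOCK_MASK   0x01
#define SKETCH_TABLET_MASK 0x04

/* Old form: always tests bit 0, whatever mask the caller asked for. */
int sketch_hw_state_old(int state, int mask)
{
	(void)mask;
	return state & 0x1;
}

/* Fixed form: tests the requested bit and normalizes the result to 0 or 1. */
int sketch_hw_state_new(int state, int mask)
{
	return !!(state & mask);
}

/* Usage: a state word with only the tablet bit set. */
void sketch_hw_state_demo(void)
{
	int state = SKETCH_TABLET_MASK;

	assert(sketch_hw_state_old(state, SKETCH_TABLET_MASK) == 0); /* missed */
	assert(sketch_hw_state_new(state, SKETCH_TABLET_MASK) == 1);
	assert(sketch_hw_state_new(state, SKETCH_DOCK_MASK) == 0);
}
```

The double negation matters because callers expect 0 or 1, not the raw masked value.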

[PATCH] dpaa_eth: avoid uninitialized variable false-positive warning

2017-11-03 Thread Arnd Bergmann

We can now build this driver on ARM, so I ran into a randconfig build
warning that presumably had existed on powerpc already.

drivers/net/ethernet/freescale/dpaa/dpaa_eth.c: In function 'sg_fd_to_skb':
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:1712:18: error: 'skb' may be 
used uninitialized in this function [-Werror=maybe-uninitialized]

I'm slightly changing the logic here, to make it obvious to the
compiler that 'skb' is always initialized.

Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 969f6b12952e..ebc55b6a6349 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -1721,6 +1721,7 @@ static struct sk_buff *sg_fd_to_skb(const struct 
dpaa_priv *priv,
 
/* Iterate through the SGT entries and add data buffers to the skb */
sgt = vaddr + fd_off;
+   skb = NULL;
for (i = 0; i < DPAA_SGT_MAX_ENTRIES; i++) {
/* Extension bit is not supported */
WARN_ON(qm_sg_entry_is_ext(&sgt[i]));
@@ -1738,7 +1739,7 @@ static struct sk_buff *sg_fd_to_skb(const struct 
dpaa_priv *priv,
count_ptr = this_cpu_ptr(dpaa_bp->percpu_count);
dma_unmap_single(dpaa_bp->dev, sg_addr, dpaa_bp->size,
 DMA_FROM_DEVICE);
-   if (i == 0) {
+   if (!skb) {
sz = dpaa_bp->size +
SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
skb = build_skb(sg_vaddr, sz);
-- 
2.9.0
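
The reasoning behind this change can be shown with a reduced loop. When the first-iteration test is written as i == 0, the compiler has to prove that the assignment always runs before any later use; initializing the pointer and testing the pointer itself makes that relationship explicit. The sketch below uses hypothetical types and names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_buf {
	int frags;
};

/* Returns the number of fragments attached, or -1 if nothing was built. */
int sketch_build(struct sketch_buf *storage, int entries)
{
	struct sketch_buf *buf = NULL;	/* explicit init, as in the patch */
	int i;

	for (i = 0; i < entries; i++) {
		if (!buf) {
			/* First entry: create the buffer. */
			buf = storage;
			buf->frags = 0;
		} else {
			/* Later entries: attach as fragments. */
			buf->frags++;
		}
	}

	return buf ? buf->frags : -1;
}

/* Usage: three entries give one head buffer plus two fragments. */
void sketch_build_demo(void)
{
	struct sketch_buf b;

	assert(sketch_build(&b, 3) == 2);
	assert(sketch_build(&b, 1) == 0);
	assert(sketch_build(&b, 0) == -1);
}
```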

Re: [PATCH 1/3] nvme: do not check for ns on rw path

2017-11-03 Thread Christoph Hellwig

> - if (ns && ns->ms &&
> + if (ns->ms &&
>   (!ns->pi_type || ns->ms != sizeof(struct t10_pi_tuple)) &&
>   !blk_integrity_rq(req) && !blk_rq_is_passthrough(req))
>   return BLK_STS_NOTSUPP;

blk_rq_is_passthrough also can't be true here.

How about:

if (ns->ms && !blk_integrity_rq(req) &&
(!ns->pi_type || ns->ms != sizeof(struct t10_pi_tuple)))
return BLK_STS_NOTSUPP;

Although I have to admit I don't really understand what this check
is even trying to do.  It basically checks for a namespace that has
a format with metadata that is not T10 protection information and
then rejects all I/O to it.  Why are we even creating a block device
node for such a thing?

Re: KASAN: use-after-free Read in do_raw_spin_lock

2017-11-03 Thread Dmitry Vyukov

On Fri, Nov 3, 2017 at 11:59 AM, Dmitry Vyukov  wrote:
> On Fri, Nov 3, 2017 at 2:51 AM, Paul Moore  wrote:
>> On Thu, Nov 2, 2017 at 1:52 PM, syzbot
>> 
>> wrote:
>>> Hello,
>>>
>>> syzkaller hit the following crash on
>>> ebe6e90ccc6679cb01d2b280e4b61e6092d4bedb
>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>>> compiler: gcc (GCC) 7.1.1 20170620
>>> .config is attached
>>> Raw console output is attached.
>>
>> I'm not sure a real person is watching for responses on this, but just
>> in case ... are you able to reproduce this failure at all?
>
> Yes, there are real people watching, at least initially. Long term we
> are aiming at self-service mostly.
> Please refer to the referenced doc (if there is anything unclear, we
> should improve it):
> https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-reproducer-at-all
>
>
>>  I'm
>> looking over the SELinux superblock code, as well as the corresponding
>> pieces in fs/super.c, and I'm not quite sure how we could get into the
>> situation where superblock's security blob is freed before the last
>> associated inode.
>
> So far we've seen this only once. So this is either caused by a very
> subtle race (e.g. inconsistency windows on 1 instruction), or a
> previously silently corrupted heap (however, in such cases KASAN
> reports frequently obviously inconsistent, e.g. allocation stack
> refers to an unrelated object, this is not the case as far as I see).
> Since this happened only once, this does not harm fuzzer. So if you
> don't see how this could happen in the code, we can leave it aside for
> now, then either we get new similar reports, or can close this later
> as invalid.
>
> Thanks



FWIW, from the log I see that this was this program that triggered the bug:

2017/10/18 09:57:08 executing program 6:
mmap(&(0x7f00/0xf64000)=nil, 0xf64000, 0x1, 0x31,
0x, 0x0)
mmap(&(0x7f00/0xfff000)=nil, 0xfff000, 0x3, 0x32,
0x, 0x0)
r0 = accept4$ax25(0xff9c, 0x0, &(0x7f001000-0x4)=0x0, 0x800)
r1 = accept4$unix(0x, &(0x7fb8e000)=@file={0x0,
"00"},
&(0x7fec1000)=0x55, 0x8)
bind$unix(r1, &(0x7f0fc000-0x8)=@abs={0x1, 0x0, 0x1}, 0x8)
mmap(&(0x7f00/0x21)=nil, 0x21, 0x3, 0x32, r0, 0x0)
mmap(&(0x7f21/0x1000)=nil, 0x1000, 0x3, 0x32, 0x, 0x0)
r2 = accept(r0, 0x0, &(0x7f211000-0x4)=0x0)
setsockopt$inet_sctp6_SCTP_INITMSG(r2, 0x84, 0x2,
&(0x7f0f-0x8)={0x2, 0x8001, 0x400abe5, 0x1}, 0x8)
setsockopt$netrom_NETROM_IDLE(r0, 0x103, 0x7,
&(0x7f0d2000-0x4)=0x8001, 0x4)
r3 = socket(0x10, 0x2, 0xc)
mmap(&(0x7f21/0x1000)=nil, 0x1000, 0x3, 0x32, 0x, 0x0)
write(r3, 
&(0x7f7b4000-0x20)="1f011303f900d4e80788060c41ff28000280061b0f0e96fa",
0x20)
setsockopt$llc_int(r3, 0x10c, 0x80006, &(0x7ff2d000-0x4)=0x2, 0x4)
r4 = socket$inet6(0xa, 0x1001, 0x800)
getsockopt$inet_sctp_SCTP_SOCKOPT_CONNECTX3(r3, 0x84, 0x6f,
&(0x7fb41000)={0x0, 0x3, &(0x7ff7d000-0x54)=[@in6={0xa, 0x0,
0x7fff, @empty={[0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0]}, 0x1000}, @in6={0xa, 0x3, 0x3,
@local={0xfe, 0x80, [0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0], 0x0, 0xaa}, 0x0}, @in6={0xa, 0x2, 0x5, @remote={0xfe, 0x80,
[0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0], 0x0,
0xbb}, 0x81}]}, &(0x7f5fb000-0x4)=0x10)
getsockopt$inet_sctp6_SCTP_LOCAL_AUTH_CHUNKS(r3, 0x84, 0x1b,

[PATCH] s390/mm: fix pud table accounting

2017-11-03 Thread Heiko Carstens

With "mm: account pud page tables" and "mm: consolidate page table
accounting" pud page table accounting was introduced which now results
in tons of warnings like this one on s390:

BUG: non-zero pgtables_bytes on freeing mm: -16384

Reason for this are our run-time folded page tables: by default new
processes start with three page table levels where the allocated pgd
is the same as the first pud. In this case there won't ever be a pud
allocated and therefore mm_inc_nr_puds() will also never be called.

However when freeing the address space free_pud_range() will call
exactly once mm_dec_nr_puds() which leads to misaccounting.

Therefore call mm_inc_nr_puds() within init_new_context() to fix
this. This is the same as what we already do for processes that run
with two page table levels (aka compat processes).

While at it also adjust the comment, since there is no "mm->nr_pmds"
anymore.

Cc: Kirill A. Shutemov 
Cc: Michal Hocko 
Cc: Gerald Schaefer 
Cc: Martin Schwidefsky 
Signed-off-by: Heiko Carstens 
---
 arch/s390/include/asm/mmu_context.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/mmu_context.h 
b/arch/s390/include/asm/mmu_context.h
index 3c9abedc323c..4f943d58cbac 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -43,6 +43,8 @@ static inline int init_new_context(struct task_struct *tsk,
mm->context.asce_limit = STACK_TOP_MAX;
mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
   _ASCE_USER_BITS | _ASCE_TYPE_REGION3;
+   /* pgd_alloc() did not account this pud */
+   mm_inc_nr_puds(mm);
break;
case -PAGE_SIZE:
/* forked 5-level task, set new asce with new_mm->pgd */
@@ -58,7 +60,7 @@ static inline int init_new_context(struct task_struct *tsk,
/* forked 2-level compat task, set new asce with new mm->pgd */
mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
   _ASCE_USER_BITS | _ASCE_TYPE_SEGMENT;
-   /* pgd_alloc() did not increase mm->nr_pmds */
+   /* pgd_alloc() did not account this pmd */
mm_inc_nr_pmds(mm);
}
crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
-- 
2.13.5
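
The accounting imbalance described in the changelog can be reproduced with a plain counter. This is a user-space sketch with hypothetical names; the 16384-byte table size is taken from the warning quoted above.

```c
#include <assert.h>

/* Size of one pud table, matching the -16384 in the quoted warning. */
#define SKETCH_PUD_TABLE_BYTES 16384L

struct sketch_mm {
	long pgtables_bytes;
};

static void sketch_inc_puds(struct sketch_mm *mm)
{
	mm->pgtables_bytes += SKETCH_PUD_TABLE_BYTES;
}

static void sketch_dec_puds(struct sketch_mm *mm)
{
	mm->pgtables_bytes -= SKETCH_PUD_TABLE_BYTES;
}

/* Returns the counter value seen when the address space is torn down. */
long sketch_lifecycle(int account_folded_pud)
{
	struct sketch_mm mm = { 0 };

	/* The fix: account the folded pud when the context is set up. */
	if (account_folded_pud)
		sketch_inc_puds(&mm);

	/* Teardown always drops one pud, as free_pud_range() does. */
	sketch_dec_puds(&mm);

	return mm.pgtables_bytes;
}

/* Usage: without the extra increment the counter ends up negative. */
void sketch_lifecycle_demo(void)
{
	assert(sketch_lifecycle(0) == -16384L);
	assert(sketch_lifecycle(1) == 0);
}
```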

Re: [PATCH v4 3/3] KVM: MMU: consider host cache mode in MMIO page check

2017-11-03 Thread Xiao Guangrong




On 11/03/2017 05:02 PM, Haozhong Zhang wrote:

On 11/03/17 16:51 +0800, Haozhong Zhang wrote:

On 11/03/17 14:54 +0800, Xiao Guangrong wrote:



On 11/03/2017 01:53 PM, Haozhong Zhang wrote:

Some reserved pages, such as those from NVDIMM DAX devices, are
not for MMIO, and can be mapped with cached memory type for better
performance. However, the above check misconceives those pages as
MMIO.  Because KVM maps MMIO pages with UC memory type, the
performance of guest accesses to those pages would be harmed.
Therefore, we check the host memory type by lookup_memtype() in
addition and only treat UC/UC- pages as MMIO.

Signed-off-by: Haozhong Zhang 
Reported-by: Cuevas Escareno, Ivan D 
Reported-by: Kumar, Karthik 
---
   arch/x86/kvm/mmu.c | 19 ++-
   1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 0b481cc9c725..e9ed0e666a83 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2708,7 +2708,24 @@ static bool mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
   static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
   {
if (pfn_valid(pfn))
-   return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
+   return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn)) &&
+   /*
+* Some reserved pages, such as those from
+* NVDIMM DAX devices, are not for MMIO, and
+* can be mapped with cached memory type for
+* better performance. However, the above
+* check misconceives those pages as MMIO.
+* Because KVM maps MMIO pages with UC memory
+* type, the performance of guest accesses to
+* those pages would be harmed. Therefore, we
+* check the host memory type in addition and
+* only treat UC/UC- pages as MMIO.
+*
+* pat_pfn_is_uc() works only when PAT is enabled,
+* so check pat_enabled() as well.
+*/
+   (!pat_enabled() ||
+pat_pfn_is_uc(kvm_pfn_t_to_pfn_t(pfn)));


Can it be compiled if !CONFIG_PAT?


Yes.

What I check via pat_enabled() is not only whether PAT support is
compiled, but also whether PAT is enabled at runtime.



It would be better if we move pat_enabled out of kvm as well,


Surely I can combine them in one function like

bool pat_pfn_is_uc(pfn_t pfn)
{
enum page_cache_mode cm;

if (!pat_enabled())
return false;

cm = lookup_memtype(pfn_t_to_phys(pfn));

return cm == _PAGE_CACHE_MODE_UC || cm == _PAGE_CACHE_MODE_UC_MINUS;
}


In addition, I think it's better to split this function into
pat_pfn_is_uc() and pat_pfn_is_uc_minus() to avoid additional
confusion.


Why not use pat_pfn_is_uc_or_uc_minus()? :)
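
For readers following the thread, here is a minimal user-space sketch of the
check being discussed. It only models the valid-pfn branch of the patched
kvm_is_mmio_pfn(); the names host_page, cache_mode and valid_pfn_is_mmio()
are hypothetical stand-ins, not kernel APIs, and the cache mode is passed in
rather than looked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the host page cache modes. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC };

/* Hypothetical summary of what the kernel would look up for a valid pfn. */
struct host_page {
	bool zero_page;      /* models is_zero_pfn() */
	bool reserved;       /* models PageReserved() */
	enum cache_mode cm;  /* models the result of lookup_memtype() */
};

/* True only for the two uncached modes. */
static bool cache_mode_is_uc_or_uc_minus(enum cache_mode cm)
{
	return cm == CM_UC || cm == CM_UC_MINUS;
}

/*
 * Models the valid-pfn branch of the patched check: a reserved, non-zero
 * page counts as MMIO if PAT is off (the old behaviour) or if the page is
 * mapped UC/UC- on the host.
 */
static bool valid_pfn_is_mmio(const struct host_page *page, bool pat_on)
{
	assert(page != NULL);

	return !page->zero_page && page->reserved &&
	       (!pat_on || cache_mode_is_uc_or_uc_minus(page->cm));
}
```

With PAT on, a reserved write-back page (the NVDIMM DAX case) is no longer
classified as MMIO, while a reserved UC page still is.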

Re: [PATCH v3 14/17] PCI: dwc: artpec6: Add support for endpoint mode

2017-11-03 Thread Arnd Bergmann

On Fri, Nov 3, 2017 at 10:56 AM, Niklas Cassel  wrote:
> On 11/02/2017 10:13 AM, Arnd Bergmann wrote:

>>
>>
>> Can you try replacing the #ifdef with
>>
>>
>> if (!IS_ENABLED(CONFIG_PCIE_ARTPEC6_HOST))
>>  return -ENODEV;
>>
>> at the start of artpec6_pcie_enable_interrupts? I think that would improve
>> readability here.
>>
>
> artpec6_pcie_enable_interrupts is a void function, so
> I guess that you meant at the start of artpec6_add_pcie_port.

Right, sorry about that.

> That would not really help since artpec6_add_pcie_port
> calls artpec6_pcie_msi_handler, and uses artpec6_pcie_host_ops,
> which is still inside the CONFIG_PCIE_ARTPEC6_HOST ifdef block.
>
> Please note that there are several functions, as well as
> artpec6_pcie_host_ops inside the
> CONFIG_PCIE_ARTPEC6_HOST ifdef block.

What I meant is that you can remove the #ifdef entirely if you add

 if (!IS_ENABLED(CONFIG_PCIE_ARTPEC6_HOST))
  return -ENODEV;

to artpec6_pcie_probe(). Anything after that statement will get
silently dropped by the compiler, including static functions and
structures that are referenced indirectly from there.
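
A minimal user-space sketch of that pattern, assuming a compile-time
constant in place of IS_ENABLED(CONFIG_PCIE_ARTPEC6_HOST). HOST_MODE_ENABLED,
fake_device, host_only_setup() and fake_probe() are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_...): a constant 0 or 1. */
#ifndef HOST_MODE_ENABLED
#define HOST_MODE_ENABLED 0
#endif

struct fake_device {
	int host_ready;
};

/* Only reachable from fake_probe() when host mode is enabled. */
static int host_only_setup(struct fake_device *dev)
{
	dev->host_ready = 1;
	return 0;
}

static int fake_probe(struct fake_device *dev)
{
	assert(dev != NULL);

	/* A constant condition: everything below can be dropped when it is 0. */
	if (!HOST_MODE_ENABLED)
		return -ENODEV;

	return host_only_setup(dev);
}
```

Because host_only_setup() is still referenced in the source, the compiler
parses and type-checks it in every configuration, which an #ifdef block
would not give you.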

  Arnd

Re: [Patch v11 4/4] QE: remove PPCisms for QE

2017-11-03 Thread Marc Zyngier

On 01/11/17 01:36, Zhao Qiang wrote:
> QE was originally supported only on PowerPC and depended on PPC.
> Now that it is supported on other platforms, remove the PPCisms.
> 
> Signed-off-by: Zhao Qiang 
> ---
>  drivers/net/ethernet/freescale/Kconfig | 11 ++---
>  drivers/soc/fsl/qe/Kconfig |  2 +-
>  drivers/soc/fsl/qe/qe.c| 82 +-
>  drivers/soc/fsl/qe/qe_io.c | 42 -
>  drivers/soc/fsl/qe/qe_tdm.c|  8 ++--
>  drivers/soc/fsl/qe/ucc.c   | 10 ++---
>  drivers/soc/fsl/qe/ucc_fast.c  | 74 +++---
>  drivers/tty/serial/Kconfig |  2 +-
>  drivers/tty/serial/ucc_uart.c  |  1 +
>  drivers/usb/gadget/udc/Kconfig |  2 +-
>  drivers/usb/host/Kconfig   |  2 +-
>  include/soc/fsl/qe/qe.h|  1 -
>  12 files changed, 126 insertions(+), 111 deletions(-)

This patch doesn't have anything to do with irqchips or interrupts in
general. It should be split per maintenance domain, and sent to the
relevant maintainers.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...

[RFT][PATCH v2 2/2] PM / QoS: Fix device resume latency framework

2017-11-03 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

The special value of 0 for device resume latency PM QoS means
"no restriction", but there are two problems with that.

First, device resume latency PM QoS requests with 0 as the
value are always put in front of requests with positive
values in the priority lists used internally by the PM QoS
framework, causing 0 to be chosen as an effective constraint
value.  However, that 0 is then interpreted as "no restriction"
effectively overriding the other requests with specific
restrictions which is incorrect.

Second, the users of device resume latency PM QoS have no
way to specify that *any* resume latency at all should be
avoided, which is an artificial limitation in general.

To address these issues, modify device resume latency PM QoS to
use S32_MAX as the "no constraint" value and 0 as the "no
latency at all" one and rework its users (the cpuidle menu
governor, the genpd QoS governor and the runtime PM framework)
to follow these changes.

Also add a special "n/a" value to the corresponding user space I/F
to allow user space to indicate that it cannot accept any resume
latencies at all for the given device.
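
The user space mapping described above can be sketched as follows. This is
a user-space approximation, not the sysfs code itself: latency_show() and
latency_store() are hypothetical names, NO_CONSTRAINT stands in for
PM_QOS_RESUME_LATENCY_NO_CONSTRAINT, and strtol() is only a rough substitute
for kstrtos32().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NO_CONSTRAINT INT32_MAX	/* models the S32_MAX "no constraint" value */

/* Format an internal value the way the attribute would show it. */
static int latency_show(char *buf, size_t len, int32_t value)
{
	assert(buf != NULL);

	if (value == 0)
		return snprintf(buf, len, "n/a\n");
	if (value == NO_CONSTRAINT)
		value = 0;	/* user space still sees 0 as "no restriction" */

	return snprintf(buf, len, "%d\n", (int)value);
}

/* Parse user input into an internal value; 0 on success, -EINVAL otherwise. */
static int latency_store(const char *buf, int32_t *out)
{
	char *end;
	long v;

	assert(buf != NULL && out != NULL);

	if (!strcmp(buf, "n/a") || !strcmp(buf, "n/a\n")) {
		*out = 0;	/* internally, 0 now means "no latency at all" */
		return 0;
	}

	errno = 0;
	v = strtol(buf, &end, 0);
	if (errno != 0 || end == buf)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* Negative values and the raw "no constraint" value are rejected. */
	if (v < 0 || v >= NO_CONSTRAINT)
		return -EINVAL;

	*out = (v == 0) ? NO_CONSTRAINT : (int32_t)v;
	return 0;
}
```

Writing "0" and reading it back yields "0" again, and writing "n/a" reads
back as "n/a", so existing user space that relies on 0 keeps working.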

Fixes: 85dc0b8a4019 (PM / QoS: Make it possible to expose PM QoS latency constraints)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323
Reported-by: Reinette Chatre 
Signed-off-by: Rafael J. Wysocki 
---
 Documentation/ABI/testing/sysfs-devices-power |4 +++-
 drivers/base/cpu.c|3 ++-
 drivers/base/power/domain.c   |2 +-
 drivers/base/power/domain_governor.c  |   26 ++
 drivers/base/power/qos.c  |5 -
 drivers/base/power/runtime.c  |2 +-
 drivers/base/power/sysfs.c|   25 +
 drivers/cpuidle/governors/menu.c  |4 ++--
 include/linux/pm_qos.h|   26 ++
 9 files changed, 62 insertions(+), 35 deletions(-)

Index: linux-pm/drivers/base/power/sysfs.c
===
--- linux-pm.orig/drivers/base/power/sysfs.c
+++ linux-pm/drivers/base/power/sysfs.c
@@ -218,7 +218,14 @@ static ssize_t pm_qos_resume_latency_sho
  struct device_attribute *attr,
  char *buf)
 {
-   return sprintf(buf, "%d\n", dev_pm_qos_requested_resume_latency(dev));
+   s32 value = dev_pm_qos_requested_resume_latency(dev);
+
+   if (value == 0)
+   return sprintf(buf, "n/a\n");
+   else if (value == PM_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+   value = 0;
+
+   return sprintf(buf, "%d\n", value);
 }
 
 static ssize_t pm_qos_resume_latency_store(struct device *dev,
@@ -228,11 +235,21 @@ static ssize_t pm_qos_resume_latency_sto
s32 value;
int ret;
 
-   if (kstrtos32(buf, 0, &value))
-   return -EINVAL;
+   if (!kstrtos32(buf, 0, &value)) {
+   /*
+* Prevent users from writing negative or "no constraint" values
+* directly.
+*/
+   if (value < 0 || value == PM_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+   return -EINVAL;
 
-   if (value < 0)
+   if (value == 0)
+   value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT;
+   } else if (!strcmp(buf, "n/a") || !strcmp(buf, "n/a\n")) {
+   value = 0;
+   } else {
return -EINVAL;
+   }
 
ret = dev_pm_qos_update_request(dev->power.qos->resume_latency_req,
value);
Index: linux-pm/include/linux/pm_qos.h
===
--- linux-pm.orig/include/linux/pm_qos.h
+++ linux-pm/include/linux/pm_qos.h
@@ -28,16 +28,19 @@ enum pm_qos_flags_status {
PM_QOS_FLAGS_ALL,
 };
 
-#define PM_QOS_DEFAULT_VALUE -1
+#define PM_QOS_DEFAULT_VALUE   (-1)
+#define PM_QOS_LATENCY_ANY S32_MAX
+#define PM_QOS_LATENCY_ANY_NS  ((s64)PM_QOS_LATENCY_ANY * NSEC_PER_USEC)
 
 #define PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE   (2000 * USEC_PER_SEC)
 #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE   (2000 * USEC_PER_SEC)
 #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE0
 #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE  0
-#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE0
+#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUEPM_QOS_LATENCY_ANY
+#define PM_QOS_RESUME_LATENCY_NO_CONSTRAINTPM_QOS_LATENCY_ANY
+#define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT_NS PM_QOS_LATENCY_ANY_NS
 #define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0
 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
-#define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
 
 #define PM_QOS_FLAG_NO_POWER_OFF   (1 << 0)
 
@@ -174,7 +177,8 @@ static inline s32

[RFT][PATCH v2 0/2] PM / QoS: Device resume latency framework fix

2017-11-03 Thread Rafael J. Wysocki

On Thursday, November 2, 2017 12:00:27 AM CET Rafael J. Wysocki wrote:
> Hi,
> 
> This series is a replacement for commit 0cc2b4e5a020 (PM / QoS: Fix device
> resume latency PM QoS) that had to be reverted due to problems introduced by 
> it.
> 
> This time the genpd PM QoS governor is first updated to be more consistent
> and the PM QoS changes are made on top of that which simplifies the second
> patch quite a bit.
> 
> This is based on the linux-next branch from linux-pm.git as of now (should
> also apply to the current mainline just fine).
> 
> Please test if you can or let me know if you have any comments.

The v2 removes a couple of redundant checks from the first patch (and adds
comments to explain why the checks are not needed) and fixes up the
"no constraint" value collision with a valid constraint multiplied by
NSEC_PER_USEC in the second patch.

Please test if possible and let me know about any issues.

Thanks,
Rafael

[RFT][PATCH v2 1/2] PM / domains: Rework governor code to be more consistent

2017-11-03 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

The genpd governor currently uses negative PM QoS values to indicate
the "no suspend" condition and 0 as "no restriction", but it doesn't
use them consistently.  Moreover, it tries to refresh QoS values for
already suspended devices in a quite questionable way.

For the above reasons, rework it to be a bit more consistent.

First off, note that dev_pm_qos_read_value() in
dev_update_qos_constraint() and __default_power_down_ok() is
evaluated for devices in suspend.  Moreover, that only happens if the
effective_constraint_ns value for them is negative (meaning "no
suspend").  It is not evaluated in any other cases, so effectively
the QoS values are only updated for devices in suspend that should
not have been suspended in the first place.  In all of the other
cases, the QoS values taken into account are the effective ones from
the time before the device has been suspended, so generally devices
need to be resumed and suspended again for new QoS values to take
effect anyway.  Thus evaluating dev_update_qos_constraint() in
those two places doesn't make sense at all, so drop it.

Second, initialize effective_constraint_ns to 0 ("no constraint")
rather than to (-1) ("no suspend"), which makes more sense in
general.  Also, if effective_constraint_ns is never updated (the
device is in suspend all the time or it is never suspended), the
initial value then doesn't affect the device's parent and so on.

Finally, rework default_suspend_ok() to explicitly handle the
"no restriction" special case.

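As an illustration of the reworked aggregation, here is a minimal user-space
sketch of how child constraints could be folded into one value when 0 means
"no constraint". The function name and the array-based interface are
hypothetical; the kernel code walks child devices instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fold the children's effective constraints (in ns) into "start".
 * A value of 0 means "no constraint" and never tightens the result;
 * otherwise the smallest positive value wins.
 */
static int64_t min_child_constraint_ns(const int64_t *child_ns, size_t n,
				       int64_t start)
{
	int64_t result = start;

	assert(child_ns != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int64_t c = child_ns[i];

		if (c == 0)
			continue;	/* child imposes no constraint */
		if (c < result || result == 0)
			result = c;
	}

	return result;
}
```

A result of 0 after the walk therefore means that neither the device nor
any of its children imposed a restriction.
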
Signed-off-by: Rafael J. Wysocki 
---
 drivers/base/power/domain.c  |2 -
 drivers/base/power/domain_governor.c |   61 +--
 2 files changed, 38 insertions(+), 25 deletions(-)

Index: linux-pm/drivers/base/power/domain.c
===
--- linux-pm.orig/drivers/base/power/domain.c
+++ linux-pm/drivers/base/power/domain.c
@@ -1331,7 +1331,7 @@ static struct generic_pm_domain_data *ge
 
gpd_data->base.dev = dev;
gpd_data->td.constraint_changed = true;
-   gpd_data->td.effective_constraint_ns = -1;
+   gpd_data->td.effective_constraint_ns = 0;
gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier;
 
spin_lock_irq(&dev->power.lock);
Index: linux-pm/drivers/base/power/domain_governor.c
===
--- linux-pm.orig/drivers/base/power/domain_governor.c
+++ linux-pm/drivers/base/power/domain_governor.c
@@ -14,22 +14,22 @@
 static int dev_update_qos_constraint(struct device *dev, void *data)
 {
s64 *constraint_ns_p = data;
-   s32 constraint_ns = -1;
-
-   if (dev->power.subsys_data && dev->power.subsys_data->domain_data)
-   constraint_ns = dev_gpd_data(dev)->td.effective_constraint_ns;
+   s64 constraint_ns;
 
-   if (constraint_ns < 0) {
-   constraint_ns = dev_pm_qos_read_value(dev);
-   constraint_ns *= NSEC_PER_USEC;
-   }
-   if (constraint_ns == 0)
+   if (!dev->power.subsys_data || !dev->power.subsys_data->domain_data)
return 0;
 
/*
-* constraint_ns cannot be negative here, because the device has been
-* suspended.
+* Only take suspend-time QoS constraints of devices into account,
+* because constraints updated after the device has been suspended are
+* not guaranteed to be taken into account anyway.  In order for them
+* to take effect, the device has to be resumed and suspended again.
 */
+   constraint_ns = dev_gpd_data(dev)->td.effective_constraint_ns;
+   /* 0 means "no constraint" */
+   if (constraint_ns == 0)
+   return 0;
+
if (constraint_ns < *constraint_ns_p || *constraint_ns_p == 0)
*constraint_ns_p = constraint_ns;
 
@@ -76,14 +76,29 @@ static bool default_suspend_ok(struct de
device_for_each_child(dev, &constraint_ns,
  dev_update_qos_constraint);
 
-   if (constraint_ns > 0) {
+   if (constraint_ns == 0) {
+   /* "No restriction", so the device is allowed to suspend. */
+   td->effective_constraint_ns = 0;
+   td->cached_suspend_ok = true;
+   } else {
+   /*
+* constraint_ns must be positive here, because the children
+* walked above are all suspended, so effective_constraint_ns
+* cannot be negative for them.
+*/
constraint_ns -= td->suspend_latency_ns +
td->resume_latency_ns;
-   if (constraint_ns == 0)
+   /*
+* effective_constraint_ns is negative already and
+* cached_suspend_ok is false, so if the computed value is not
+* positive, return right away.
+*/
+

Re: [PATCH] x86/mce/AMD: Fix mce_severity_amd_smca() signature

2017-11-03 Thread Borislav Petkov

On Wed, Nov 01, 2017 at 02:04:12PM -0500, Yazen Ghannam wrote:
> From: Yazen Ghannam 
> 
> Change the err_ctx type to "enum context" to match the type passed in.
> 
> Suggested-by: Borislav Petkov 
> Signed-off-by: Yazen Ghannam 
> ---
>  arch/x86/kernel/cpu/mcheck/mce-severity.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c 
> b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> index f5518706baa6..267311a7fc60 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
> @@ -204,7 +204,7 @@ static int error_context(struct mce *m)
>   return IN_KERNEL;
>  }
>  
> -static int mce_severity_amd_smca(struct mce *m, int err_ctx)
> +static int mce_severity_amd_smca(struct mce *m, enum context err_ctx)
>  {
>   u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
>   u32 low, high;
> -- 

Applied, thanks.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Re: [PATCH] autofs: don't fail mount for transient error

2017-11-03 Thread Ian Kent

On 03/11/17 09:40, NeilBrown wrote:
> 

Hi Neil, and thanks for taking the time to post the patch.

> Currently if the autofs kernel module gets an error when
> writing to the pipe which links to the daemon, then it
> marks the whole mountpoint as catatonic, and it will stop working.
> 
> It is possible that the error is transient.  This can happen
> if the daemon is slow and more than 16 requests queue up.
> If a subsequent process tries to queue a request, and is then signalled,
> the write to the pipe will return -ERESTARTSYS and autofs
> will take that as total failure.

Indeed it does.

And given the problems with a half dozen (or so) user space
applications consuming large amounts of CPU under heavy mount
and umount activity this could happen more easily than we
expect.

> 
> So change the code to assess -ERESTARTSYS and -ENOMEM as transient
> failures which only abort the current request, not the whole
> mountpoint.

This looks good to me.
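
A minimal user-space sketch of the classification the patch introduces,
for reference. The enum and function names are hypothetical; ERESTARTSYS is
a kernel-internal code (512) that user-space headers do not export, so it is
defined locally here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ERESTARTSYS
#define ERESTARTSYS 512	/* kernel-internal value, defined here for the sketch */
#endif

enum write_outcome { WRITE_OK, FAIL_THIS_REQUEST, GO_CATATONIC };

/*
 * Models the new return expression of the write helper: 0 if everything
 * was written, the negative error from the last write if there was one,
 * and -EIO for the "wrote nothing but no error" case.
 */
static int write_result(size_t bytes_left, long last_wr)
{
	return bytes_left == 0 ? 0 : last_wr < 0 ? (int)last_wr : -EIO;
}

/* Models the switch in the notify path. */
static enum write_outcome classify(int ret)
{
	assert(ret <= 0);	/* 0 or a negative error code */

	switch (ret) {
	case 0:
		return WRITE_OK;
	case -ENOMEM:
	case -ERESTARTSYS:
		return FAIL_THIS_REQUEST;	/* transient: only this wait fails */
	default:
		return GO_CATATONIC;		/* anything else is still fatal */
	}
}
```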

> 
> Signed-off-by: NeilBrown 
> ---
> 
> Do people think this should go to -stable?
> It isn't a crash or a data corruption, but having autofs mountpoints
> suddenly stop working is rather inconvenient.

Perhaps that's a good idea given the CPU usage problem I refer
to above has been around for a while now.

> 
> Thanks,
> NeilBrown
> 
> 
>  fs/autofs4/waitq.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> index 4ac49d038bf3..8fc41705c7cd 100644
> --- a/fs/autofs4/waitq.c
> +++ b/fs/autofs4/waitq.c
> @@ -81,7 +81,8 @@ static int autofs4_write(struct autofs_sb_info *sbi,
>   spin_unlock_irqrestore(&current->sighand->siglock, flags);
>   }
>  
> - return (bytes > 0);
> + /* if 'wr' returned 0 (impossible) we assume -EIO (safe) */
> + return bytes == 0 ? 0 : wr < 0 ? wr : -EIO;
>  }
>  
>  static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
> @@ -95,6 +96,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info 
> *sbi,
>   } pkt;
>   struct file *pipe = NULL;
>   size_t pktsz;
> + int ret;
>  
>   pr_debug("wait id = 0x%08lx, name = %.*s, type=%d\n",
>(unsigned long) wq->wait_queue_token,
> @@ -169,7 +171,18 @@ static void autofs4_notify_daemon(struct autofs_sb_info 
> *sbi,
>   mutex_unlock(&sbi->wq_mutex);
>  
>   if (autofs4_write(sbi, pipe, &pkt, pktsz))
> + switch (ret = autofs4_write(sbi, pipe, &pkt, pktsz)) {
> + case 0:
> + break;
> + case -ENOMEM:
> + case -ERESTARTSYS:
> + /* Just fail this one */
> + autofs4_wait_release(sbi, wq->wait_queue_token, ret);
> + break;
> + default:
>   autofs4_catatonic_mode(sbi);
> + break;
> + }
>   fput(pipe);
>  }
>  
>

Re: [PATCH] Support resetting WARN_ONCE for all architectures

2017-11-03 Thread Michael Ellerman

Hi Andi,

Thanks for making it work with the flag, but ...

Andi Kleen  writes:
> diff --git a/lib/bug.c b/lib/bug.c
> index a6a1137d06db..7cb2d41845f7 100644
> --- a/lib/bug.c
> +++ b/lib/bug.c
> @@ -195,3 +195,24 @@ enum bug_trap_type report_bug(unsigned long bugaddr, 
> struct pt_regs *regs)
>  
>   return BUG_TRAP_TYPE_BUG;
>  }
> +
> +static void clear_once_table(struct bug_entry *start, struct bug_entry *end)
> +{
> + struct bug_entry *bug;
> +
> + for (bug = start; bug < end; bug++)
> + bug->flags &= ~BUGFLAG_ONCE;

Clearing BUGFLAG_ONCE removes the once-ness permanently. ie. it becomes
a WARN().

You should be clearing BUGFLAG_DONE, which is the flag that says this
WARN has already triggered.
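
To illustrate the difference between the two flags, here is a minimal
user-space sketch. The flag values and the bug_entry_model structure are
illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values; the real ones are an assumption here. */
#define FLAG_ONCE (1u << 1)	/* "this is a once-only warning" */
#define FLAG_DONE (1u << 2)	/* "this once-only warning already fired" */

struct bug_entry_model {
	unsigned int flags;
};

/* Returns true if the warning should be reported for this hit. */
static bool warn_hit(struct bug_entry_model *bug)
{
	assert(bug != NULL);

	if (bug->flags & FLAG_ONCE) {
		if (bug->flags & FLAG_DONE)
			return false;
		bug->flags |= FLAG_DONE;
	}
	return true;
}

/* Re-arm all once-only warnings without changing their once-ness. */
static void reset_once(struct bug_entry_model *table, size_t n)
{
	assert(table != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		table[i].flags &= ~FLAG_DONE;
}
```

After reset_once() the entry fires exactly one more time; clearing
FLAG_ONCE instead would make it fire on every hit from then on.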

cheers

[GIT PULL] Please pull powerpc/linux.git powerpc-4.14-6 tag

2017-11-03 Thread Michael Ellerman

Hi Linus,

Please pull some more powerpc fixes for 4.14.

This is bigger than I like to send at rc7, but that's at least partly
because I didn't send any fixes last week. If it wasn't for the IMC
driver, which is new and getting heavy testing, the diffstat would look
a bit better. I've also added ftrace on big endian to my test suite, so
we shouldn't break that again in future.

cheers


The following changes since commit 0d8ba16278ec30a262d931875018abee332f926f:

  powerpc/perf: Fix IMC initialization crash (2017-10-13 20:08:40 +1100)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-4.14-6

for you to fetch changes up to 7ecb37f62fe58e3e4d9b03443b92d213b2c108ce:

  powerpc/perf: Fix core-imc hotplug callback failure during imc initialization (2017-11-03 09:38:05 +1100)


powerpc fixes for 4.14 #6

A fix to the handling of misaligned paste instructions (P9 only), where a change
to a #define has caused the check for the instruction to always fail.

The preempt handling was unbalanced in the radix THP flush (P9 only). Though we
don't generally use preempt we want to keep it working as much as possible.

Two fixes for IMC (P9 only), one when booting with restricted number of CPUs and
one in the error handling when initialisation fails due to firmware etc.

A revert to fix function_graph on big endian machines, and then a rework of the
reverted patch to fix kprobes blacklist handling on big endian machines.

Thanks to:
  Anju T Sudhakar, Guilherme G. Piccoli, Madhavan Srinivasan, Naveen N. Rao,
  Nicholas Piggin, Paul Mackerras.


Guilherme G. Piccoli (1):
  powerpc/perf: Fix IMC allocation routine

Madhavan Srinivasan (1):
  powerpc/perf: Fix core-imc hotplug callback failure during imc initialization

Naveen N. Rao (2):
  Revert "powerpc64/elfv1: Only dereference function descriptor for non-text symbols"
  powerpc/kprobes: Dereference function pointers only if the address does not belong to kernel text

Nicholas Piggin (1):
  powerpc/64s/radix: Fix preempt imbalance in TLB flush

Paul Mackerras (1):
  powerpc: Fix check for copy/paste instructions in alignment handler

 arch/powerpc/include/asm/code-patching.h | 10 +-
 arch/powerpc/kernel/align.c  |  2 +-
 arch/powerpc/kernel/kprobes.c|  7 ++-
 arch/powerpc/mm/tlb-radix.c  |  2 ++
 arch/powerpc/perf/imc-pmu.c  | 18 --
 5 files changed, 26 insertions(+), 13 deletions(-)



Re: x86/module: Detect corrupt relocations against nonzero data

2017-11-03 Thread Jessica Yu


+++ Josh Poimboeuf [02/11/17 21:19 -0500]:

On Thu, Nov 02, 2017 at 04:57:11PM -0500, Josh Poimboeuf wrote:

There have been some cases where external tooling (e.g., kpatch-build)
creates a corrupt relocation which targets the wrong address.  This is a
silent failure which can corrupt memory in unexpected places.

On x86, the bytes of data being overwritten by relocations are always
initialized to zero beforehand.  Use that knowledge to add sanity checks
to detect such cases before they corrupt memory.

Signed-off-by: Josh Poimboeuf 
---
 arch/x86/kernel/module.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 62e7d70aadd5..a69b12617820 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -172,19 +172,27 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
case R_X86_64_NONE:
break;
case R_X86_64_64:
+   if (*(u64 *)loc != 0)
+   goto nonzero;
*(u64 *)loc = val;
break;
case R_X86_64_32:
+   if (*(u32 *)loc != 0)
+   goto nonzero;
*(u32 *)loc = val;
if (val != *(u32 *)loc)
goto overflow;
break;
case R_X86_64_32S:
+   if (*(s32 *)loc != 0)
+   goto nonzero;
*(s32 *)loc = val;
if ((s64)val != *(s32 *)loc)
goto overflow;
break;
case R_X86_64_PC32:
+   if (*(u64 *)loc != 0)
+   goto nonzero;
val -= (u64)loc;
*(u32 *)loc = val;


NACK - this last bit is obviously a bug, not sure how it passed my
testing without module load failures...
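
For illustration, a minimal user-space sketch of the sanity check with the
width of the check matched to the width of the write. The bug referred to
above is presumably the 64-bit-wide test guarding a 32-bit store in the
PC32 case; that reading, the helper names and the -EINVAL return value are
assumptions made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the first "width" bytes at loc are all zero. */
static int slot_is_zero(const unsigned char *loc, size_t width)
{
	assert(loc != NULL);

	for (size_t i = 0; i < width; i++)
		if (loc[i] != 0)
			return 0;
	return 1;
}

/*
 * Apply a 32-bit relocation value, refusing to overwrite non-zero data.
 * Only the four bytes that will be written are checked, so legitimate
 * non-zero bytes that follow the slot do not cause a false failure.
 */
static int apply_rel32(unsigned char *loc, uint32_t val)
{
	if (!slot_is_zero(loc, sizeof(val)))
		return -EINVAL;	/* arbitrary error code for the sketch */

	memcpy(loc, &val, sizeof(val));
	return 0;
}
```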


Thanks for the patch Josh, btw - could you also CC the x86 folks when
you send out v2? 


Thanks!

Jessica

Re: [Part2 PATCH v7 14/38] crypto: ccp: Implement SEV_FACTORY_RESET ioctl command

2017-11-03 Thread Borislav Petkov

On Wed, Nov 01, 2017 at 04:15:59PM -0500, Brijesh Singh wrote:
> The SEV_FACTORY_RESET command can be used by the platform owner to
> reset the non-volatile SEV related data. The command is defined in
> SEV spec section 5.4
> 
> Cc: Paolo Bonzini 
> Cc: "Radim Krčmář" 
> Cc: Borislav Petkov 
> Cc: Herbert Xu 
> Cc: Gary Hook 
> Cc: Tom Lendacky 
> Cc: linux-cry...@vger.kernel.org
> Cc: k...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Improvements-by: Borislav Petkov 
> Signed-off-by: Brijesh Singh 
> Acked-by: Gary R Hook 

Acked-by: needs to go away too if you send a new revision with
non-trivial changes. Same as with Reviewed-by: tags.

> ---
>  drivers/crypto/ccp/psp-dev.c | 70 +++-
>  1 file changed, 69 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/crypto/ccp/psp-dev.c b/drivers/crypto/ccp/psp-dev.c
> index c61ca16096ca..a757bd1c34e8 100644
> --- a/drivers/crypto/ccp/psp-dev.c
> +++ b/drivers/crypto/ccp/psp-dev.c
> @@ -235,9 +235,77 @@ static int sev_platform_shutdown(int *error)
>   return rc;
>  }
>  
> +static int sev_platform_state(int *state, int *error)

It needs a verb in the name: sev_get_platform_state()

> +{
> + int rc;
> +
> + rc = __sev_do_cmd_locked(SEV_CMD_PLATFORM_STATUS,
> +  psp_master->sev_status, error);
> + if (rc)
> + return rc;
> +
> + *state = psp_master->sev_status->state;
> + return rc;
> +}
> +
> +static int sev_ioctl_do_reset(struct sev_issue_cmd *argp)
> +{
> + int state, rc;
> +
> + rc = sev_platform_state(&state, &argp->error);
> + if (rc)
> + return rc;
> +
> + if (state == SEV_STATE_WORKING) {

You could write a short blurb somewhere in a comment around here, what
the logic now is going to be for the SEV device: If in working state,
reset is denied. All other states accept it, because... .

> + argp->error = SEV_RET_INVALID_PLATFORM_STATE;

If you're going to write enum psp_ret_code types into argp->error, then
it needs to be of enum psp_ret_code type and not an int.

> + return -EBUSY;
> + }
> +
> + if (state == SEV_STATE_INIT) {
> + rc = __sev_platform_shutdown_locked(&argp->error);
> + if (rc)
> + return rc;
> + }
> +
> + return __sev_do_cmd_locked(SEV_CMD_FACTORY_RESET, 0, &argp->error);
> +}
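
The control flow of the quoted reset handler can be sketched in user space
as below. This only models the ordering of the checks; the structure, the
counters and the assumption that a shutdown leaves the platform in an
uninitialized state are hypothetical and not taken from the SEV spec.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum sev_state_model { STATE_UNINIT, STATE_INIT, STATE_WORKING };

struct platform_model {
	enum sev_state_model state;
	int shutdowns;	/* number of shutdown commands issued */
	int resets;	/* number of factory reset commands issued */
};

/* Models the handler: deny in WORKING, shut down first in INIT. */
static int do_factory_reset(struct platform_model *p)
{
	assert(p != NULL);

	if (p->state == STATE_WORKING)
		return -EBUSY;	/* reset is denied while in working state */

	if (p->state == STATE_INIT) {
		p->shutdowns++;
		p->state = STATE_UNINIT;	/* assumed effect of the shutdown */
	}

	p->resets++;
	return 0;
}
```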

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Re: [PATCH] pinctrl: armada-37xx: remove unused variable

2017-11-03 Thread Arnd Bergmann

On Fri, Nov 3, 2017 at 8:46 AM, Linus Walleij  wrote:
> On Thu, Nov 2, 2017 at 3:29 PM, Arnd Bergmann  wrote:
>
>> A cleanup left behind a temporary variable that is now unused:
>>
>> drivers/pinctrl/mvebu/pinctrl-armada-37xx.c: In function 
>> 'armada_37xx_irq_startup':
>> drivers/pinctrl/mvebu/pinctrl-armada-37xx.c:693:20: error: unused variable 
>> 'chip' [-Werror=unused-variable]
>>
>> This removes the declarations as well.
>>
>> Fixes: 3ee9e605caea ("pinctrl: armada-37xx: Stop using struct 
>> gpio_chip.irq_base")
>> Signed-off-by: Arnd Bergmann 
>
> It is used on the head of development, so it's fixed in -next.
>
> Is it such a big issue for v4.14 that you think I should send it
> to Torvalds as a fix at this point or can I just leave it?

I'm confused. The build warning showed up in linux-next yesterday after
3ee9e605caea got merged, which removes the user of that variable.
In v4.14 it is still used.

 Arnd

[PATCH] staging/media/davinci_vpfe: Use common error handling code in vpfe_attach_irq()

2017-11-03 Thread SF Markus Elfring

From: Markus Elfring 
Date: Fri, 3 Nov 2017 10:45:31 +0100

Add a jump target so that a bit of exception handling can be better reused
at the end of this function.
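
The resulting shape of the function can be sketched in user space as
follows. The irq_model structure and the request/release helpers are
hypothetical stand-ins for request_irq() and free_irq(); fail_at selects
which request is made to fail.

```c
#include <assert.h>
#include <stddef.h>

struct irq_model {
	int held[3];	/* 1 while the corresponding line is requested */
};

/* Pretend to request line idx; fails when idx == fail_at. */
static int request_line(struct irq_model *m, int idx, int fail_at)
{
	if (idx == fail_at)
		return -1;
	m->held[idx] = 1;
	return 0;
}

static void release_line(struct irq_model *m, int idx)
{
	m->held[idx] = 0;
}

/* Same structure as the patched function: one shared cleanup label. */
static int attach_all(struct irq_model *m, int fail_at)
{
	int ret;

	assert(m != NULL);

	ret = request_line(m, 0, fail_at);
	if (ret < 0)
		return ret;

	ret = request_line(m, 1, fail_at);
	if (ret < 0)
		goto free_first;

	ret = request_line(m, 2, fail_at);
	if (ret < 0) {
		release_line(m, 1);
		goto free_first;
	}

	return 0;

free_first:
	release_line(m, 0);
	return ret;
}
```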

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/staging/media/davinci_vpfe/vpfe_mc_capture.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/media/davinci_vpfe/vpfe_mc_capture.c 
b/drivers/staging/media/davinci_vpfe/vpfe_mc_capture.c
index bffe2153b910..80297d2df31d 100644
--- a/drivers/staging/media/davinci_vpfe/vpfe_mc_capture.c
+++ b/drivers/staging/media/davinci_vpfe/vpfe_mc_capture.c
@@ -309,8 +309,7 @@ static int vpfe_attach_irq(struct vpfe_device *vpfe_dev)
if (ret < 0) {
v4l2_err(&vpfe_dev->v4l2_dev,
"Error: requesting VINT1 interrupt\n");
-   free_irq(vpfe_dev->ccdc_irq0, vpfe_dev);
-   return ret;
+   goto free_irq;
}
 
ret = request_irq(vpfe_dev->imp_dma_irq, vpfe_imp_dma_isr,
@@ -319,11 +318,14 @@ static int vpfe_attach_irq(struct vpfe_device *vpfe_dev)
v4l2_err(&vpfe_dev->v4l2_dev,
 "Error: requesting IMP IRQ interrupt\n");
free_irq(vpfe_dev->ccdc_irq1, vpfe_dev);
-   free_irq(vpfe_dev->ccdc_irq0, vpfe_dev);
-   return ret;
+   goto free_irq;
}
 
return 0;
+
+free_irq:
+   free_irq(vpfe_dev->ccdc_irq0, vpfe_dev);
+   return ret;
 }
 
 /*
-- 
2.15.0

Re: [PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-03 Thread Paolo Bonzini

On 02/11/2017 19:43, Eduardo Valentin wrote:
> On Thu, Nov 02, 2017 at 07:24:16PM +0100, Paolo Bonzini wrote:
>> On 02/11/2017 19:08, Eduardo Valentin wrote:
>>> On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
>>>> On 02/11/2017 18:45, Eduardo Valentin wrote:
>>>>> Currently, the existing qspinlock implementation will fallback to
>>>>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
>>>>>
>>>>> This patch gives the opportunity to guest kernels to select
>>>>> between test-and-set and the regular queue fair lock implementation
>>>>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
>>>>> flag is not set, the code will still fall back to test-and-set,
>>>>> but when the PV_DEDICATED flag is set, the code will use
>>>>> the regular queue spinlock implementation.
>>>>
>>>> Have you seen Waiman's series that lets you specify this on the guest
>>>> command line instead?  Would this be acceptable for your use case?
>>>
>>> No, can you please share a link to it? is it already merged to tip/master?
>>
>> [PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type
>> https://lkml.org/lkml/2017/11/1/655
>>
>>>> (In other words, is there a difference for you between making the host
>>>> vs. guest administrator toggle the feature?  "@amazon.com" means you are
>>>> the host admin, how would you use it?)
>>>
>>> The way I think of this is this is a flag set by host side so the
>>> guest adapts accordingly.
>>>
>>> If the admin in guest side wants to ignore what the host is
>>> flagging, that is a different story.
>>
>> Okay, this makes sense.  But perhaps it should be a separate CPUID leaf,
>> such as "configuration hints", rather than properly a feature.
> 
> Oh OK, you don't think this starts to deviate from the feature concept.
> But would the PV_UNHALT also go to "configuration hints" bucket?

PV_UNHALT says whether the pvqspinlock API is available, PV_DEDICATED
says whether it should be used.

> Another way to see this is we have three locking feature options to select 
> from,
> so we need at least two bits here.

PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
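
Expressed as code, the table above could look like this minimal sketch. The
pv_hints structure, the lock_kind enum and default_lock() are hypothetical
names for illustration, not existing kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lock_kind { LOCK_QSPINLOCK, LOCK_PVQSPINLOCK, LOCK_TAS };

/* Hypothetical view of the two hypervisor-provided flags. */
struct pv_hints {
	bool pv_dedicated;
	bool pv_unhalt;
};

/* Pick the default lock implementation from the two flags. */
static enum lock_kind default_lock(const struct pv_hints *h)
{
	assert(h != NULL);

	if (h->pv_dedicated)
		return LOCK_QSPINLOCK;	/* PV_UNHALT does not matter here */

	return h->pv_unhalt ? LOCK_PVQSPINLOCK : LOCK_TAS;
}
```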

What do you think?

Paolo

[PATCH 7/9] rcutorture/kvm-build.sh: Skip build directory check

2017-11-03 Thread SeongJae Park

The check for existence and write permission of the build directory is
done in both 'kvm-test-1-run.sh' and 'kvm-build.sh'.  Because 'kvm-build.sh'
depends on 'kvm-test-1-run.sh' (it uses variables defined by its caller),
the check there is an unnecessary duplicate.  This commit therefore
removes the duplicated check from 'kvm-build.sh'.

Signed-off-by: SeongJae Park 
---
 tools/testing/selftests/rcutorture/bin/kvm-build.sh | 5 -
 1 file changed, 5 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/kvm-build.sh 
b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
index 46752c164676..df1c9bb40337 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-build.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
@@ -29,11 +29,6 @@ then
exit 1
 fi
 builddir=${2}
-if test -z "$builddir" -o ! -d "$builddir" -o ! -w "$builddir"
-then
-   echo "kvm-build.sh :$builddir: Not a writable directory, cannot build 
into it"
-   exit 1
-fi
 
 T=/tmp/test-linux.sh.$$
 trap 'rm -rf $T' 0
-- 
2.13.0

Re: [PATCH v10 02/13] x86/insn-eval: Compute linear address in several utility functions

2017-11-03 Thread Ingo Molnar


* Ricardo Neri  wrote:

> On Thu, Nov 02, 2017 at 09:51:08AM +0100, Ingo Molnar wrote:
> > 
> > * Ricardo Neri  wrote:
> > 
> > > + /*
> > > +  * -EDOM means that we must ignore the address_offset. In such a case,
> > > +  * in 64-bit mode the effective address relative to the RIP of the
> > > +  * following instruction.
> > > +  */
> > > + if (*regoff == -EDOM) {
> > > + if (user_64bit_mode(regs))
> > > + tmp = (long)regs->ip + insn->length;
> > > + else
> > > + tmp = 0;
> > > + } else if (*regoff < 0) {
> > > + return -EINVAL;
> > > + } else {
> > > + tmp = (long)regs_get_register(regs, *regoff);
> > > + }
> > 
> > > + else
> > > + indx = (long)regs_get_register(regs, indx_offset);
> > 
> > This and subsequent patches include a disgustly insane amount of type casts 
> > - why?
> > 
> > For example here 'tmp' is 'long', while regs_get_register() returns
> > 'unsigned long', but no type cast is necessary for that.
> > 
> > > + ret = get_eff_addr_modrm(insn, regs, _offset,
> > > +  _addr);
> 
> One of the goals of this series is to have the ability to compute 16-bit,
> 32-bit and 64-bit addresses. I put in lots of casts between signed and
> unsigned types, and between 64-bit, 32-bit and 16-bit types. After seeing
> your comment I have gone through the code and removed most of the casts.
> Instead I will use masks. I will also inspect the resulting assembly code
> to make sure the arithmetic is performed in the address size pertinent to
> each case.

Well, casts are probably fine when the goal is to zero out high bits - but the
ones I quoted converted between types of the same width.

For register values it would also probably be cleaner to use the u8, u16, u32
and u64 types instead of char/short/int/long - this goes hand in hand with how
the instructions are documented in the SDMs.

Thanks,

Ingo
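
A minimal sketch of the point above, written as userspace C rather than kernel code: truncating a computed value to a 16-, 32- or 64-bit address size with fixed-width types. The helper name `truncate_to_addr_size` and its `addr_bytes` parameter are assumptions made for illustration; they are not taken from the insn-eval series.

```c
#include <stdint.h>

/*
 * Hypothetical helper: keep only the low bits that are meaningful for
 * the given address size (2, 4 or 8 bytes). Converting to an unsigned
 * fixed-width type discards the high bits, which has the same effect
 * as masking with 0xffff or 0xffffffff.
 */
uint64_t truncate_to_addr_size(uint64_t value, int addr_bytes)
{
	switch (addr_bytes) {
	case 2:
		return (uint16_t)value;
	case 4:
		return (uint32_t)value;
	default:
		return value;
	}
}
```

Here the cast changes the width, which is the case described as fine above; a cast between two types of the same width would not change the result.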

[PATCH 5/9] rcutorture/kvm.sh: Support execution from any directory

2017-11-03 Thread SeongJae Park

'kvm.sh' for rcutorture restricts users to executing it from the top of the
Linux source tree.  It is a subtle restriction, but first-time users could
forget it and be confused.  Moreover, it makes commands a little longer, and
long commands can frustrate people who are debugging.  This commit lets users
call the script from any location.

Signed-off-by: SeongJae Park 
---
 tools/testing/selftests/rcutorture/bin/kvm.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
index 36f7f499698f..cd62933e33d7 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
@@ -1,8 +1,7 @@
 #!/bin/bash
 #
 # Run a series of 14 tests under KVM.  These are not particularly
-# well-selected or well-tuned, but are the current set.  Run from the
-# top level of the source tree.
+# well-selected or well-tuned, but are the current set.
 #
 # Edit the definitions below to set the locations of the various directories,
 # as well as the test duration.
@@ -34,6 +33,8 @@ T=/tmp/kvm.sh.$$
 trap 'rm -rf $T' 0
 mkdir $T
 
+cd `dirname $scriptname`/../../../../../
+
 dur=$((30*60))
 dryrun=""
 KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM
-- 
2.13.0

[PATCH 6/9] rcutorture/kvm-recheck-*: Improve result directory readability check

2017-11-03 Thread SeongJae Park

kvm-recheck-(lock|rcu|rcuperf).sh checks whether the user-specified
result directory exists.  If it does not, the script prints an error message
saying that the specified directory is not a readable directory.  To make
the message accurate, this commit adds a readability check of the
directory.

Fixes: 2193e1604eac ("rcutorture: Abstract kvm-recheck.sh")

Signed-off-by: SeongJae Park 
---
 tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh| 2 +-
 tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh | 2 +-
 tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
index 43f764098e50..2de92f43ee8c 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
@@ -23,7 +23,7 @@
 # Authors: Paul E. McKenney 
 
 i="$1"
-if test -d $i
+if test -d "$i" -a -r "$i"
 then
:
 else
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
index 559e01ac86be..9e34656bf659 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
@@ -23,7 +23,7 @@
 # Authors: Paul E. McKenney 
 
 i="$1"
-if test -d $i
+if test -d "$i" -a -r "$i"
 then
:
 else
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
index 8f3121afc716..6138fd94abfe 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
@@ -23,7 +23,7 @@
 # Authors: Paul E. McKenney 
 
 i="$1"
-if test -d $i
+if test -d "$i" -a -r "$i"
 then
:
 else
-- 
2.13.0

[PATCH 1/9] rcutorture/configinit: Fix build directory error message

2017-11-03 Thread SeongJae Park

'configinit.sh' checks the format of the optional build directory argument
and prints an error message if the format is not valid.  However, the error
message is broken: it reports that the user entered an empty string as the
build directory even when the user entered a wrong but non-empty string.
This commit fixes the message to show what the user actually entered for
the argument.

Fixes: c87b9c601ac8 ("rcutorture: Add KVM-based test framework")

Signed-off-by: SeongJae Park 
---
 tools/testing/selftests/rcutorture/bin/configinit.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/rcutorture/bin/configinit.sh b/tools/testing/selftests/rcutorture/bin/configinit.sh
index 3f81a1095206..50a6371b2b2e 100755
--- a/tools/testing/selftests/rcutorture/bin/configinit.sh
+++ b/tools/testing/selftests/rcutorture/bin/configinit.sh
@@ -51,7 +51,7 @@ then
mkdir $builddir
fi
else
-   echo Bad build directory: \"$builddir\"
+   echo Bad build directory: \"$buildloc\"
exit 2
fi
 fi
-- 
2.13.0

[PATCH 4/9] rcutorture/kvm.sh: Use consistent usage for --qemu-args

2017-11-03 Thread SeongJae Park

The '--qemu-args' option usage was wrongly copied from the '--qemu-cmd'
option, and its argument type description is formatted inconsistently with
the other arguments.  This commit fixes the usage and type messages to be
consistent with the others.

Fixes: e9ce640001c6 ("rcutorture: Add --qemu-args argument to kvm.sh")

Signed-off-by: SeongJae Park 
---
 tools/testing/selftests/rcutorture/bin/kvm.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
index 0acdfa37c8ab..36f7f499698f 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
@@ -70,7 +70,7 @@ usage () {
echo "   --kmake-arg kernel-make-arguments"
echo "   --mac nn:nn:nn:nn:nn:nn"
echo "   --no-initrd"
-   echo "   --qemu-args qemu-system-..."
+   echo "   --qemu-args qemu-arguments"
echo "   --qemu-cmd qemu-system-..."
echo "   --results absolute-pathname"
echo "   --torture rcu"
@@ -150,7 +150,7 @@ do
TORTURE_INITRD=""; export TORTURE_INITRD
;;
--qemu-args|--qemu-arg)
-   checkarg --qemu-args "-qemu args" $# "$2" '^-' '^error'
+   checkarg --qemu-args "(qemu arguments)" $# "$2" '^-' '^error'
TORTURE_QEMU_ARG="$2"
shift
;;
-- 
2.13.0

[PATCH net-next 1/5] net: dsa: lan9303: Correct register names in comments

2017-11-03 Thread Egil Hjelmeland

Two comments refer to registers, but lack the LAN9303_ prefix.
Fix that.

Signed-off-by: Egil Hjelmeland 
---
 include/linux/dsa/lan9303.h | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/dsa/lan9303.h b/include/linux/dsa/lan9303.h
index 05d8d136baab..f48a85c377de 100644
--- a/include/linux/dsa/lan9303.h
+++ b/include/linux/dsa/lan9303.h
@@ -13,8 +13,8 @@ struct lan9303_phy_ops {
 #define LAN9303_NUM_ALR_RECORDS 512
 struct lan9303_alr_cache_entry {
u8  mac_addr[ETH_ALEN];
-   u8  port_map;   /* Bitmap of ports. Zero if unused entry */
-   u8  stp_override;   /* non zero if set ALR_DAT1_AGE_OVERRID */
+   u8  port_map; /* Bitmap of ports. Zero if unused entry */
+   u8  stp_override; /* non zero if set LAN9303_ALR_DAT1_AGE_OVERRID */
 };
 
 struct lan9303 {
@@ -28,7 +28,9 @@ struct lan9303 {
struct mutex indirect_mutex; /* protect indexed register access */
const struct lan9303_phy_ops *ops;
bool is_bridged; /* true if port 1 and 2 are bridged */
-   u32 swe_port_state; /* remember SWE_PORT_STATE while not bridged */
+
+   /* remember LAN9303_SWE_PORT_STATE while not bridged */
+   u32 swe_port_state;
/* LAN9303 do not offer reading specific ALR entry. Cache all
 * static entries in a flat table
 **/
-- 
2.11.0

[PATCH] lock/rwlock fix comment for rwlock

2017-11-03 Thread Cheng Jian

The kernel/locking/spinlock.c file contains the implementations of both
spinlock and rwlock, and the __lock_function inlines are taken from other
include files. The comment omits rwlock_api_smp.h, which provides the
rwlock inlines.

Also fix a small comment in rwlock_api_smp.h.

Signed-off-by: Cheng Jian 
---
 include/linux/rwlock_api_smp.h | 2 +-
 kernel/locking/spinlock.c  | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/rwlock_api_smp.h b/include/linux/rwlock_api_smp.h
index 5b9b84b..86ebb4b 100644
--- a/include/linux/rwlock_api_smp.h
+++ b/include/linux/rwlock_api_smp.h
@@ -211,7 +211,7 @@ static inline void __raw_write_lock(rwlock_t *lock)
LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
 }
 
-#endif /* CONFIG_PREEMPT */
+#endif /* !CONFIG_GENERIC_LOCKBREAK || CONFIG_DEBUG_LOCK_ALLOC */
 
 static inline void __raw_write_unlock(rwlock_t *lock)
 {
diff --git a/kernel/locking/spinlock.c b/kernel/locking/spinlock.c
index 6e40fdf..11fbca7 100644
--- a/kernel/locking/spinlock.c
+++ b/kernel/locking/spinlock.c
@@ -30,7 +30,8 @@
 #if !defined(CONFIG_GENERIC_LOCKBREAK) || defined(CONFIG_DEBUG_LOCK_ALLOC)
 /*
  * The __lock_function inlines are taken from
- * include/linux/spinlock_api_smp.h
+ * spinlock : include/linux/spinlock_api_smp.h
+ * rwlock   : include/linux/rwlock_api_smp.h
  */
 #else
 #define raw_read_can_lock(l)   read_can_lock(l)
-- 
1.8.3.1

[PATCH net-next 5/5] net: dsa: lan9303: Adjust indenting

2017-11-03 Thread Egil Hjelmeland

Remove scripts/checkpatch.pl CHECKs by adjusting indenting.

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303_i2c.c  | 2 +-
 drivers/net/dsa/lan9303_mdio.c | 2 +-
 net/dsa/tag_lan9303.c  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dsa/lan9303_i2c.c b/drivers/net/dsa/lan9303_i2c.c
index 24ec20f7f444..909a7e864246 100644
--- a/drivers/net/dsa/lan9303_i2c.c
+++ b/drivers/net/dsa/lan9303_i2c.c
@@ -50,7 +50,7 @@ static int lan9303_i2c_probe(struct i2c_client *client,
return -ENOMEM;
 
sw_dev->chip.regmap = devm_regmap_init_i2c(client,
-   _i2c_regmap_config);
+  _i2c_regmap_config);
if (IS_ERR(sw_dev->chip.regmap)) {
ret = PTR_ERR(sw_dev->chip.regmap);
dev_err(>dev, "Failed to allocate register map: %d\n",
diff --git a/drivers/net/dsa/lan9303_mdio.c b/drivers/net/dsa/lan9303_mdio.c
index 0bc56b9900f9..cc9c2ea1c4fe 100644
--- a/drivers/net/dsa/lan9303_mdio.c
+++ b/drivers/net/dsa/lan9303_mdio.c
@@ -116,7 +116,7 @@ static int lan9303_mdio_probe(struct mdio_device *mdiodev)
return -ENOMEM;
 
sw_dev->chip.regmap = devm_regmap_init(>dev, NULL, sw_dev,
-   _mdio_regmap_config);
+  _mdio_regmap_config);
if (IS_ERR(sw_dev->chip.regmap)) {
ret = PTR_ERR(sw_dev->chip.regmap);
dev_err(>dev, "regmap init failed: %d\n", ret);
diff --git a/net/dsa/tag_lan9303.c b/net/dsa/tag_lan9303.c
index e526c8967b98..5ba01fc3c6ba 100644
--- a/net/dsa/tag_lan9303.c
+++ b/net/dsa/tag_lan9303.c
@@ -88,7 +88,7 @@ static struct sk_buff *lan9303_xmit(struct sk_buff *skb, struct net_device *dev)
 }
 
 static struct sk_buff *lan9303_rcv(struct sk_buff *skb, struct net_device *dev,
-   struct packet_type *pt)
+  struct packet_type *pt)
 {
u16 *lan9303_tag;
unsigned int source_port;
-- 
2.11.0

[PATCH net-next 3/5] net: dsa: lan9303: Replace msleep(1) with usleep_range()

2017-11-03 Thread Egil Hjelmeland

Remove scripts/checkpatch.pl WARNING by replacing msleep(1) with usleep_range()

Signed-off-by: Egil Hjelmeland 
---
 drivers/net/dsa/lan9303-core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index c4afc8f1a66d..70ecd18a5e7d 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -284,7 +284,7 @@ static int lan9303_indirect_phy_wait_for_completion(struct lan9303 *chip)
}
if (!(reg & LAN9303_PMI_ACCESS_MII_BUSY))
return 0;
-   msleep(1);
+   usleep_range(1000, 2000);
}
 
return -EIO;
@@ -376,7 +376,7 @@ static int lan9303_switch_wait_for_completion(struct lan9303 *chip)
}
if (!(reg & LAN9303_SWITCH_CSR_CMD_BUSY))
return 0;
-   msleep(1);
+   usleep_range(1000, 2000);
}
 
return -EIO;
-- 
2.11.0

Re: [PATCH 3/8] QE: remove PPCisms for QE

2017-11-03 Thread Marc Zyngier

On 01/11/17 01:34, Zhao Qiang wrote:
> QE was supported on PowerPC, and dependent on PPC,
> Now it is supported on other platforms. so remove PPCisms.
> 
> Signed-off-by: Zhao Qiang 
That's the same (wrong) patch. See my reply to 4/4 in your other series.

M.
-- 
Jazz is not dead. It just smells funny...

[PATCH][RFC] usb: hub: Cycle HUB power when initialization fails

2017-11-03 Thread Mike Looijmans

Sometimes the USB device gets confused about the state of the initialization and
the connection fails. In particular, the device thinks that it's already set up
and running while the host thinks the device still needs to be configured. To
work around this issue, power-cycle the hub's output to issue a sort of "reset"
to the device. This makes the device restart its state machine and then the
initialization succeeds.

This fixes problems where the kernel reports a list of errors like this:

usb 1-1.3: device not accepting address 19, error -71

The end result is a non-functioning device. After this patch, the sequence
becomes like this:

usb 1-1.3: new high-speed USB device number 18 using ci_hdrc
usb 1-1.3: device not accepting address 18, error -71
usb 1-1.3: new high-speed USB device number 19 using ci_hdrc
usb 1-1.3: device not accepting address 19, error -71
usb 1-1-port3: attempt power cycle
usb 1-1.3: new high-speed USB device number 21 using ci_hdrc
usb-storage 1-1.3:1.2: USB Mass Storage device detected

Signed-off-by: Mike Looijmans 
---
This is a fix I did for a customer which might be appropriate for upstream. 
What do you think?

 drivers/usb/core/hub.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index e9ce6bb..a30c1e7 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -2611,7 +2611,7 @@ static unsigned hub_is_wusb(struct usb_hub *hub)
 #define PORT_RESET_TRIES   5
 #define SET_ADDRESS_TRIES  2
 #define GET_DESCRIPTOR_TRIES   2
-#define SET_CONFIG_TRIES   (2 * (use_both_schemes + 1))
+#define SET_CONFIG_TRIES   (4 * (use_both_schemes + 1))
 #define USE_NEW_SCHEME(i)  ((i) / 2 == (int)old_scheme_first)
 
 #define HUB_ROOT_RESET_TIME60  /* times are in msec */
@@ -4805,7 +4805,6 @@ static void hub_port_connect(struct usb_hub *hub, int port1, u16 portstatus,
 
status = 0;
for (i = 0; i < SET_CONFIG_TRIES; i++) {
-
/* reallocate for each attempt, since references
 * to the previous one can escape in various ways
 */
@@ -4935,6 +4934,15 @@ static void hub_port_connect(struct usb_hub *hub, int port1, u16 portstatus,
usb_put_dev(udev);
if ((status == -ENOTCONN) || (status == -ENOTSUPP))
break;
+
+   /* When halfway through our retry count, power-cycle the port */
+   if (i == (SET_CONFIG_TRIES / 2) - 1) {
+   dev_info(_dev->dev, "attempt power cycle\n");
+   usb_hub_set_port_power(hdev, hub, port1, false);
+   msleep(800);
+   usb_hub_set_port_power(hdev, hub, port1, true);
+   msleep(hub_power_on_good_delay(hub));
+   }
}
if (hub->hdev->parent ||
!hcd->driver->port_handed_over ||
@@ -5476,7 +5484,6 @@ static int usb_reset_and_verify_device(struct usb_device *udev)
udev->bos = NULL;
 
for (i = 0; i < SET_CONFIG_TRIES; ++i) {
-
/* ep0 maxpacket size may change; let the HCD know about it.
 * Other endpoints will be handled by re-enumeration. */
usb_ep0_reinit(udev);
-- 
1.9.1
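
A minimal sketch of the retry arithmetic in the patch above, as standalone C rather than hub.c code. It only models when the power cycle is triggered, assuming every attempt fails with an error that does not break out of the loop. The function names are hypothetical; `use_both_schemes` is passed as a plain parameter here.

```c
#include <stdbool.h>

/* Retry budget from the patch: 4 tries per enumeration scheme in use. */
int set_config_tries(int use_both_schemes)
{
	return 4 * (use_both_schemes + 1);
}

/* True for the zero-based attempt after which the port is power-cycled. */
bool power_cycle_after_attempt(int attempt, int tries)
{
	return attempt == (tries / 2) - 1;
}

/* Count how many power cycles a fully failing retry loop would issue. */
int count_power_cycles(int tries)
{
	int cycles = 0;

	for (int i = 0; i < tries; i++)
		if (power_cycle_after_attempt(i, tries))
			cycles++;

	return cycles;
}
```

With both schemes in use the budget is 8 attempts and the single power cycle follows the fourth failure, so the second half of the attempts runs against a freshly powered port.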

[PATCH] firewire-ohci: work around oversized DMA reads on JMicron controllers

2017-11-03 Thread Hector Martin

At least some JMicron controllers issue buggy oversized DMA reads when
fetching context descriptors, always fetching 0x20 bytes at once for
descriptors which are only 0x10 bytes long. This is often harmless, but
can cause page faults on modern systems with IOMMUs:

DMAR: [DMA Read] Request device [05:00.0] fault addr fff56000 [fault reason 06] PTE Read access is not set
firewire_ohci :05:00.0: DMA context IT0 has stopped, error code: evt_descriptor_read

This works around the problem by always leaving 0x10 padding bytes at
the end of descriptor buffer pages, which should be harmless to do
unconditionally for all controllers in case others have the same behavior.

Signed-off-by: Hector Martin 
---
 drivers/firewire/ohci.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 8bf89267dc25..d731b413cb2c 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -1130,7 +1130,13 @@ static int context_add_buffer(struct context *ctx)
return -ENOMEM;
 
offset = (void *)>buffer - (void *)desc;
-   desc->buffer_size = PAGE_SIZE - offset;
+   /*
+* Some controllers, like JMicron ones, always issue 0x20-byte DMA reads
+* for descriptors, even 0x10-byte ones. This can cause page faults when
+* an IOMMU is in use and the oversized read crosses a page boundary.
+* Work around this by always leaving at least 0x10 bytes of padding.
+*/
+   desc->buffer_size = PAGE_SIZE - offset - 0x10;
desc->buffer_bus = bus_addr + offset;
desc->used = 0;
 
-- 
2.14.3
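
A minimal sketch of why 0x10 bytes of padding is enough, as standalone C rather than driver code. It assumes a 4096-byte page and a buffer offset that is a multiple of the descriptor size; `SKETCH_PAGE_SIZE` and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define DESC_SIZE        0x10u	/* size of one descriptor */
#define DMA_READ_SIZE    0x20u	/* what the buggy controller fetches */

/* Bytes of the page usable for descriptors, as computed in the patch. */
size_t usable_buffer_size(size_t offset, size_t padding)
{
	return SKETCH_PAGE_SIZE - offset - padding;
}

/*
 * Does an oversized fetch of the last descriptor in the buffer run
 * past the end of the page?
 */
bool last_fetch_crosses_page(size_t offset, size_t padding)
{
	size_t end_of_buffer = offset + usable_buffer_size(offset, padding);
	size_t last_desc_start = end_of_buffer - DESC_SIZE;

	return last_desc_start + DMA_READ_SIZE > SKETCH_PAGE_SIZE;
}
```

Without padding the last descriptor starts 0x10 bytes before the page end, so a 0x20-byte fetch overruns by 0x10 bytes; with the padding the fetch ends exactly at the page boundary.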


2017-11-03 Thread John Garry


On 03/11/2017 10:19, Arnd Bergmann wrote:

On Fri, Nov 3, 2017 at 11:14 AM, John Garry  wrote:

+ Shiju, who authored the original patch




index d42f29a5eb65..6ad8a6251d21 100644
--- a/drivers/scsi/hisi_sas/Kconfig
+++ b/drivers/scsi/hisi_sas/Kconfig
@@ -4,6 +4,7 @@ config SCSI_HISI_SAS
depends on ARM64 || COMPILE_TEST
select SCSI_SAS_LIBSAS
select BLK_DEV_INTEGRITY
+   select RAS



My impression is that we don't want this. Correction: shouldn't want this.

Do you have the .config for the broken build? I couldn't recreate this by
turning off CONFIG_RAS.


Uploaded to https://pastebin.com/A1rJYhDr
Maybe you were missing support for tracing in your configuration?



Thanks Arnd, I can see it now.

However, as you said, it looks like this interface is not being used 
correctly, and I tend to agree.


Actually this interface is intended for logging UEFI CPER non-standard 
records from kernel RAS/APEI framework to userspace. It is not intended 
for kernel peripheral driver logging (like this case). Until we get a 
community consensus on whether this is acceptable, I would rather revert 
the original patch.


Thanks,
John


  Arnd

.

Re: [PATCH v2] x86/MCE/AMD: Always give PANIC severity for UC errors IN_KERNEL context

2017-11-03 Thread Borislav Petkov

On Wed, Nov 01, 2017 at 01:59:06PM -0500, Yazen Ghannam wrote:
> From: Yazen Ghannam 
> 
> The AMD severity grading function was introduced in v4.1 and has remained
> logically unchanged with the exception of a separate SMCA severity grading
> function for SMCA systems. The current logic can possibly give
> MCE_AR_SEVERITY for uncorrectable errors in kernel context. The system may
> then get stuck in a loop as memory_failure() will try to handle the bad
> kernel memory and find it busy.
> 
> Return MCE_PANIC_SEVERITY for all UC errors IN_KERNEL context on AMD
> systems.
> 
> After:
> 
>   b2f9d678e28c ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT 
> exception table entries")
> 
> was accepted in v4.6, this issue was masked because of the tail-end attempt
> at kernel mode recovery in the #MC handler.
> 
> However, uncorrectable errors IN_KERNEL context should always be considered
> unrecoverable and cause a panic.
> 
> Fixes: bf80bbd7dcf5 (x86/mce: Add an AMD severities-grading function)
> 
> Signed-off-by: Yazen Ghannam 
> [ This needs to be reworked to apply to v4.1 and v4.4 stable branches.]
> Cc:  # 4.9.x
> ---
> Link:
> https://lkml.kernel.org/r/1505830031-9630-1-git-send-email-yazen.ghan...@amd.com
> 
> v1->v2:
> * Update commit message.
> 
>  arch/x86/kernel/cpu/mcheck/mce-severity.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)

Applied, thanks.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

Re: [PATCH v2 0/3] Sony-laptop: Adjustments for sony_nc_setup_rfkill()

2017-11-03 Thread Andy Shevchenko

On Wed, Nov 1, 2017 at 8:45 PM, SF Markus Elfring
 wrote:
> From: Markus Elfring 
> Date: Wed, 1 Nov 2017 19:34:56 +0100
>
> Three update suggestions were taken into account
> from static source code analysis.

I have applied the first two; the last one is subject to discussion of
whether it is necessary.

So, if Darren is on my side then we won't apply it; if he opposes, I
would like to hear an argument why we might need it.

>
> Markus Elfring (3):
>   Fix exception handling
>   Delete an unnecessary variable initialisation
>   Use common error handling code
> ---
>
> v2:
> Two additional suggestions were taken into account from a corresponding
> source code review.
>
>  drivers/platform/x86/sony-laptop.c | 33 ++---
>  1 file changed, 18 insertions(+), 15 deletions(-)
>
> --
> 2.14.3
>



-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH V3 05/11] clk: sprd: add mux clock support

2017-11-03 Thread Chunyan Zhang

Hi Julien,

On 3 November 2017 at 02:11, Julien Thierry  wrote:
> Hi,
>
>
> On 02/11/17 06:56, Chunyan Zhang wrote:
>>
>> This patch adds clock multiplexor support for Spreadtrum platforms,
>> the mux clocks also can be found in sprd composite clocks, so
>> provides two helpers that can be reused later on.
>>
>> Signed-off-by: Chunyan Zhang 
>> ---
>>   drivers/clk/sprd/Makefile |  1 +
>>   drivers/clk/sprd/mux.c| 89
>> +++
>>   drivers/clk/sprd/mux.h| 65 ++
>>   3 files changed, 155 insertions(+)
>>   create mode 100644 drivers/clk/sprd/mux.c
>>   create mode 100644 drivers/clk/sprd/mux.h
>>
>> diff --git a/drivers/clk/sprd/Makefile b/drivers/clk/sprd/Makefile
>> index 8cd5592..cee36b5 100644
>> --- a/drivers/clk/sprd/Makefile
>> +++ b/drivers/clk/sprd/Makefile
>> @@ -2,3 +2,4 @@ obj-$(CONFIG_SPRD_COMMON_CLK)   += clk-sprd.o
>> clk-sprd-y  += common.o
>>   clk-sprd-y+= gate.o
>> +clk-sprd-y += mux.o
>> diff --git a/drivers/clk/sprd/mux.c b/drivers/clk/sprd/mux.c
>> new file mode 100644
>> index 000..5a344e0
>> --- /dev/null
>> +++ b/drivers/clk/sprd/mux.c
>> @@ -0,0 +1,89 @@
>> +/*
>> + * Spreadtrum multiplexer clock driver
>> + *
>> + * Copyright (C) 2017 Spreadtrum, Inc.
>> + * Author: Chunyan Zhang 
>> + *
>> + * SPDX-License-Identifier: GPL-2.0
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "mux.h"
>> +
>> +DEFINE_SPINLOCK(sprd_mux_lock);
>> +EXPORT_SYMBOL_GPL(sprd_mux_lock);
>> +
>> +u8 sprd_mux_helper_get_parent(const struct sprd_clk_common *common,
>> + const struct sprd_mux_internal *mux)
>> +{
>> +   unsigned int reg;
>> +   u8 parent;
>> +   int num_parents;
>> +   int i;
>> +
>> +   sprd_regmap_read(common->regmap, common->reg, );
>> +   parent = reg >> mux->shift;
>> +   parent &= (1 << mux->width) - 1;
>> +
>> +   if (mux->table) {
>> +   num_parents = clk_hw_get_num_parents(>hw);
>> +
>> +   for (i = 0; i < num_parents; i++)
>> +   if (parent == mux->table[i] ||
>> +   (i < (num_parents - 1) && parent >
>> mux->table[i] &&
>> +parent < mux->table[i + 1]))
>> +   return i;
>> +   if (i == num_parents)
>> +   return i - 1;
>
>
> The if branch is not necessary since you only get there when the loop has
> finished, so the condition is always true. And the loop can be simplified
> to:
>
> for (i = 0; i < num_parents - 1; i++)
> if (parent >= mux->table[i] &&  parent < mux->table[i + 1])
> return i;
>
> return num_parents;
>
>
>
>> +   }
>> +
>> +   return parent;
>> +}
>> +EXPORT_SYMBOL_GPL(sprd_mux_helper_get_parent);
>> +
>> +static u8 sprd_mux_get_parent(struct clk_hw *hw)
>> +{
>> +   struct sprd_mux *cm = hw_to_sprd_mux(hw);
>> +
>> +   return sprd_mux_helper_get_parent(>common, >mux);
>> +}
>> +
>> +int sprd_mux_helper_set_parent(const struct sprd_clk_common *common,
>> +  const struct sprd_mux_internal *mux,
>> +  u8 index)
>> +{
>> +   unsigned long flags = 0;
>> +   unsigned int reg;
>> +
>> +   if (mux->table)
>> +   index = mux->table[index];
>> +
>> +   spin_lock_irqsave(common->lock, flags);
>> +
>> +   sprd_regmap_read(common->regmap, common->reg, );
>> +   reg &= ~GENMASK(mux->width + mux->shift - 1, mux->shift);
>> +   sprd_regmap_write(common->regmap, common->reg,
>> + reg | (index << mux->shift));
>> +
>> +   spin_unlock_irqrestore(common->lock, flags);
>> +
>> +   return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(sprd_mux_helper_set_parent);
>> +
>> +static int sprd_mux_set_parent(struct clk_hw *hw, u8 index)
>> +{
>> +   struct sprd_mux *cm = hw_to_sprd_mux(hw);
>> +
>> +   return sprd_mux_helper_set_parent(>common, >mux, index);
>> +}
>> +
>> +const struct clk_ops sprd_mux_ops = {
>> +   .get_parent = sprd_mux_get_parent,
>> +   .set_parent = sprd_mux_set_parent,
>> +   .determine_rate = __clk_mux_determine_rate,
>> +};
>> +EXPORT_SYMBOL_GPL(sprd_mux_ops);
>
>
> Same as with the other patch, I'd recommend have one set of ops for direct
> mux and another one for a mux using a table to map the parents. Keeping
> functions for both modes separate.
>

I would prefer to keep them together, since separating them would add
many lines of code to maintain :)
It would also double the number of exported functions, not only for the
mux driver but also for the composite driver.
And I think a name like "sprd_mux_helper_set_parent_table" is a little long.

Thanks,
Chunyan
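
A minimal sketch of the two pieces under discussion, as standalone C rather than clk driver code: the register field access and the table lookup with the simplified loop. All names are hypothetical. The sketch assumes the table is sorted in ascending order and `width` is below 32. It keeps the patch's original fallback of returning the last index, and unlike the patch it masks the new value before writing it.

```c
#include <stdint.h>

/* Extract a 'width'-bit field that starts at bit 'shift' of reg. */
uint32_t mux_field_get(uint32_t reg, unsigned int shift, unsigned int width)
{
	return (reg >> shift) & ((1u << width) - 1);
}

/* Return reg with that field replaced by val; other bits are preserved. */
uint32_t mux_field_set(uint32_t reg, unsigned int shift, unsigned int width,
		       uint32_t val)
{
	uint32_t mask = ((1u << width) - 1) << shift;

	return (reg & ~mask) | ((val << shift) & mask);
}

/*
 * Map a raw field value to a parent index: entry i covers values from
 * table[i] up to, but not including, table[i + 1]. Anything else falls
 * back to the last parent.
 */
int mux_table_index(const uint8_t *table, int num_parents, uint8_t val)
{
	for (int i = 0; i < num_parents - 1; i++)
		if (val >= table[i] && val < table[i + 1])
			return i;

	return num_parents - 1;
}
```

For a table of {0, 2, 5}, raw values 0 and 1 select parent 0, values 2 to 4 select parent 1, and everything else selects parent 2.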

> Cheers,
>
>
>> diff --git a/drivers/clk/sprd/mux.h b/drivers/clk/sprd/mux.h
>> new file mode 100644
>> index 000..148ca8c
>> ---

Re: [RFC v8 1/7] platform/x86: intel_punit_ipc: Fix resource ioremap warning

2017-11-03 Thread Andy Shevchenko

On Sun, Oct 29, 2017 at 11:49 AM,
 wrote:
> From: Kuppuswamy Sathyanarayanan 
>
> For PUNIT device, ISPDRIVER_IPC and GTDDRIVER_IPC resources are not
> mandatory. So when PMC IPC driver creates a PUNIT device, if these
> resources are not available then it creates dummy resource entries for
> these missing resources. But during PUNIT device probe, doing ioremap on
> these dummy resources generates following warning messages.
>
> intel_punit_ipc: can't request region for resource [mem 0x]
> intel_punit_ipc: can't request region for resource [mem 0x]
> intel_punit_ipc: can't request region for resource [mem 0x]
> intel_punit_ipc: can't request region for resource [mem 0x]
>
> This patch fixes this issue by adding extra check for resource size
> before performing ioremap operation.

I think I already mentioned that this one has been pushed to my review and
testing queue, thanks!

>
> Signed-off-by: Kuppuswamy Sathyanarayanan 
> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/platform/x86/intel_punit_ipc.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> Changes since v7:
>  * None
>
> Changes since v6:
>  * None
>
> Changes since v5:
>  * None
>
> Changes since v4:
>  * None
>
> diff --git a/drivers/platform/x86/intel_punit_ipc.c 
> b/drivers/platform/x86/intel_punit_ipc.c
> index a47a41f..b5b8901 100644
> --- a/drivers/platform/x86/intel_punit_ipc.c
> +++ b/drivers/platform/x86/intel_punit_ipc.c
> @@ -252,28 +252,28 @@ static int intel_punit_get_bars(struct platform_device 
> *pdev)
>  * - GTDRIVER_IPC BASE_IFACE
>  */
> res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
> -   if (res) {
> +   if (res && resource_size(res) > 1) {
> addr = devm_ioremap_resource(>dev, res);
> if (!IS_ERR(addr))
> punit_ipcdev->base[ISPDRIVER_IPC][BASE_DATA] = addr;
> }
>
> res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
> -   if (res) {
> +   if (res && resource_size(res) > 1) {
> addr = devm_ioremap_resource(>dev, res);
> if (!IS_ERR(addr))
> punit_ipcdev->base[ISPDRIVER_IPC][BASE_IFACE] = addr;
> }
>
> res = platform_get_resource(pdev, IORESOURCE_MEM, 4);
> -   if (res) {
> +   if (res && resource_size(res) > 1) {
> addr = devm_ioremap_resource(>dev, res);
> if (!IS_ERR(addr))
> punit_ipcdev->base[GTDRIVER_IPC][BASE_DATA] = addr;
> }
>
> res = platform_get_resource(pdev, IORESOURCE_MEM, 5);
> -   if (res) {
> +   if (res && resource_size(res) > 1) {
> addr = devm_ioremap_resource(>dev, res);
> if (!IS_ERR(addr))
> punit_ipcdev->base[GTDRIVER_IPC][BASE_IFACE] = addr;
> --
> 2.7.4
>



-- 
With Best Regards,
Andy Shevchenko
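
A minimal sketch of why the quoted check compares against 1 rather than 0, as standalone C. The kernel's resource_size() computes end - start + 1; the sketch assumes the dummy entries are zero-filled, which the commit message does not state explicitly. `struct sketch_resource` and both function names are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the start/end fields of struct resource. */
struct sketch_resource {
	uint64_t start;
	uint64_t end;
};

/* Same arithmetic as the kernel helper: both bounds are inclusive. */
uint64_t sketch_resource_size(const struct sketch_resource *res)
{
	return res->end - res->start + 1;
}

/* Mirrors the condition added by the patch. */
bool worth_mapping(const struct sketch_resource *res)
{
	return res != NULL && sketch_resource_size(res) > 1;
}
```

Under that assumption a dummy entry has start == end == 0 and therefore a size of 1, so the check skips it while still accepting any real multi-byte region.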

Re: [PATCH] platform: x86: dell-smo8800: remove redundant assignments to byte_data

2017-11-03 Thread Andy Shevchenko

On Tue, Oct 31, 2017 at 8:41 PM, Pali Rohár  wrote:
> On Tuesday 31 October 2017 20:08:45 Andy Shevchenko wrote:
>> On Tue, Oct 31, 2017 at 4:13 PM, Pali Rohár  wrote:
>> > On Tuesday 31 October 2017 16:07:25 Andy Shevchenko wrote:
>> >> On Tue, Oct 31, 2017 at 3:55 PM, Pali Rohár  wrote:
>> >> > On Tuesday 31 October 2017 15:47:58 Andy Shevchenko wrote:
>> >> >> On Tue, Oct 31, 2017 at 1:03 PM, Colin King  
>> >> >> wrote:
>>
>> >> OK, though it doesn't clarify the intention of the byte_data
>> >> (useless?) assignments.
>> >
>> > Probably similar code pattern exists in that lis3lv* driver...
>>
>> So, it seems to me OK to apply the patch. No objections?
>
> No objections, you can add my:
>
> Acked-by: Pali Rohár 

Applied to my review and testing queue, thanks!

>
> --
> Pali Rohár
> pali.ro...@gmail.com



-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH net-next 0/6] net: hns3: support set_link_ksettings and for nway_reset ethtool command

2017-11-03 Thread David Miller

From: Lipeng 
Date: Fri, 3 Nov 2017 12:18:24 +0800

> This patch-set adds support for set_link_ksettings && for nway_resets
> ethtool command and fixes some related ethtool bugs.
> 1, patch[4/6] adds support for ethtool_ops.set_link_ksettings.
> 2, patch[5/6] adds support ethtool_ops.for nway_reset.
> 3, patch[1/6,2/6,3/6,6/6] fix some bugs for getting port information by
>ethtool command(ethtool ethx).

Series applied, thank you.

[PATCH] tools/hv: add install target to Makefile

2017-11-03 Thread Vitaly Kuznetsov

Makefiles usually come with 'install' target included so each distro
doesn't need to implement the procedure from scratch. Add it to tools/hv.

Signed-off-by: Vitaly Kuznetsov 
---
 tools/hv/Makefile | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index 0d1e61b81844..e8f0a0a691d3 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -6,9 +6,30 @@ CFLAGS = $(WARNINGS) -g $(shell getconf LFS_CFLAGS)
 
 CFLAGS += -D__EXPORTED_HEADERS__ -I../../include/uapi -I../../include
 
-all: hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
+sbindir ?= /usr/sbin
+libexecdir ?= /usr/libexec
+sharedstatedir ?= /var/lib
+
+ALL_PROGRAMS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
+
+ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh
+
+all: $(ALL_PROGRAMS)
+
 %: %.c
$(CC) $(CFLAGS) -o $@ $^
 
 clean:
$(RM) hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
+
+install: all
+   install -d -m 755 $(DESTDIR)$(sbindir); \
+   install -d -m 755 $(DESTDIR)$(libexecdir)/hypervkvpd; \
+   install -d -m 755 $(DESTDIR)$(sharedstatedir); \
+   for program in $(ALL_PROGRAMS); do \
+   install $$program -m 755 $(DESTDIR)$(sbindir);  \
+   done; \
+   install -m 755 lsvmbus $(DESTDIR)$(sbindir); \
+   for script in $(ALL_SCRIPTS); do \
+   install $$script -m 755 $(DESTDIR)$(libexecdir)/hypervkvpd/$${script%.sh}; \
+   done
-- 
2.13.6

Re: [PATCH v4 3/3] KVM: MMU: consider host cache mode in MMIO page check

2017-11-03 Thread Haozhong Zhang

On 11/03/17 16:51 +0800, Haozhong Zhang wrote:
> On 11/03/17 14:54 +0800, Xiao Guangrong wrote:
> > 
> > 
> > On 11/03/2017 01:53 PM, Haozhong Zhang wrote:
> > > Some reserved pages, such as those from NVDIMM DAX devices, are
> > > not for MMIO, and can be mapped with cached memory type for better
> > > performance. However, the above check misconceives those pages as
> > > MMIO.  Because KVM maps MMIO pages with UC memory type, the
> > > performance of guest accesses to those pages would be harmed.
> > > Therefore, we check the host memory type by lookup_memtype() in
> > > addition and only treat UC/UC- pages as MMIO.
> > > 
> > > Signed-off-by: Haozhong Zhang 
> > > Reported-by: Cuevas Escareno, Ivan D 
> > > Reported-by: Kumar, Karthik 
> > > ---
> > >   arch/x86/kvm/mmu.c | 19 ++-
> > >   1 file changed, 18 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > index 0b481cc9c725..e9ed0e666a83 100644
> > > --- a/arch/x86/kvm/mmu.c
> > > +++ b/arch/x86/kvm/mmu.c
> > > @@ -2708,7 +2708,24 @@ static bool mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
> > >   static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
> > >   {
> > >   if (pfn_valid(pfn))
> > > - return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
> > > + return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn)) &&
> > > + /*
> > > +  * Some reserved pages, such as those from
> > > +  * NVDIMM DAX devices, are not for MMIO, and
> > > +  * can be mapped with cached memory type for
> > > +  * better performance. However, the above
> > > +  * check misconceives those pages as MMIO.
> > > +  * Because KVM maps MMIO pages with UC memory
> > > +  * type, the performance of guest accesses to
> > > +  * those pages would be harmed. Therefore, we
> > > +  * check the host memory type in addition and
> > > +  * only treat UC/UC- pages as MMIO.
> > > +  *
> > > +  * pat_pfn_is_uc() works only when PAT is enabled,
> > > +  * so check pat_enabled() as well.
> > > +  */
> > > + (!pat_enabled() ||
> > > +  pat_pfn_is_uc(kvm_pfn_t_to_pfn_t(pfn)));
> > 
> > Can it be compiled if !CONFIG_PAT?
> 
> Yes.
> 
> What I check via pat_enabled() is not only whether PAT support is
> compiled, but also whether PAT is enabled at runtime.
> 
> > 
> > It would be better if we move pat_enabled out of kvm as well,
> 
> Surely I can combine them in one function like
> 
> bool pat_pfn_is_uc(pfn_t pfn)
> {
>   enum page_cache_mode cm;
> 
>   if (!pat_enabled())
>   return false;
> 
>   cm = lookup_memtype(pfn_t_to_phys(pfn));
> 
>   return cm == _PAGE_CACHE_MODE_UC || cm == _PAGE_CACHE_MODE_UC_MINUS;
> }

In addition, I think it's better to split this function into
pat_pfn_is_uc() and pat_pfn_is_uc_minus() to avoid additional
confusion.
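
To make the semantics under discussion concrete, here is a small userspace
model (a sketch only, not kernel code). The enum and every function name in
it are invented for illustration. It compares the condition in the posted
patch with a caller that simply uses a helper returning false when PAT is
off, which is an assumption about how the folded version would be called.

```c
#include <stdbool.h>

/* Simplified stand-in for the host cache modes; not the kernel's enum. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC };

/* Condition as written in the posted patch: the caller checks PAT itself. */
static bool is_mmio_patch_form(bool reserved, bool pat_on, enum cache_mode cm)
{
	bool uc = (cm == CM_UC || cm == CM_UC_MINUS);

	return reserved && (!pat_on || uc);
}

/* Model of the combined helper: false whenever PAT is disabled. */
static bool pat_pfn_is_uc_model(bool pat_on, enum cache_mode cm)
{
	if (!pat_on)
		return false;
	return cm == CM_UC || cm == CM_UC_MINUS;
}

/* Hypothetical caller that relies only on the combined helper. */
static bool is_mmio_folded_form(bool reserved, bool pat_on, enum cache_mode cm)
{
	return reserved && pat_pfn_is_uc_model(pat_on, cm);
}
```

In this model the two forms agree whenever PAT is enabled. With PAT
disabled, the first form still reports a reserved page as MMIO while the
second does not, so the naming and the comment would need to make that
case explicit.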

Haozhong

> 
> but I need a good name to make its semantics clear, or is it enough to
> just leave a comment like this?
> 
> /*
>  * Check via PAT whether the cache mode of a page is UC or UC-.
>  *
>  * Returns true, if PAT is enabled and the cache mode is UC or UC-.
>  * Returns false otherwise.
>  */
> 
> 
> > please refer
> > to pgprot_writecombine() which is implemented in pat.c and in
> > include/asm-generic/pgtable.h:
> > 
> > #ifndef pgprot_writecombine
> > #define pgprot_writecombine pgprot_noncached
> > #endif
> >
> 
> 
>

[PATCH] mm/hugetlb: Implement ASLR and topdown for hugetlb mappings

2017-11-03 Thread Shile Zhang

merge from arch/x86

Signed-off-by: Shile Zhang 
---
 arch/arm/include/asm/page.h |  1 +
 arch/arm/mm/hugetlbpage.c   | 85 +
 2 files changed, 86 insertions(+)

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 4355f0e..994630f 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -144,6 +144,7 @@ extern void copy_page(void *to, const void *from);
 
 #ifdef CONFIG_KUSER_HELPERS
 #define __HAVE_ARCH_GATE_AREA 1
+#define HAVE_ARCH_HUGETLB_UNMAPPED_AREA
 #endif
 
 #ifdef CONFIG_ARM_LPAE
diff --git a/arch/arm/mm/hugetlbpage.c b/arch/arm/mm/hugetlbpage.c
index fcafb52..46ed0c8 100644
--- a/arch/arm/mm/hugetlbpage.c
+++ b/arch/arm/mm/hugetlbpage.c
@@ -45,3 +45,88 @@ int pmd_huge(pmd_t pmd)
 {
return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
 }
+
+#ifdef CONFIG_HUGETLB_PAGE
+static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
+   unsigned long addr, unsigned long len,
+   unsigned long pgoff, unsigned long flags)
+{
+   struct hstate *h = hstate_file(file);
+   struct vm_unmapped_area_info info;
+
+   info.flags = 0;
+   info.length = len;
+   info.low_limit = current->mm->mmap_legacy_base;
+   info.high_limit = TASK_SIZE;
+   info.align_mask = PAGE_MASK & ~huge_page_mask(h);
+   info.align_offset = 0;
+   return vm_unmapped_area(&info);
+}
+
+static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
+   unsigned long addr0, unsigned long len,
+   unsigned long pgoff, unsigned long flags)
+{
+   struct hstate *h = hstate_file(file);
+   struct vm_unmapped_area_info info;
+   unsigned long addr;
+
+   info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+   info.length = len;
+   info.low_limit = PAGE_SIZE;
+   info.high_limit = current->mm->mmap_base;
+   info.align_mask = PAGE_MASK & ~huge_page_mask(h);
+   info.align_offset = 0;
+   addr = vm_unmapped_area(&info);
+
+   /*
+* A failed mmap() very likely causes application failure,
+* so fall back to the bottom-up function here. This scenario
+* can happen with large stack limits and large mmap()
+* allocations.
+*/
+   if (addr & ~PAGE_MASK) {
+   VM_BUG_ON(addr != -ENOMEM);
+   info.flags = 0;
+   info.low_limit = TASK_UNMAPPED_BASE;
+   info.high_limit = TASK_SIZE;
+   addr = vm_unmapped_area(&info);
+   }
+
+   return addr;
+}
+
+unsigned long
+hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
+   unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+   struct hstate *h = hstate_file(file);
+   struct mm_struct *mm = current->mm;
+   struct vm_area_struct *vma;
+
+   if (len & ~huge_page_mask(h))
+   return -EINVAL;
+   if (len > TASK_SIZE)
+   return -ENOMEM;
+
+   if (flags & MAP_FIXED) {
+   if (prepare_hugepage_range(file, addr, len))
+   return -EINVAL;
+   return addr;
+   }
+
+   if (addr) {
+   addr = ALIGN(addr, huge_page_size(h));
+   vma = find_vma(mm, addr);
+   if (TASK_SIZE - len >= addr &&
+   (!vma || addr + len <= vma->vm_start))
+   return addr;
+   }
+   if (mm->get_unmapped_area == arch_get_unmapped_area)
+   return hugetlb_get_unmapped_area_bottomup(file, addr, len,
+   pgoff, flags);
+   else
+   return hugetlb_get_unmapped_area_topdown(file, addr, len,
+   pgoff, flags);
+}
+#endif /* CONFIG_HUGETLB_PAGE */
-- 
2.6.2

linux-next: Tree for Nov 3

2017-11-03 Thread Stephen Rothwell

Hi all,

Changes since 20171102:

New tree: opp

The SPDX license identifier addition commits in Linus' tree caused 81
new conflicts in linux-next today, but I did not report them as the
resolutions are trivial and most of them are files that are removed
in linux-next.

The sunxi tree lost its build failure.

The powerpc tree still had its build failure for which I applied a patch.

The regmap tree still had its build failure so I used the version from
next-20171018.

The nvdimm tree gained conflicts against the ext4 and gfs2 trees and a
build failure so I used the version from next-20171102.

The akpm-current tree gained a conflict against the net-next tree.

Non-merge commits (relative to Linus' tree): 10201
 9162 files changed, 500400 insertions(+), 233533 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And
finally, a simple boot test of the powerpc pseries_le_defconfig kernel
in qemu.

Below is a summary of the state of the merge.

I am currently merging 271 trees (counting Linus' and 42 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (ead751507de8 Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core)
Merging fixes/master (820bf5c419e4 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi)
Merging kbuild-current/fixes (bb3f38c3c5b7 kbuild: clang: fix build failures with sparse check)
Merging arc-current/for-curr (fdbed19697e1 ARC: unbork module link errors with !CONFIG_ARC_HAS_LLSC)
Merging arm-current/fixes (dad4675388fc ARM: add debug ".edata_real" symbol)
Merging m68k-current/for-linus (558d5ad276c9 m68k/mac: Avoid soft-lockup warning after mach_power_off)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (e6c4dcb30816 powerpc/kprobes: Dereference function pointers only if the address does not belong to kernel text)
Merging sparc/master (23198ddffb6c sparc32: Add cmpxchg64().)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files)
Merging net/master (74784da82ff7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf)
Merging ipsec/master (74784da82ff7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf)
Merging netfilter/master (7400bb4b5800 netfilter: nf_reject_ipv4: Fix use-after-free in send_reset)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set)
Merging wireless-drivers/master (a6127b4440d1 Merge tag 'iwlwifi-for-kalle-2017-10-06' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes)
Merging mac80211/master (9618aec3349b Merge tag 'mac80211-for-davem-2017-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211)
Merging sound-current/for-linus (f5ce817951f3 ALSA: usb-audio: support new Amanero Combo384 firmware version)
Merging pci-current/for-linus (814eae5982cc alpha/PCI: Move pci_map_irq()/pci_swizzle() out of initdata)
Merging driver-core.current/driver-core-linus (33d930e59a98 Linux 4.14-rc5)
Merging tty.current/tty-linus (8a5776a5f498 Linux 4.14-rc4)
Merging usb.current/usb-linus (bb176f67090c Linux 4.14-rc6)
Merging usb-gadget-fixes/fixes (7c80f9e4a588 usb: usbtest: fix NULL pointer dereference)
Merging usb-serial-fixes/usb-linus (0b07194bb55e Linux 4.14-rc7)
Merging usb-chipidea-fixes/ci-for-usb-stable (cbb22ebcfb99 usb: chipidea: core: check before

Re: [PATCH v2 1/1] mm: buddy page accessed before initialized

2017-11-03 Thread Michal Hocko

On Thu 02-11-17 13:02:21, Pavel Tatashin wrote:
> This problem is seen when machine is rebooted after kexec:
> A message like this is printed:
> ==
> WARNING: CPU: 21 PID: 249 at linux/lib/list_debug.c:53__listd+0x83/0xa0
> Modules linked in:
> CPU: 21 PID: 249 Comm: pgdatinit0 Not tainted 4.14.0-rc6_pt_deferred #90
> Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U,
> BIOS 3016
> node 1 initialised, 32444607 pages in 1679ms
> task: 880180e75a00 task.stack: c9000cdb
> RIP: 0010:__list_del_entry_valid+0x83/0xa0
> RSP: :c9000cdb3d18 EFLAGS: 00010046
> RAX: 0054 RBX: 0009 RCX: 81c5f3e8
> RDX:  RSI: 0086 RDI: 0046
> RBP: c9000cdb3d18 R08: fffe R09: 0154
> R10: 0005 R11: 0153 R12: 01fcdc00
> R13: 01fcde00 R14: 88207ffded00 R15: ea007f37
> FS:  () GS:881fffac() knlGS:0
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 00407ec09001 CR4: 003606e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  free_one_page+0x103/0x390
>  __free_pages_ok+0x1cf/0x2d0
>  __free_pages+0x19/0x30
>  __free_pages_boot_core+0xae/0xba
>  deferred_free_range+0x60/0x94
>  deferred_init_memmap+0x324/0x372
>  kthread+0x109/0x140
>  ? __free_pages_bootmem+0x2e/0x2e
>  ? kthread_park+0x60/0x60
>  ret_from_fork+0x25/0x30
> 
> list_del corruption. next->prev should be ea007f428020, but was
> ea007f1d8020
> ==
> 
> The problem happens in this path:
> 
> page_alloc_init_late
>   deferred_init_memmap
> deferred_init_range
>   __def_free
> deferred_free_range
>   __free_pages_boot_core(page, order)
> __free_pages()
>   __free_pages_ok()
> free_one_page()
>   __free_one_page(page, pfn, zone, order, migratetype);
> 
> deferred_init_range() initializes one page at a time by calling
> __init_single_page(), once it initializes pageblock_nr_pages pages, it
> calls deferred_free_range() to free the initialized pages to the buddy
> allocator. Eventually, we reach __free_one_page(), where we compute buddy
> page:
>   buddy_pfn = __find_buddy_pfn(pfn, order);
>   buddy = page + (buddy_pfn - pfn);
> 
> buddy_pfn is computed as pfn ^ (1 << order), or pfn + pageblock_nr_pages.
> Therefore, the buddy page is the page just past the range that was just
> initialized, and we access this page in this function. Also, later when we
> return to deferred_init_range(), the buddy page is initialized again.
> 
> So, in order to avoid this issue, we must initialize the buddy page prior
> to calling deferred_free_range().
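
As an illustration of the buddy computation described above, here is a
minimal userspace sketch (not kernel code). The helper names
find_buddy_pfn and buddy_is_past_block are made up for this example; only
the XOR formula comes from the description above.

```c
/* Userspace model of the buddy computation quoted above; not kernel code. */

/* The buddy of a 2^order block differs from it only in bit 'order' of the pfn. */
static unsigned long find_buddy_pfn(unsigned long pfn, unsigned int order)
{
	return pfn ^ (1UL << order);
}

/*
 * Returns 1 when the buddy lies just past the block starting at pfn, i.e.
 * when bit 'order' of pfn is clear and the XOR acts as an addition.
 */
static int buddy_is_past_block(unsigned long pfn, unsigned int order)
{
	return find_buddy_pfn(pfn, order) == pfn + (1UL << order);
}
```

Assuming 512-page blocks (order 9), the block at pfn 1024 has its buddy at
pfn 1536, which is the first page after the block. That is the case in
which the buddy's struct page may not have been initialized yet when the
block is freed.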

Have you measured any negative performance impact with this change?

> Signed-off-by: Pavel Tatashin 

The patch looks good to me otherwise. So if this doesn't introduce a
noticeable overhead, which I hope it doesn't, then feel free to add
Acked-by: Michal Hocko 
-- 
Michal Hocko
SUSE Labs

[PATCH 0/9] Trivial fixup for KVM-based rcutorture test framework

2017-11-03 Thread SeongJae Park

This patchset contains trivial fixups and enhancements for the KVM-based
rcutorture test framework.

SeongJae Park (9):
  rcutorture/configinit: Fix build directory error message
  rcutorture: Remove unused script, config2frag.sh
  rcutorture/kvm.sh: Remove unused variable, `alldone`
  rcutorture/kvm.sh: Use consistent usage for --qemu-args
  rcutorture/kvm.sh: Support execution from any directory
  rcutorture/kvm-recheck-*: Improve result directory readability check
  rcutorture/kvm-build.sh: Skip build directory check
  rcutorture: Simplify logging
  rcutorture: Simplify functions.sh include path

 .../selftests/rcutorture/bin/config2frag.sh| 25 -
 .../testing/selftests/rcutorture/bin/configinit.sh |  2 +-
 .../testing/selftests/rcutorture/bin/kvm-build.sh  |  5 ---
 .../selftests/rcutorture/bin/kvm-recheck-lock.sh   |  2 +-
 .../selftests/rcutorture/bin/kvm-recheck-rcu.sh|  4 +--
 .../rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh   |  2 +-
 .../rcutorture/bin/kvm-recheck-rcuperf.sh  |  4 +--
 .../selftests/rcutorture/bin/kvm-recheck.sh|  2 +-
 .../selftests/rcutorture/bin/kvm-test-1-run.sh |  6 ++--
 tools/testing/selftests/rcutorture/bin/kvm.sh  | 42 +-
 10 files changed, 26 insertions(+), 68 deletions(-)
 delete mode 100755 tools/testing/selftests/rcutorture/bin/config2frag.sh

-- 
2.13.0

[PATCH 3/9] rcutorture/kvm.sh: Remove unused variable, `alldone`

2017-11-03 Thread SeongJae Park

Variable `alldone` is defined but not used within an awk script.  This
commit removes it.

Fixes: 53954671033d ("rcutorture: Do better bin packing")

Signed-off-by: SeongJae Park 
---
 tools/testing/selftests/rcutorture/bin/kvm.sh | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
index b55895fb10ed..0acdfa37c8ab 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
@@ -238,7 +238,6 @@ BEGIN {
 }
 
 END {
-   alldone = 0;
batch = 0;
nc = -1;
 
-- 
2.13.0

[PATCH 8/9] rcutorture: Simplify logging

2017-11-03 Thread SeongJae Park

'kvm.sh' and 'kvm-test-1-run.sh' log each message by printing it to
'stdout' and then writing it into the log file.  The message is therefore
generated twice, once for 'stdout' and once for the log file, which is
redundant.  Moreover, many of the messages contain 'date' output.  Because
'date' is also evaluated twice (once for the stdout print, once for the
log file write), the timestamps in stdout and in the log file can differ,
which could confuse readers comparing the two.

This commit simplifies the logging procedure by using 'tee'.

Signed-off-by: SeongJae Park 
---
 .../selftests/rcutorture/bin/kvm-test-1-run.sh |  4 +--
 tools/testing/selftests/rcutorture/bin/kvm.sh  | 32 --
 2 files changed, 12 insertions(+), 24 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
index 0af36a721b9c..2678a9d0733d 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
@@ -154,9 +154,7 @@ cpu_count=`configfrag_boot_cpus "$boot_args" "$config_template" "$cpu_count"`
 vcpus=`identify_qemu_vcpus`
 if test $cpu_count -gt $vcpus
 then
-   echo CPU count limited from $cpu_count to $vcpus
-   touch $resdir/Warnings
-   echo CPU count limited from $cpu_count to $vcpus >> $resdir/Warnings
+   echo CPU count limited from $cpu_count to $vcpus | tee -a $resdir/Warnings
cpu_count=$vcpus
 fi
 qemu_args="`specify_qemu_cpus "$QEMU" "$qemu_args" "$cpu_count"`"
diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
index cd62933e33d7..a7bbe2dc8791 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
@@ -331,8 +331,7 @@ awk < $T/cfgcpu.pack \
 # Dump out the scripting required to run one test batch.
 function dump(first, pastlast, batchnum)
 {
-   print "echo Start batch " batchnum ": `date`";
-   print "echo Start batch " batchnum ": `date` >> " rd "/log";
+   print "echo Start batch " batchnum ": `date` | tee -a " rd "log";
print "needqemurun="
jn=1
for (j = first; j < pastlast; j++) {
@@ -349,21 +348,18 @@ function dump(first, pastlast, batchnum)
ovf = "-ovf";
else
ovf = "";
-   print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date`";
-   print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` >> " rd "/log";
+   print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` | tee -a " rd "log";
print "rm -f " builddir ".*";
print "touch " builddir ".wait";
print "mkdir " builddir " > /dev/null 2>&1 || :";
print "mkdir " rd cfr[jn] " || :";
print "kvm-test-1-run.sh " CONFIGDIR cf[j], builddir, rd cfr[jn], dur " \"" TORTURE_QEMU_ARG "\" \"" TORTURE_BOOTARGS "\" > " rd cfr[jn] "/kvm-test-1-run.sh.out 2>&1 &"
-   print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date`";
-   print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` >> " rd "/log";
+   print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` | tee -a " rd "log";
print "while test -f " builddir ".wait"
print "do"
print "\tsleep 1"
print "done"
-   print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date`";
-   print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date` >> " rd "/log";
+   print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date` | tee -a " rd "log";
jn++;
}
for (j = 1; j < jn; j++) {
@@ -371,8 +367,7 @@ function dump(first, pastlast, batchnum)
print "rm -f " builddir ".ready"
print "if test -f \"" rd cfr[j] "/builtkernel\""
print "then"
-   print "\techo ", cfr[j], cpusr[j] ovf ": Kernel present. `date`";
-   print "\techo ", cfr[j], cpusr[j] ovf ": Kernel present. `date` >> " rd "/log";
+   print "\techo ", cfr[j], cpusr[j] ovf ": Kernel present. `date` | tee -a " rd "log";
print "\tneedqemurun=1"
print "fi"
}
@@ -386,31 +381,26 @@ function dump(first, pastlast, batchnum)
njitter = ja[1];
if (TORTURE_BUILDONLY && njitter != 0) {
njitter = 0;
-   print "echo Build-only run, so suppressing jitter >> " rd "/log"
+   print "echo Build-only run, so suppressing jitter | tee -a " rd "log"
}
if (TORTURE_BUILDONLY) {
print "needqemurun="
}

[PATCH 9/9] rcutorture: Simplify functions.sh include path

2017-11-03 Thread SeongJae Park

Inclusions of 'functions.sh' from 'kvm-test-1-run.sh' and
'kvm-recheck*.sh' use its full path.  Because the directory containing
'functions.sh' is already in PATH, however, it can be included without
the path.  This commit simplifies the inclusions to use the bare file
name.

Signed-off-by: SeongJae Park 
---
 tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh| 2 +-
 tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh | 2 +-
 tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh| 2 +-
 tools/testing/selftests/rcutorture/bin/kvm-recheck.sh| 2 +-
 tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
index 9e34656bf659..c2e1bb6d0cba 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
@@ -30,7 +30,7 @@ else
echo Unreadable results directory: $i
exit 1
 fi
-. tools/testing/selftests/rcutorture/bin/functions.sh
+. functions.sh
 
 configfile=`echo $i | sed -e 's/^.*\///'`
 ngps=`grep ver: $i/console.log 2> /dev/null | tail -1 | sed -e 's/^.* ver: //' 
-e 's/ .*$//'`
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh
index f79b0e9e84fc..963f71289d22 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh
@@ -26,7 +26,7 @@
 # Authors: Paul E. McKenney 
 
 i="$1"
-. tools/testing/selftests/rcutorture/bin/functions.sh
+. functions.sh
 
 if test "`grep -c 'rcu_exp_grace_period.*start' < $i/console.log`" -lt 100
 then
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
index 6138fd94abfe..ccebf772fa1e 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
@@ -31,7 +31,7 @@ else
exit 1
 fi
 PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH
-. tools/testing/selftests/rcutorture/bin/functions.sh
+. functions.sh
 
 if kvm-recheck-rcuperf-ftrace.sh $i
 then
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
index f659346d3358..f7e988f369dd 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
@@ -25,7 +25,7 @@
 # Authors: Paul E. McKenney 
 
 PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH
-. tools/testing/selftests/rcutorture/bin/functions.sh
+. functions.sh
 for rd in "$@"
 do
firsttime=1
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
index 2678a9d0733d..afdd4630ff88 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
@@ -42,7 +42,7 @@ T=/tmp/kvm-test-1-run.sh.$$
 trap 'rm -rf $T' 0
 mkdir $T
 
-. $KVM/bin/functions.sh
+. functions.sh
 . $CONFIGFRAG/ver_functions.sh
 
 config_template=${1}
-- 
2.13.0

RE: [f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of scan free nid

2017-11-03 Thread Fan Li



> -----Original Message-----
> From: Chao Yu [mailto:yuch...@huawei.com]
> Sent: Friday, November 03, 2017 4:54 PM
> To: Fan Li; 'Chao Yu'; 'Jaegeuk Kim'
> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of scan free nid
> 
> On 2017/11/3 15:31, Fan Li wrote:
> > In current version, we preserve 8 pages of nat blocks as free nids, we
> > build bitmaps for it and use them to allocate nids until its number
> > drops below NAT_ENTRY_PER_BLOCK.
> >
> > After that, we have a problem: scan_free_nid_bits will scan the same
> > 8 pages trying to find more free nids, but in most cases the free nids
> > in these bitmaps are already in the free list, so scanning them won't
> > get us any new nids.
> > Furthermore, after scan_free_nid_bits, the scan is over if
> > nid_cnt[FREE_NID] != 0.
> > As a result, we scan the same pages over and over again, and no new
> > free nids are found until nid_cnt[FREE_NID] == 0. As the number of
> > scanned pages increases, the problem grows worse.
> >
> > This patch marks the range where new free nids could exist and keeps
> > scanning for free nids until nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK.
> > The new variable first_scan_block marks the start of the range; it's
> > initialized with NEW_ADDR, which means all free nids before
> > next_scan_nid are already in the free list. next_scan_nid is used as
> > the end of the range, since all free nids scanned in
> > scan_free_nid_bits must be smaller than next_scan_nid.
> 
> Thinking it over again, IMO, we can add a variable for tracking the total
> count of free nids in the bitmap; if there is no free nid, just skip
> scanning all existing bitmaps.
> 
> And if there are only a few free nids scattered in the bitmap, the cost
> will be limited because we will skip scanning nm_i::free_nid_bitmap if
> nm_i::free_nid_count is zero. Once we find one free nid, let's skip out.
> 
> Since there shouldn't be very heavy CPU overhead when traversing
> nm_i::nat_block_bitmap, I expect the change below could be simpler to
> maintain while having the same effect.
> 
> What do you think?
> 

I think that for this to work, checking total_bitmap_free_nid alone may
not be sufficient.
The problem this patch addresses is that even when all the free nids are
already in the free list, we still scan all the pages.
The scan proceeds once the free nid count is below NAT_ENTRY_PER_BLOCK.
So in most cases there are still free nids in the bitmap during the scan,
and the current code will check every one of them to see whether it is
actually in the free list.
Checking only total_bitmap_free_nid == 0 won't take this overhead away.

I considered a lot of ways to fix this problem before I submitted this
patch. One of my ideas is quite similar to yours, but I use
"if (total_bitmap_free_nid == nm_i->nid_cnt[FREE_NID])" to decide whether
to skip or not.
If you insist, I can submit this simpler one instead, but some follow-up
improvements would be unavailable, for example, using a smaller granularity
for tracking the last-scanned position that we talked about.
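
To illustrate the difference between the two skip conditions, here is a
small userspace sketch. The struct and function names are hypothetical and
only model the two counters mentioned in this thread; this is not the f2fs
code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two counters discussed; not struct f2fs_nm_info. */
struct nid_counters {
	unsigned int total_bitmap_free_nid; /* free nids recorded in the bitmaps */
	unsigned int free_list_nid;         /* models nid_cnt[FREE_NID] */
};

/* Skip the scan only when the bitmaps hold no free nid at all. */
static bool skip_if_bitmap_empty(const struct nid_counters *c)
{
	return c->total_bitmap_free_nid == 0;
}

/* Skip the scan when every free nid in the bitmaps is already on the free list. */
static bool skip_if_all_listed(const struct nid_counters *c)
{
	return c->total_bitmap_free_nid == c->free_list_nid;
}
```

With 100 free nids in the bitmaps and all 100 already on the free list, the
first check still lets the scan run, while the second one skips it. That is
the case described above where the scan cannot find anything new.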

I know I can sometimes be obsessed with performance; I usually
choose the faster way over the simpler one. If you think it's too much,
please tell me, I'm sure we can find some middle ground.

Thank you


> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index cb3f10bc8723..238d95e89dec 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -729,6 +729,7 @@ struct f2fs_nm_info {
>   unsigned char (*free_nid_bitmap)[NAT_ENTRY_BITMAP_SIZE];
>   unsigned char *nat_block_bitmap;
>   unsigned short *free_nid_count; /* free nid count of NAT block */
> + unsigned int total_bitmap_free_nid; /* total free nid count in bitmap */
> 
>   /* for checkpoint */
>   char *nat_bitmap;   /* NAT bitmap pointer */
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index fef5c68886b1..e4861908a396 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1911,10 +1911,13 @@ static void update_free_nid_bitmap(struct f2fs_sb_info *sbi, nid_t nid,
>   else
>   __clear_bit_le(nid_ofs, nm_i->free_nid_bitmap[nat_ofs]);
> 
> - if (set)
> + if (set) {
>   nm_i->free_nid_count[nat_ofs]++;
> - else if (!build)
> + nm_i->total_bitmap_free_nid++;
> + } else if (!build) {
>   nm_i->free_nid_count[nat_ofs]--;
> + nm_i->total_bitmap_free_nid--;
> + }
>  }
> 
>  static void scan_nat_page(struct f2fs_sb_info *sbi,
> @@ -1958,6 +1961,9 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
> 
>   down_read(&nm_i->nat_tree_lock);
> 
> + if (!nm_i->total_bitmap_free_nid)
> + goto out;
> +
>   for (i = 0; i < nm_i->nat_blocks; i++) {
>   if (!test_bit_le(i, nm_i->nat_block_bitmap))
>   continue;
> @@ -1972,7 +1978,7 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
>   nid = i * NAT_ENTRY_PER_BLOCK + idx;

Re: [PATCH v3 2/3] usb: xhci: Add DbC support in xHCI driver

2017-11-03 Thread Felipe Balbi


Hi,

Greg Kroah-Hartman  writes:
>>  > >> Greg Kroah-Hartman  writes:
>> >>> > >> >> > xHCI compatible USB host controllers(i.e. super-speed USB3 
>> >>> > >> >> > controllers)
>> >>> > >> >> > can be implemented with the Debug Capability(DbC). It 
>> >>> > >> >> > presents a debug
>> >>> > >> >> > device which is fully compliant with the USB framework and 
>> >>> > >> >> > provides the
>> >>> > >> >> > equivalent of a very high performance full-duplex serial 
>> >>> > >> >> > link. The debug
>> >>> > >> >> > capability operation model and registers interface are 
>> >>> > >> >> > defined in 7.6.8
>> >>> > >> >> > of the xHCI specification, revision 1.1.
>> >>> > >> >> >
>> >>> > >> >> > The DbC debug device shares a root port with the xHCI 
>> >>> > >> >> > host. By default,
>> >>> > >> >> > the debug capability is disabled and the root port is 
>> >>> > >> >> > assigned to xHCI.
>> >>> > >> >> > When the DbC is enabled, the root port will be assigned to 
>> >>> > >> >> > the DbC debug
>> >>> > >> >> > device, and the xHCI sees nothing on this port. This 
>> >>> > >> >> > implementation uses
>> >>> > >> >> > a sysfs node named  under the xHCI device to manage 
>> >>> > >> >> > the enabling
>> >>> > >> >> > and disabling of the debug capability.
>> >>> > >> >> >
>> >>> > >> >> > When the debug capability is enabled, it will present a 
>> >>> > >> >> > debug device
>> >>> > >> >> > through the debug port. This debug device is fully 
>> >>> > >> >> > compliant with the
>> >>> > >> >> > USB3 framework, and it can be enumerated by a debug host 
>> >>> > >> >> > on the other
>> >>> > >> >> > end of the USB link. As soon as the debug device is 
>> >>> > >> >> > configured, a TTY
>> >>> > >> >> > serial device named /dev/ttyDBC0 will be created.
>> >>> > >> >> >
>> >>> > >> >> > One use of this link is running a login service on the 
>> >>> > >> >> > debug target.
>> >>> > >> >> > Hence it can be remote accessed by a debug host. Another 
>> >>> > >> >> > use case can
>> >>> > >> >> > probably be found in servers. It provides a peer-to-peer 
>> >>> > >> >> > USB link
>> >>> > >> >> > between two host-only machines. This provides a reasonable 
>> >>> > >> >> > out-of-band
>> >>> > >> >> > communication method between two servers.
>> >>> > >> >> >
>> >>> > >> >> > Signed-off-by: Lu Baolu 
>> >>> > >> >> > ---
>> >>> > >> >> >  .../ABI/testing/sysfs-bus-pci-drivers-xhci_hcd |   25 
>> >>> > >> >> > +
>> >>> > >> >> >  drivers/usb/host/Kconfig   |9 
>> >>> > >> >> > +
>> >>> > >> >> >  drivers/usb/host/Makefile  |5 
>> >>> > >> >> > +
>> >>> > >> >> >  drivers/usb/host/xhci-dbgcap.c | 1016 
>> >>> > >> >> > 
>> >>> > >> >> >  drivers/usb/host/xhci-dbgcap.h |  247 
>> >>> > >> >> > +
>> >>> > >> >> >  drivers/usb/host/xhci-dbgtty.c |  586 
>> >>> > >> >> > +++
>> >>> > >> >> >  drivers/usb/host/xhci-trace.h  |   60 
>> >>> > >> >> > ++
>> >>> > >> >> >  drivers/usb/host/xhci.c|   10 
>> >>> > >> >> > +
>> >>> > >> >> >  drivers/usb/host/xhci.h|1 
>> >>> > >> >> > +
>> >>> > >> >> >  9 files changed, 1959 insertions(+)
>> >>> > >> >> >  create mode 100644 
>> >>> > >> >> > Documentation/ABI/testing/sysfs-bus-pci-drivers-xhci_hcd
>> >>> > >> >> >  create mode 100644 drivers/usb/host/xhci-dbgcap.c
>> >>> > >> >> >  create mode 100644 drivers/usb/host/xhci-dbgcap.h
>> >>> > >> >> >  create mode 100644 drivers/usb/host/xhci-dbgtty.c
>> >>> > >> >> >
>> >> > >> >> 
>> >> > >> >> [snip]
>> >> > >> >> 
>> >>> > >> >> > +#define DBC_VENDOR_ID 0x1d6b  /* 
>> >>> > >> >> > Linux Foundation 0x1d6b */
>> >>> > >> >> > +#define DBC_PRODUCT_ID0x0004  /* 
>> >>> > >> >> > device 0004 */
>> >>> > >> >> >
>> >> > >> >> 
>> >> > >> >> The DbC (xHCI DeBug Capability) is an optional functionality 
>> >> > >> >> in
>> >> > >> >> some xHCI host controllers. It will present a super-speed 
>> >> > >> >> debug
>> >> > >> >> device through the debug port after it is enabled.
>> >> > >> >> 
>> >> > >> >> The DbC register set defines an interface for system software
>> >> > >> >> to specify the vendor id and product id of the debug device.
>> >> > >> >> These two values will be presented by the debug device in its
>> >> > >> >> device descriptor idVendor and idProduct fields.
>> >> > >> >> 
>> >> > >> >> Microsoft Windows have a well

Re: [PATCH v2] USB: add SPDX identifiers to all remaining files in drivers/usb/

2017-11-03 Thread Felipe Balbi

Greg Kroah-Hartman  writes:

> It's good to have SPDX identifiers in all files to make it easier to
> audit the kernel tree for correct licenses.
>
> Update the drivers/usb/ and include/linux/usb* files with the correct
> SPDX license identifier based on the license text in the file itself.
> The SPDX identifier is a legally binding shorthand, which can be used
> instead of the full boiler plate text.
>
> This work is based on a script and data from Thomas Gleixner, Philippe
> Ombredanne, and Kate Stewart.
>
> Cc: Thomas Gleixner 
> Cc: Kate Stewart 
> Cc: Philippe Ombredanne 
> Signed-off-by: Greg Kroah-Hartman 

Acked-by: Felipe Balbi 

-- 
balbi



Re: [PATCH v9 1/8] perf: Export perf_event_update_userpage

2017-11-03 Thread Mark Rutland

[+ Ingo]

On Tue, Oct 31, 2017 at 05:23:11PM +, Suzuki K Poulose wrote:
> Export perf_event_update_userpage() so that PMU driver using them,
> can be built as modules.

Peter, Ingo, are you happy with this?

It would be useful if we could take this via the arm64 tree with the
rest of the series.

Thanks,
Mark.

> Cc: Peter Zilstra 
> Signed-off-by: Suzuki K Poulose 
> ---
>  kernel/events/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 9d93db81fa36..550015829db8 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4982,6 +4982,7 @@ void perf_event_update_userpage(struct perf_event 
> *event)
>  unlock:
>   rcu_read_unlock();
>  }
> +EXPORT_SYMBOL_GPL(perf_event_update_userpage);
>  
>  static int perf_mmap_fault(struct vm_fault *vmf)
>  {
> -- 
> 2.13.6
>

[PATCH v5 2/2] watchdog: Add Spreadtrum watchdog driver

2017-11-03 Thread Eric Long

This patch adds the watchdog driver for Spreadtrum SC9860 platform.

Signed-off-by: Eric Long 
---
Changes since v4:
 - Remove sprd_wdt_remove().
 - Add devm_add_action() for sprd_wdt_disable().

Changes since v3:
 - Update Kconfig SPRD_WATCHDOG help messages.
 - Correct the misspelled words.
 - Rename SPRD_WDT_CNT_HIGH_VALUE as SPRD_WDT_CNT_HIGH_SHIFT.
 - Remove unused macro.
 - Update sprd_wdt_set_pretimeout() api.
 - Add wdt->wdd.timeout default value.
 - Use devm_watchdog_register_device() to register wdt device.
 - If module does not support NOWAYOUT, disable wdt when remove this driver.
 - Call sprd_wdt_disable() every wdt suspend.

Changes since v2:
 - Rename all the macros, add SPRD tag at the head of the macro names.
 - Rename SPRD_WDT_CLK as SPRD_WTC_CNT_STEP.
 - Remove the code which check timeout value at the wrong place.
 - Add min/max timeout value limit.
 - Remove set WDOG_HW_RUNNING status at sprd_wdt_enable().
 - Add timeout/pretimeout judgment when set them.
 - Support WATCHDOG_NOWAYOUT status.

Changes since v1:
 - Use pretimeout instead of own implementation.
 - Fix timeout loop when loading timeout values.
 - use the infrastructure to read and set "timeout-sec" property.
 - Add conditions when start or stop watchdog.
 - Change the position of enabling watchdog.
 - Other optimization.
---
 drivers/watchdog/Kconfig|   8 +
 drivers/watchdog/Makefile   |   1 +
 drivers/watchdog/sprd_wdt.c | 407 
 3 files changed, 416 insertions(+)
 create mode 100644 drivers/watchdog/sprd_wdt.c

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index c722cbf..3367a8c 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -787,6 +787,14 @@ config UNIPHIER_WATCHDOG
  To compile this driver as a module, choose M here: the
  module will be called uniphier_wdt.
 
+config SPRD_WATCHDOG
+   tristate "Spreadtrum watchdog support"
+   depends on ARCH_SPRD || COMPILE_TEST
+   select WATCHDOG_CORE
+   help
+ Say Y here to include watchdog timer supported
+ by Spreadtrum system.
+
 # AVR32 Architecture
 
 config AT32AP700X_WDT
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index 56adf9f..187cca2 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -87,6 +87,7 @@ obj-$(CONFIG_ASPEED_WATCHDOG) += aspeed_wdt.o
 obj-$(CONFIG_ZX2967_WATCHDOG) += zx2967_wdt.o
 obj-$(CONFIG_STM32_WATCHDOG) += stm32_iwdg.o
 obj-$(CONFIG_UNIPHIER_WATCHDOG) += uniphier_wdt.o
+obj-$(CONFIG_SPRD_WATCHDOG) += sprd_wdt.o
 
 # AVR32 Architecture
 obj-$(CONFIG_AT32AP700X_WDT) += at32ap700x_wdt.o
diff --git a/drivers/watchdog/sprd_wdt.c b/drivers/watchdog/sprd_wdt.c
new file mode 100644
index 000..88f6f84
--- /dev/null
+++ b/drivers/watchdog/sprd_wdt.c
@@ -0,0 +1,407 @@
+/*
+ * Spreadtrum watchdog driver
+ * Copyright (C) 2017 Spreadtrum - http://www.spreadtrum.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SPRD_WDT_LOAD_LOW  0x0
+#define SPRD_WDT_LOAD_HIGH 0x4
+#define SPRD_WDT_CTRL  0x8
+#define SPRD_WDT_INT_CLR   0xc
+#define SPRD_WDT_INT_RAW   0x10
+#define SPRD_WDT_INT_MSK   0x14
+#define SPRD_WDT_CNT_LOW   0x18
+#define SPRD_WDT_CNT_HIGH  0x1c
+#define SPRD_WDT_LOCK  0x20
+#define SPRD_WDT_IRQ_LOAD_LOW  0x2c
+#define SPRD_WDT_IRQ_LOAD_HIGH 0x30
+
+/* WDT_CTRL */
+#define SPRD_WDT_INT_EN_BITBIT(0)
+#define SPRD_WDT_CNT_EN_BITBIT(1)
+#define SPRD_WDT_NEW_VER_ENBIT(2)
+#define SPRD_WDT_RST_EN_BITBIT(3)
+
+/* WDT_INT_CLR */
+#define SPRD_WDT_INT_CLEAR_BIT BIT(0)
+#define SPRD_WDT_RST_CLEAR_BIT BIT(3)
+
+/* WDT_INT_RAW */
+#define SPRD_WDT_INT_RAW_BIT   BIT(0)
+#define SPRD_WDT_RST_RAW_BIT   BIT(3)
+#define SPRD_WDT_LD_BUSY_BIT   BIT(4)
+
+/* 1s equal to 32768 counter steps */
+#define SPRD_WDT_CNT_STEP  32768
+
+#define SPRD_WDT_UNLOCK_KEY0xe551
+#define SPRD_WDT_MIN_TIMEOUT   3
+#define SPRD_WDT_MAX_TIMEOUT   60
+
+#define SPRD_WDT_CNT_HIGH_SHIFT16
+#define SPRD_WDT_LOW_VALUE_MASKGENMASK(15, 0)
+#define SPRD_WDT_LOAD_TIMEOUT  1000
+
+struct sprd_wdt {
+   void __iomem *base;
+   struct watchdog_device
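
The rest of the driver is cut off in this archive. As a rough user-space
sketch of how the constants above could fit together (the helper name and
the split into two 16-bit halves are assumptions based on the LOAD_LOW /
LOAD_HIGH register names, not code taken from the patch), a timeout in
seconds can be turned into load-register values like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants copied from the patch above. */
#define SPRD_WDT_CNT_STEP        32768u   /* counter steps per second */
#define SPRD_WDT_MIN_TIMEOUT     3u
#define SPRD_WDT_MAX_TIMEOUT     60u
#define SPRD_WDT_CNT_HIGH_SHIFT  16
#define SPRD_WDT_LOW_VALUE_MASK  0xffffu  /* GENMASK(15, 0) */

/* Hypothetical container for the two halves written to LOAD_HIGH/LOAD_LOW. */
struct sprd_wdt_load {
	uint32_t high;
	uint32_t low;
};

/*
 * Hypothetical helper: convert a timeout in seconds into the two 16-bit
 * halves of the step count. Returns 0 on success, -1 if the timeout is
 * outside the min/max limits defined by the patch.
 */
static int sprd_wdt_timeout_to_load(uint32_t timeout_sec,
				    struct sprd_wdt_load *out)
{
	uint32_t steps;

	assert(out != NULL);

	if (timeout_sec < SPRD_WDT_MIN_TIMEOUT ||
	    timeout_sec > SPRD_WDT_MAX_TIMEOUT)
		return -1;

	steps = timeout_sec * SPRD_WDT_CNT_STEP;
	out->high = (steps >> SPRD_WDT_CNT_HIGH_SHIFT) & SPRD_WDT_LOW_VALUE_MASK;
	out->low = steps & SPRD_WDT_LOW_VALUE_MASK;
	return 0;
}
```

With the 60 second maximum the step count is 60 * 32768 = 0x1E0000, so it
fits comfortably in the two 16-bit halves.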

[PATCH v5 1/2] dt-bindings: watchdog: Add Spreadtrum watchdog documentation

2017-11-03 Thread Eric Long

This patch adds the documentation for Spreadtrum watchdog driver.

Signed-off-by: Eric Long 
Acked-by: Rob Herring 
---
Changes since v4:
- No updates.

Changes since v3:
- No updates.

Changes since v2:
- Add acked tag from Rob.

Changes since v1:
- No updates.
---
 .../devicetree/bindings/watchdog/sprd-wdt.txt | 19 +++
 1 file changed, 19 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/watchdog/sprd-wdt.txt

diff --git a/Documentation/devicetree/bindings/watchdog/sprd-wdt.txt 
b/Documentation/devicetree/bindings/watchdog/sprd-wdt.txt
new file mode 100644
index 000..aeaf3e0
--- /dev/null
+++ b/Documentation/devicetree/bindings/watchdog/sprd-wdt.txt
@@ -0,0 +1,19 @@
+Spreadtrum SoCs Watchdog timer
+
+Required properties:
+- compatible : Should be "sprd,sp9860-wdt".
+- reg : Specifies base physical address and size of the registers.
+- interrupts : Exactly one interrupt specifier.
+- timeout-sec : Contain the default watchdog timeout in seconds.
+- clock-names : Contain the input clock names.
+- clocks : Phandles to input clocks.
+
+Example:
+   watchdog: watchdog@4031 {
+   compatible = "sprd,sp9860-wdt";
+   reg = <0 0x4031 0 0x1000>;
+   interrupts = ;
+   timeout-sec = <12>;
+   clock-names = "enable", "rtc_enable";
+   clocks = <_aon_apb_gates1 8>, <_aon_apb_rtc_gates 9>;
+   };
-- 
1.9.1

perf/x86/amd/power missing pmu::module initialisation

2017-11-03 Thread Mark Rutland

Hi,

As a heads-up, I believe that arch/x86/events/amd/power.c is currently
missing initialisation of pmu::module, which could result in problems
when the module is unloaded.

We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT
the AMD power PMU is the only other PMU affected.

Other modular PMUs all initialise this to THIS_MODULE prior to calling
perf_pmu_register().

Thanks,
Mark.

Applied "regmap: Add a config option for hwspinlock" to the regmap tree

2017-11-03 Thread Mark Brown

The patch

   regmap: Add a config option for hwspinlock

has been applied to the regmap tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From f25637a6b89e59eddf79f6df39b23e202753f555 Mon Sep 17 00:00:00 2001
From: Mark Brown 
Date: Fri, 3 Nov 2017 12:38:04 +0100
Subject: [PATCH] regmap: Add a config option for hwspinlock

Unlike other lock types hwspinlocks are optional and can be built
modular so we can't use them unconditionally in regmap so add a config
option that drivers that want to use hwspinlocks with regmap can select
which will ensure that hwspinlock is built in.

Signed-off-by: Mark Brown 
---
 drivers/base/regmap/Kconfig  | 4 
 drivers/base/regmap/regmap.c | 7 +++
 2 files changed, 11 insertions(+)

diff --git a/drivers/base/regmap/Kconfig b/drivers/base/regmap/Kconfig
index 073c0b77e5b3..2d5e849f79c9 100644
--- a/drivers/base/regmap/Kconfig
+++ b/drivers/base/regmap/Kconfig
@@ -5,6 +5,7 @@
 config REGMAP
default y if (REGMAP_I2C || REGMAP_SPI || REGMAP_SPMI || REGMAP_W1 || 
REGMAP_AC97 || REGMAP_MMIO || REGMAP_IRQ)
select IRQ_DOMAIN if REGMAP_IRQ
+   select HWSPINLOCK if REGMAP_HWSPINLOCK
bool
 
 config REGCACHE_COMPRESSED
@@ -36,3 +37,6 @@ config REGMAP_MMIO
 
 config REGMAP_IRQ
bool
+
+config REGMAP_HWSPINLOCK
+   bool
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 999e981a174a..ff6ef6a579c6 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -414,6 +414,7 @@ static unsigned int regmap_parse_64_native(const void *buf)
 }
 #endif
 
+#ifdef REGMAP_HWSPINLOCK
 static void regmap_lock_hwlock(void *__map)
 {
struct regmap *map = __map;
@@ -456,6 +457,7 @@ static void regmap_unlock_hwlock_irqrestore(void *__map)
 
hwspin_unlock_irqrestore(map->hwlock, &map->spinlock_flags);
 }
+#endif
 
 static void regmap_lock_mutex(void *__map)
 {
@@ -672,6 +674,7 @@ struct regmap *__regmap_init(struct device *dev,
map->unlock = config->unlock;
map->lock_arg = config->lock_arg;
} else if (config->hwlock_id) {
+#ifdef REGMAP_HWSPINLOCK
map->hwlock = hwspin_lock_request_specific(config->hwlock_id);
if (!map->hwlock) {
ret = -ENXIO;
@@ -694,6 +697,10 @@ struct regmap *__regmap_init(struct device *dev,
}
 
map->lock_arg = map;
+#else
+   ret = -EINVAL;
+   goto err;
+#endif
} else {
if ((bus && bus->fast_io) ||
config->fast_io) {
-- 
2.14.1

Re: [PATCH v2 2/3] Sony-laptop: Delete an unnecessary variable initialisation in sony_nc_setup_rfkill()

2017-11-03 Thread Andy Shevchenko

On Wed, Nov 1, 2017 at 8:47 PM, SF Markus Elfring
 wrote:
> From: Markus Elfring 
> Date: Wed, 1 Nov 2017 19:00:59 +0100
>
> The local variable "err" will eventually be set to an appropriate value
> a bit later. Thus omit the explicit initialisation at the beginning.

Applied to my review and testing queue, thanks!

>
> Signed-off-by: Markus Elfring 
> ---
>  drivers/platform/x86/sony-laptop.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/platform/x86/sony-laptop.c 
> b/drivers/platform/x86/sony-laptop.c
> index 4332cc982ce0..62aa2c37b8d2 100644
> --- a/drivers/platform/x86/sony-laptop.c
> +++ b/drivers/platform/x86/sony-laptop.c
> @@ -1627,7 +1627,7 @@ static const struct rfkill_ops sony_rfkill_ops = {
>  static int sony_nc_setup_rfkill(struct acpi_device *device,
> enum sony_nc_rfkill nc_type)
>  {
> -   int err = 0;
> +   int err;
> struct rfkill *rfk;
> enum rfkill_type type;
> const char *name;
> --
> 2.14.3
>



-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH 1/3] printk: Introduce per-console loglevel setting

2017-11-03 Thread Petr Mladek

On Thu 2017-09-28 17:43:55, Calvin Owens wrote:
> This patch introduces a new per-console loglevel setting, and changes
> console_unlock() to use max(global_level, per_console_level) when
> deciding whether or not to emit a given log message.

> diff --git a/include/linux/console.h b/include/linux/console.h
> index b8920a0..a5b5d79 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -147,6 +147,7 @@ struct console {
>   int cflag;
>   void*data;
>   struct   console *next;
> + int level;

I would make the meaning more clear and call this min_loglevel.

>  };
>  
>  /*
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 512f7c2..3f1675e 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1141,9 +1141,14 @@ module_param(ignore_loglevel, bool, S_IRUGO | S_IWUSR);
>  MODULE_PARM_DESC(ignore_loglevel,
>"ignore loglevel setting (prints all kernel messages to the 
> console)");
>  
> -static bool suppress_message_printing(int level)
> +static int effective_loglevel(struct console *con)
>  {
> - return (level >= console_loglevel && !ignore_loglevel);
> + return max(console_loglevel, con ? con->level : LOGLEVEL_EMERG);
> +}
> +
> +static bool suppress_message_printing(int level, struct console *con)
> +{
> + return (level >= effective_loglevel(con) && !ignore_loglevel);
>  }

We need to be more careful here:

First, there is one ugly level called CONSOLE_LOGLEVEL_SILENT. Fortunately,
it is used only by vkdb_printf(). I guess that the purpose is to store
messages into the log buffer and do not show them on consoles.

It is a hack and it is racy. It would hide the messages only when the
console_lock() is not already taken. Similar hack is used on more
locations, e.g. in __handle_sysrq() and these are racy as well.
We need to come up with something better in the future but this
is a task for another patchset.

Second, these functions are called with NULL when we need to take
all usable consoles into account. You simplified it by ignoring
the per-console setting. But it is not correct. For example,
you might need to delay the printing in boot_delay_msec()
also on the fast console. Also this was the reason to remove
one optimization in console_unlock().

I thought about a reasonable solution and came up with something like:

static bool suppress_message_printing(int level, struct console *con)
{
	int callable_loglevel;

	if (ignore_loglevel || console_loglevel == CONSOLE_LOGLEVEL_MOTORMOUTH)
		return false;

	/* Make silent even fast consoles. */
	if (console_loglevel == CONSOLE_LOGLEVEL_SILENT)
		return true;

	if (con)
		callable_loglevel = con->min_loglevel;
	else
		callable_loglevel = max_custom_console_loglevel;

	/* Global setting might make all consoles more verbose. */
	if (callable_loglevel < console_loglevel)
		callable_loglevel = console_loglevel;

	return level >= callable_loglevel;
}

Yes, it is complicated. But the logic is complicated. IMHO, this has
the advantage that we do most of the decisions on a single place
and it might be easier to get the picture.

Anyway, max_custom_console_loglevel would be a global variable
defined as:

/*
 * Minimum loglevel of the most talkative registered console.
 * It is a maximum of all registered con->min_loglevel values.
 */
static int max_custom_console_loglevel = LOGLEVEL_EMERG;

The value should get updated when any console is registered
and when a registered console is manipulated. It means in
register_console(), unregister_console(), and the sysfs
write callbacks.
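
To make the bookkeeping concrete, here is a small user-space model of the
proposal: a cached maximum that is recomputed whenever the console list
changes, plus the suppression check that consults it when no specific
console is given. The model_* helpers and the trimmed struct console are
hypothetical stand-ins, not kernel code, and the loglevel constants are
assumed to match the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match the kernel's values; treat them as placeholders. */
#define LOGLEVEL_EMERG               0
#define CONSOLE_LOGLEVEL_SILENT      0
#define CONSOLE_LOGLEVEL_MOTORMOUTH  15

/* Stand-in for struct console; only the proposed field is modelled. */
struct console {
	int min_loglevel;
	struct console *next;
};

static int console_loglevel = 7;
static bool ignore_loglevel;
static struct console *console_drivers;

/* Maximum of all registered con->min_loglevel values. */
static int max_custom_console_loglevel = LOGLEVEL_EMERG;

/* Recompute the cached maximum after any list or level change. */
static void update_max_custom_console_loglevel(void)
{
	int max = LOGLEVEL_EMERG;
	struct console *con;

	for (con = console_drivers; con; con = con->next)
		if (con->min_loglevel > max)
			max = con->min_loglevel;
	max_custom_console_loglevel = max;
}

/* Hypothetical counterpart of register_console(). */
static void model_register_console(struct console *con)
{
	assert(con != NULL);
	con->next = console_drivers;
	console_drivers = con;
	update_max_custom_console_loglevel();
}

/* Hypothetical counterpart of unregister_console(). */
static void model_unregister_console(struct console *con)
{
	struct console **pp;

	for (pp = &console_drivers; *pp; pp = &(*pp)->next) {
		if (*pp == con) {
			*pp = con->next;
			con->next = NULL;
			break;
		}
	}
	update_max_custom_console_loglevel();
}

/* The check from the mail above; con == NULL means "any console". */
static bool suppress_message_printing(int level, const struct console *con)
{
	int callable_loglevel;

	if (ignore_loglevel || console_loglevel == CONSOLE_LOGLEVEL_MOTORMOUTH)
		return false;

	/* Make silent even fast consoles. */
	if (console_loglevel == CONSOLE_LOGLEVEL_SILENT)
		return true;

	callable_loglevel = con ? con->min_loglevel
				: max_custom_console_loglevel;

	/* Global setting might make all consoles more verbose. */
	if (callable_loglevel < console_loglevel)
		callable_loglevel = console_loglevel;

	return level >= callable_loglevel;
}
```

A sysfs write callback would change con->min_loglevel and then call the
same update helper, so the NULL case always sees the most talkative
registered console.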

>  #ifdef CONFIG_BOOT_PRINTK_DELAY
> @@ -2199,22 +2205,11 @@ void console_unlock(void)
>   } else {
>   len = 0;
>   }
> -skip:
> +
>   if (console_seq == log_next_seq)
>   break;
>  
>   msg = log_from_idx(console_idx);
> - if (suppress_message_printing(msg->level)) {
> - /*
> -  * Skip record we have buffered and already printed
> -  * directly to the console when we received it, and
> -  * record that has level above the console loglevel.
> -  */
> - console_idx = log_next(console_idx);
> - console_seq++;
> - goto skip;
> - }

I would like to keep this code. It does not make sense to prepare the
text buffer if it won't be used at all. It would work with the change
that I proposed above.

>   len += msg_print_text(msg, false, text + len, sizeof(text) - 
> len);
>   if (nr_ext_console_drivers) {
>   ext_len = msg_print_ext_header(ext_text,
> @@ -2230,7 +2225,7 @@ void console_unlock(void)
>   raw_spin_unlock(_lock);
>  
>

Re: [PATCH] ARM: sun7i: Add Cubietech Einstein A20 board device-tree

2017-11-03 Thread Frank Kunz

Am Freitag, 3. November 2017, 09:44:27 CET schrieb Maxime Ripard:
> Hi Frank,
> 
> On Thu, Nov 02, 2017 at 07:59:04PM +0100, Frank Kunz wrote:
> > > > + {
> > > > +   pinctrl-names = "default";
> > > > +   pinctrl-0 = <_pins_a>;
> > > > +   vmmc-supply = <_vcc3v3>;
> > > 
> > > What regulator is this connected to on the PMIC?
> > 
> > Fixed, On the schematic it is finally connected to 3v0.
> 
> Sorry, it wasn't really what I meant. Surely there's a PMIC regulator
> providing that 3.0V, you should put that regulator here.
> 

It is not connected to any of the PMIC (AXP209) regulators. There is a 
dedicated DCDC converter on that board which is driven by IPSOUT. This one 
generates the 3.0V IO supply.

Br,
Frank

Re: KASAN: stack-out-of-bounds Read in xfrm_state_find (2)

2017-11-03 Thread Steffen Klassert

On Thu, Nov 02, 2017 at 01:25:28PM +0100, Florian Westphal wrote:
> Steffen Klassert  wrote:
> 
> > I'd propose to use the addresses from the template unconditionally,
> > like the (untested) patch below does.
> > 
> > Unfortunalely the reproducer does not work with my config,
> > sendto returns EAGAIN. Could anybody try this patch?
> 
> The reproducer no longer causes KASAN spew with your patch,
> but i don't have a test case that actually creates/uses a tunnel.

The patch passed my standard tests, so I tend to apply it
after a day in the ipsec/testing branch.

Re: [PATCH v9 8/8] perf: ARM DynamIQ Shared Unit PMU support

2017-11-03 Thread Mark Rutland

Hi Suzuki,

This looks good, but there are a couple of edge cases I think that we
need to handle, as noted below.

On Tue, Oct 31, 2017 at 05:23:18PM +, Suzuki K Poulose wrote:
> Changes since V8:

>  - Fill in the "module" field for the PMU to prevent the module unload
>when the PMU is active.

Huh. For some reason I thought that was done automatically, but having
looked, I see that it is not.

It looks like this is missing from the SPE PMU, and the CCN PMU. Would
you mind fixing those up?

The only other PMU that I see affected is the AMD power PMU; I've pinged
the maintainer separately.

[...]

> +The driver also exposes the CPUs connected to the DSU instance in 
> "associated_cpus".

Just to check, is there a user of this?

I agree that it could be useful, but AFAICT the perf tool won't look at
this, so it seems odd to expose it. I'd feel happier punting on exposing
that so that we can settle on a common name for this across
uncore/system PMUs.

[...]

> +static void dsu_pmu_probe_pmu(void *data)
> +{

> + /* We can only support upto 31 independent counters */

Nit: s/upto/up to/

[...]

> +static void dsu_pmu_init_pmu(struct dsu_pmu *dsu_pmu)
> +{
> + int cpu, rc;
> +
> + cpu = dsu_pmu_get_online_cpu(dsu_pmu);
> + /* Defer, if we don't have any active CPUs in the DSU */
> + if (cpu >= nr_cpu_ids)
> + return;
> + rc = smp_call_function_single(cpu, dsu_pmu_probe_pmu, dsu_pmu, 1);
> + if (rc)
> + return;
> + /* Reset the interrupt overflow mask */
> + dsu_pmu_get_reset_overflow();
> + dsu_pmu_set_active_cpu(cpu, dsu_pmu);
> +}

I think this can be simplified by only calling this in the hotplug
callback, and not doing the cross-call at all at driver init time. That
way, we can do:

static void dsu_pmu_init_pmu(struct dsu_pmu *dsu_pmu)
{
	if (dsu_pmu->num_counters == -1)
		dsu_pmu_probe_pmu(dsu_pmu);

	dsu_pmu_get_reset_overflow();
}

... which also means we can simplify the prototype of
dsu_pmu_probe_pmu().

Note that the dsu_pmu_set_active_cpu() can be factored out to the
caller, which is a little clearer, as I suggest below.

> +static int dsu_pmu_device_probe(struct platform_device *pdev)
> +{

> + /*
> +  * We could defer probing the PMU details from the registers until
> +  * an associated CPU is online.
> +  */
> + dsu_pmu_init_pmu(dsu_pmu);

... then we can drop this line ...

> + platform_set_drvdata(pdev, dsu_pmu);
> + rc = cpuhp_state_add_instance(dsu_pmu_cpuhp_state,
> + _pmu->cpuhp_node);

... as this should set things up if a CPU is already online.

[...]

> +static int dsu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> +{
> + struct dsu_pmu *dsu_pmu = hlist_entry_safe(node, struct dsu_pmu,
> +cpuhp_node);
> +
> + if (!cpumask_test_cpu(cpu, _pmu->associated_cpus))
> + return 0;
> +
> + /* Initialise the PMU if necessary */
> + if (dsu_pmu->num_counters < 0)
> + dsu_pmu_init_pmu(dsu_pmu);
> + /* Set the active CPU if we don't have one */
> + if (cpumask_empty(_pmu->active_cpu))
> + dsu_pmu_set_active_cpu(cpu, dsu_pmu);
> + return 0;
> +}

I don't think this is quite right, as if we've offlined all the
associated CPUs, the DSU itself may have been powered down, and we'll
want to reset it when it's brought online.

I think we want this to be:

static int dsu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
{
	struct dsu_pmu *dsu_pmu = hlist_entry_safe(node, struct dsu_pmu,
						   cpuhp_node);

	if (!cpumask_test_cpu(cpu, &dsu_pmu->associated_cpus))
		return 0;

	/* If the PMU is already managed, there's nothing to do */
	if (!cpumask_empty(&dsu_pmu->active_cpu))
		return 0;

	/* Reset the PMU, and take ownership */
	dsu_pmu_init_pmu(dsu_pmu);
	dsu_pmu_set_active_cpu(cpu, dsu_pmu);

	return 0;
}
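
The effect of the online callback suggested above can be checked with a
small user-space model. Everything here is a hypothetical stand-in (a
bitmask instead of a cpumask, a counter instead of real register resets,
and a teardown path that is only guessed from the two lines quoted
below); the point is only that the PMU is reset again when the first
associated CPU comes back after all of them went offline.

```c
#include <assert.h>
#include <stddef.h>

#define NO_CPU (-1)

/* Hypothetical user-space stand-in for struct dsu_pmu. */
struct dsu_pmu_model {
	unsigned long associated_cpus;	/* bitmask of CPUs in this DSU */
	int active_cpu;			/* NO_CPU when nobody owns the PMU */
	int num_counters;		/* -1 until probed */
	int reset_count;		/* how many times the PMU was reset */
};

/* Probe once, reset every time (mirrors the simplified init helper). */
static void model_init_pmu(struct dsu_pmu_model *p)
{
	if (p->num_counters == -1)
		p->num_counters = 31;	/* placeholder for the register probe */
	p->reset_count++;
}

/* Online callback: reset and take ownership only if nobody owns the PMU. */
static int model_cpu_online(unsigned int cpu, struct dsu_pmu_model *p)
{
	assert(p != NULL);
	assert(cpu < 8 * sizeof(p->associated_cpus));

	if (!(p->associated_cpus & (1UL << cpu)))
		return 0;

	/* If the PMU is already managed, there's nothing to do */
	if (p->active_cpu != NO_CPU)
		return 0;

	model_init_pmu(p);
	p->active_cpu = (int)cpu;
	return 0;
}

/*
 * Teardown callback (assumed behaviour): hand ownership to another
 * online associated CPU, or leave the PMU unowned if there is none.
 */
static int model_cpu_teardown(unsigned int cpu, struct dsu_pmu_model *p,
			      unsigned long online_cpus)
{
	unsigned long candidates;
	unsigned int dst;

	if (p->active_cpu != (int)cpu)
		return 0;

	p->active_cpu = NO_CPU;
	candidates = p->associated_cpus & online_cpus & ~(1UL << cpu);
	for (dst = 0; dst < 8 * sizeof(candidates); dst++) {
		if (candidates & (1UL << dst)) {
			p->active_cpu = (int)dst;
			break;
		}
	}
	return 0;
}
```

In this model the probe happens once, while the reset runs each time
ownership is taken from the unowned state.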

[...]

> +static int dsu_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
> +{

> + dsu_pmu_set_active_cpu(dst, dsu_pmu);
> + perf_pmu_migrate_context(_pmu->pmu, cpu, dst);

In other PMU drivers, we do the migrate, then set the active CPU. That
shouldn't matter, but for consistency, could we flip these around?

Otherwise, this looks good to me.

With the above changes:

Reviewed-by: Mark Rutland 

Thanks,
Mark.

Re: [PATCH v4 3/3] KVM: MMU: consider host cache mode in MMIO page check

2017-11-03 Thread Haozhong Zhang

On 11/03/17 17:10 +0800, Xiao Guangrong wrote:
> 
> 
> On 11/03/2017 04:51 PM, Haozhong Zhang wrote:
> > On 11/03/17 14:54 +0800, Xiao Guangrong wrote:
> > > 
> > > 
> > > On 11/03/2017 01:53 PM, Haozhong Zhang wrote:
> > > > Some reserved pages, such as those from NVDIMM DAX devices, are
> > > > not for MMIO, and can be mapped with cached memory type for better
> > > > performance. However, the above check misconceives those pages as
> > > > MMIO.  Because KVM maps MMIO pages with UC memory type, the
> > > > performance of guest accesses to those pages would be harmed.
> > > > Therefore, we check the host memory type by lookup_memtype() in
> > > > addition and only treat UC/UC- pages as MMIO.
> > > > 
> > > > Signed-off-by: Haozhong Zhang 
> > > > Reported-by: Cuevas Escareno, Ivan D 
> > > > Reported-by: Kumar, Karthik 
> > > > ---
> > > >arch/x86/kvm/mmu.c | 19 ++-
> > > >1 file changed, 18 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > > index 0b481cc9c725..e9ed0e666a83 100644
> > > > --- a/arch/x86/kvm/mmu.c
> > > > +++ b/arch/x86/kvm/mmu.c
> > > > @@ -2708,7 +2708,24 @@ static bool mmu_need_write_protect(struct 
> > > > kvm_vcpu *vcpu, gfn_t gfn,
> > > >static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
> > > >{
> > > > if (pfn_valid(pfn))
> > > > -   return !is_zero_pfn(pfn) && 
> > > > PageReserved(pfn_to_page(pfn));
> > > > +   return !is_zero_pfn(pfn) && 
> > > > PageReserved(pfn_to_page(pfn)) &&
> > > > +   /*
> > > > +* Some reserved pages, such as those from
> > > > +* NVDIMM DAX devices, are not for MMIO, and
> > > > +* can be mapped with cached memory type for
> > > > +* better performance. However, the above
> > > > +* check misconceives those pages as MMIO.
> > > > +* Because KVM maps MMIO pages with UC memory
> > > > +* type, the performance of guest accesses to
> > > > +* those pages would be harmed. Therefore, we
> > > > +* check the host memory type in addition and
> > > > +* only treat UC/UC- pages as MMIO.
> > > > +*
> > > > +* pat_pfn_is_uc() works only when PAT is 
> > > > enabled,
> > > > +* so check pat_enabled() as well.
> > > > +*/
> > > > +   (!pat_enabled() ||
> > > > +pat_pfn_is_uc(kvm_pfn_t_to_pfn_t(pfn)));
> > > 
> > > Can it be compiled if !CONFIG_PAT?
> > 
> > Yes.
> > 
> > What I check via pat_enabled() is not only whether PAT support is
> > compiled, but also whether PAT is enabled at runtime.
> 
> The issue is about pat_pfn_is_uc() which is implemented only if CONFIG_PAT is
> enabled, but you used it here unconditionally.
> 
> I am not sure if gcc is smart enough to omit pat_pfn_is_uc() completely under
> this case. If you really have done the test to compile kernel and KVM module
> with CONFIG_PAT disabled, it is fine.
> 

I've done the test and it can compile.

arch/x86/mm/Makefile shows pat.c is compiled regardless of CONFIG_X86_PAT,
and pat_pfn_is_uc() is defined out of  #ifdef CONFIG_X86_PAT ... #endif.
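
For readers following the thread, the predicate under discussion can be
modelled in user space as below. The struct and function names are
hypothetical, each field stands in for the kernel helper named in its
comment, and the fallback for pfns without a struct page is an
assumption, since that branch is not part of the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a host page frame; not a kernel structure. */
struct pfn_model {
	bool valid;	/* pfn_valid() */
	bool zero_page;	/* is_zero_pfn() */
	bool reserved;	/* PageReserved() */
	bool uc;	/* pat_pfn_is_uc(): host memtype is UC or UC- */
};

/*
 * Model of the patched check: a reserved page counts as MMIO only if PAT
 * is disabled (no memtype information) or the host maps it uncached.
 */
static bool model_is_mmio_pfn(const struct pfn_model *p, bool pat_enabled)
{
	assert(p != NULL);

	if (p->valid)
		return !p->zero_page && p->reserved &&
		       (!pat_enabled || p->uc);

	/* Assumption: a pfn without a struct page is treated as MMIO. */
	return true;
}
```

Because of the short-circuit on !pat_enabled, the memtype lookup is only
consulted when PAT is enabled at runtime, which is the point made in the
reply above.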

Haozhong

Re: [RFC] EPOLL_KILLME: New flag to epoll_wait() that subscribes process to death row (new syscall)

2017-11-03 Thread peter enderborg

On 11/01/2017 04:22 PM, Colin Walters wrote:
>
> On Wed, Nov 1, 2017, at 11:16 AM, Colin Walters wrote:
>> as the maintainer of glib2 which is used by a *lot* of things; I'm not
> (I meant to say "a" maintainer)
>
> Also, while I'm not an expert in Android, I think the "what to kill" logic
> there lives in userspace, right?   So it feels like we should expose this
> state in e.g. /proc and allow userspace daemons (e.g. systemd, kubelet) to 
> perform
> idle collection too, even if the system isn't actually low on resources
> from the kernel's perspective.
>
> And doing that requires some sort of kill(pid, SIGKILL_IF_IDLE) or so?
>
You are right, in Android it is the activity manager that performs these
tasks. If a service dies without talking to the activity manager, the
service is restarted, unless it is at the highest oom score. Another
problem is that a lot of communication in Android goes through binder,
not epoll.

A signal that cannot be caught is not that good. But a "warn" signal of
userspace's choice, in a context similar to ulimit, might work: something
like SIGXFSZ/SIGXCPU that you can pick up and use to notify the activity
manager.

However, in Android this is already solved with OnTrimMemory, a message
sent from the activity manager to applications, services, etc. when the
system needs memory back.

[PATCH 3/3] nvme: fix eui_show() print format

2017-11-03 Thread Javier González

Signed-off-by: Javier González 
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ae8ab0a1ef0d..f05c81774abf 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2108,7 +2108,7 @@ static ssize_t eui_show(struct device *dev, struct 
device_attribute *attr,
char *buf)
 {
struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
-   return sprintf(buf, "%8phd\n", ns->eui);
+   return sprintf(buf, "%8phD\n", ns->eui);
 }
 static DEVICE_ATTR(eui, S_IRUGO, eui_show, NULL);
 
-- 
2.7.4
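
The patch carries no description, so a short illustration may help. In
the kernel's printk format documentation, "%*ph" prints a small buffer as
hex bytes separated by spaces, while the "C", "D" and "N" suffixes select
a colon, a dash, or no separator; a lower-case "d" is not among them. The
function below is a user-space model of that family (the name and
signature are made up for this sketch) showing what the dash-separated
form of an 8-byte EUI looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * User-space model of the "%*ph" family: format len bytes as two-digit
 * lower-case hex, joined by sep (' ' for %*ph, '-' for %*phD, ':' for
 * %*phC). Returns the number of characters written, excluding the
 * terminating NUL, or -1 if the output buffer is too small.
 */
static int hex_dump_sep(char *out, size_t out_len,
			const unsigned char *buf, size_t len, char sep)
{
	size_t need = len ? len * 3 - 1 : 0;
	size_t i, pos = 0;

	assert(out != NULL && buf != NULL);

	if (out_len < need + 1)
		return -1;

	for (i = 0; i < len; i++) {
		if (i)
			out[pos++] = sep;
		pos += (size_t)snprintf(out + pos, out_len - pos,
					"%02x", buf[i]);
	}
	out[pos] = '\0';
	return (int)pos;
}
```

Called with sep set to '-' on eight bytes, this yields 23 characters of
the form xx-xx-xx-xx-xx-xx-xx-xx.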

[PATCH 1/3] nvme: do not check for ns on rw path

2017-11-03 Thread Javier González

On the rw path, the ns is assumed to be set. However, a check is still
done, inherited from the time the code resided at nvme_queue_rq().

Eliminate this check, which also eliminates a smatch complaint for not
doing proper NULL checks on ns.

Signed-off-by: Javier González 
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5a14cc7f28ee..bd1d5ff911c9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -472,7 +472,7 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 * unless this namespace is formated such that the metadata can be
 * stripped/generated by the controller with PRACT=1.
 */
-   if (ns && ns->ms &&
+   if (ns->ms &&
(!ns->pi_type || ns->ms != sizeof(struct t10_pi_tuple)) &&
!blk_integrity_rq(req) && !blk_rq_is_passthrough(req))
return BLK_STS_NOTSUPP;
-- 
2.7.4

[PATCH 0/3] nvme: small fixes reported by smatch

2017-11-03 Thread Javier González

Fix a number of small things reported by smatch on the nvme driver

Javier González (3):
  nvme: do not check for ns on rw path
  nvme: compare NQN string with right size
  nvme: fix eui_show() print format

 drivers/nvme/host/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.7.4

Re: [PATCH 10/17] coresight: etr: Track if the device is coherent

2017-11-03 Thread Suzuki K Poulose


On 02/11/17 19:40, Mathieu Poirier wrote:

On Thu, Oct 19, 2017 at 06:15:46PM +0100, Suzuki K Poulose wrote:

Track if the ETR is dma-coherent or not. This will be useful
in deciding if we should use software buffering for perf.

Cc: Mathieu Poirier 
Signed-off-by: Suzuki K Poulose 
---
  drivers/hwtracing/coresight/coresight-tmc.c | 5 -
  drivers/hwtracing/coresight/coresight-tmc.h | 1 +
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc.c 
b/drivers/hwtracing/coresight/coresight-tmc.c
index 4939333cc6c7..5a8c41130f96 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -347,6 +347,9 @@ static int tmc_etr_setup_caps(struct tmc_drvdata *drvdata,
if (!(devid & TMC_DEVID_NOSCAT))
tmc_etr_set_cap(drvdata, TMC_ETR_SG);
  
+	if (device_get_dma_attr(drvdata->dev) == DEV_DMA_COHERENT)

+   tmc_etr_set_cap(drvdata, TMC_ETR_COHERENT);
+
/* Check if the AXI address width is available */
if (devid & TMC_DEVID_AXIAW_VALID)
dma_mask = ((devid >> TMC_DEVID_AXIAW_SHIFT) &
@@ -397,7 +400,7 @@ static int tmc_probe(struct amba_device *adev, const struct 
amba_id *id)
if (!drvdata)
goto out;
  
-	drvdata->dev = >dev;

+   drvdata->dev = dev;


What is that one for?



Oops, that was a minor cleanup and need not be part of this patch. I will
leave things as they are. It is not worth a separate patch.

Cheers
Suzuki

Re: [PATCH v17 4/6] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-11-03 Thread Tetsuo Handa

Wei Wang wrote:
> @@ -164,6 +284,8 @@ static unsigned fill_balloon(struct virtio_balloon *vb, 
> size_t num)
>   break;
>   }
>  
> + if (use_sg && xb_set_page(vb, page, _min, _max) < 0)

Isn't this leaking "page" ?

> + break;
>   balloon_page_push(, page);
>   }
>  



> @@ -184,8 +307,12 @@ static unsigned fill_balloon(struct virtio_balloon *vb, 
> size_t num)
>  
>   num_allocated_pages = vb->num_pfns;
>   /* Did we get any? */
> - if (vb->num_pfns != 0)
> - tell_host(vb, vb->inflate_vq);
> + if (vb->num_pfns) {
> + if (use_sg)
> + tell_host_sgs(vb, vb->inflate_vq, pfn_min, pfn_max);

Please describe why tell_host_sgs() can work without __GFP_DIRECT_RECLAIM
allocation, for tell_host_sgs() is called with vb->balloon_lock mutex held.

> + else
> + tell_host(vb, vb->inflate_vq);
> + }
>   mutex_unlock(>balloon_lock);
>  
>   return num_allocated_pages;



> @@ -223,7 +353,13 @@ static unsigned leak_balloon(struct virtio_balloon *vb, 
> size_t num)
>   page = balloon_page_dequeue(vb_dev_info);
>   if (!page)
>   break;
> - set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> + if (use_sg) {
> + if (xb_set_page(vb, page, _min, _max) < 0)

Isn't this leaking "page" ?

If this is inside vb->balloon_lock mutex (isn't this?), xb_set_page() must not
use __GFP_DIRECT_RECLAIM allocation, for leak_balloon_sg_oom() will be blocked
on vb->balloon_lock mutex.

> + break;
> + } else {
> + set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> + }
> +
>   list_add(>lru, );
>   vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
>   }
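
The leak being asked about can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the virtio-balloon code: page_alloc(), page_free(), fill() and the fail_at parameter are hypothetical stand-ins, and the failing step only models an xb_set_page() call that returns an error before the page has been pushed onto any list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of allocated-but-not-yet-freed "pages" (userspace stand-in). */
static int live_pages;

static void *page_alloc(void)
{
	void *p = malloc(64);

	if (p)
		live_pages++;
	return p;
}

static void page_free(void *p)
{
	if (p) {
		free(p);
		live_pages--;
	}
}

/*
 * Allocate up to num pages and record them in list[]. The recording step
 * is assumed to fail at index fail_at (hypothetical; it models a failing
 * xb_set_page()). Returns the number of pages recorded in list[].
 */
static size_t fill(void **list, size_t num, size_t fail_at)
{
	size_t n = 0;

	for (size_t i = 0; i < num; i++) {
		void *page = page_alloc();

		if (!page)
			break;
		if (i == fail_at) {
			/* Without this free the page is neither tracked nor released. */
			page_free(page);
			break;
		}
		list[n++] = page;
	}
	return n;
}
```

With the page_free() call on the failure path, every page still alive after fill() returns is reachable through list[]. Dropping that call leaves one page that nothing points to, which is the situation the review comment describes.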

Re: [PATCH resend v5 3/3] platform/x86: intel_cht_int33fe: Update fusb302 type string, add properties

2017-11-03 Thread Andy Shevchenko

On Fri, Oct 27, 2017 at 6:24 PM, Hans de Goede  wrote:
> On 27-10-17 12:41, Wolfram Sang wrote:
>> On Fri, Oct 27, 2017 at 12:31:01PM +0200, Hans de Goede wrote:
>>> On 27-10-17 12:13, Hans de Goede wrote:
 On 26-10-17 22:33, Wolfram Sang wrote:
>
> On Wed, Oct 11, 2017 at 11:41:21AM +0200, Hans de Goede wrote:
>>
>> The fusb302 driver as merged in staging uses "typec_fusb302" as i2c-id
>> rather than just "fusb302" and needs us to set a number of
>> device-properties, adjust the intel_cht_int33fe driver accordingly.
>>
>> One of the properties set is max-snk-mv which makes the fusb302 driver
>> negotiate up to 12V charging voltage, which is a bad idea on boards
>> which are not set up to handle this, so this commit also adds 2 extra
>> sanity checks to make sure that the expected Whiskey Cove PMIC +
>> TI bq24292i charger combo, which can handle 12V, is present.
>>
>> Signed-off-by: Hans de Goede 
>> Acked-by: Andy Shevchenko 
>
>
> I can't apply this one. Is there an immutable branch I need to pick up?
> Or shall this go via another tree? My base is v4.14-rc5.


 It should be applied on top of this patch:


 http://git.infradead.org/users/dvhart/linux-platform-drivers-x86.git/commitdiff/5c003458db40cf3c89aeddd22c6e934c28b5a565

   From linux-platform-drivers-x86.git/for-next.

 So either we are going to need an immutable branch from you
 with the first patch of this series so that the platform/x86
 maintainers can merge this, or the other way around :|
>>>
>>>
>>> Alternatively we could push this patch as a post rc1 fix I guess.
>>
>>
>> I intentionally did not push out yet until this was cleared. So, I think
>> the most simple option is that I create an immutable branch with only
>> the i2c patches from you. Then linux-platform maintainers can pull in
>> my branch and your patch here on top of that. Would be my favourite.
>>
>> D'accord everyone?
>
>
> Works for me, ack.

Okay, merged, thanks!


-- 
With Best Regards,
Andy Shevchenko

RE: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t

2017-11-03 Thread Reshetova, Elena


> On Thu, Nov 02, 2017 at 11:04:53AM +, Reshetova, Elena wrote:
> 
> > Well refcount_dec_and_test() is not the only function that has different
> > memory ordering specifics. So, the full answer then for any arbitrary case
> > according to your points above would be something like:
> >
> > for each substituted function (atomic_* --> refcount_*) that actually
> >  has changes in memory ordering *** perform the following:
> >   - mention the difference
> >   - mention the actual code place where the change would affect (
> >various put and etc. functions)
> >   - potentially try to understand if it would make a difference for
> >   this code (again here is where I am not sure how to do it properly)
> >
> >
> > *** the actual list of these functions to me looks like:
> 
> >  1) atomic_inc_not_zero -> refcount_inc_not_zero.  Change is from
> >  underlying atomic_cmpxchg() to atomic_try_cmpxchg_relaxed () First
> >  one implies SMP-conditional general memory barrier (smp_mb()) on each
> >  side of the actual operation (at least according to documentation; in
> >  reality it goes through so many changes, ifdefs and conditionals that
> >  one gets lost easily in the process).
> 
> That matches _inc(), which doesn't imply anything.

So you are saying that atomic_inc_not_zero has the same guarantees (meaning
no guarantees) as atomic_inc?  If yes, then I am now confused here because
atomic_inc_not_zero is based on atomic_cmpxchg(), which according to this
https://elixir.free-electrons.com/linux/latest/source/Documentation/memory-barriers.txt#L2527
does imply the smp_mb()

> 
> The reasoning being that you must already have obtained a pointer to the
> object; and that will necessarily include enough ordering to observe the
> object. Therefore increasing the refcount doesn't require further
> constraints.
> 
> > Second one according to the comment implies no memory barriers
> > whatsoever, BUT "provides a control dependency which will order future
> > stores against the inc" So, every store (not load) operation (I guess
> > we are talking here only about store operations that touch the same
> > object, but I wonder how it is defined in terms of memory location?
> 
> Memory location is irrelevant.

I was just trying to understand which "stores" the control dependency
barrier applies to here. Do you mean it applies to absolutely all stores on
all objects? I guess we are talking about the same object here; I am just
trying to understand how the object is defined in terms of memory location.
> 
> > (overlapping?)  that comes inside "if refcount_inc_not_zero(){}" cause
> > would only be executed if functions returns true.
> 
> The point is that a CPU must NOT speculate on stores. So while it can
> speculate a branch, any store inside the branch must not become visible
> until it can commit to that branch.
> 
> The whole point being that possible modifications to the object to which
> we've _conditionally_ acquired a reference, will only happen after we're
> sure to indeed have acquired this reference.
> 
> Otherwise its similar to _inc().

Yes, now I understand this part. 
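
To make the inc_not_zero pattern discussed here concrete, this is a minimal userspace sketch using C11 atomics. It is not the kernel's refcount_inc_not_zero(): ref_inc_not_zero() is a hypothetical name, saturation on overflow is left out, and C11 itself does not formally promise the control-dependency ordering that the kernel memory model relies on. It only shows the shape of a relaxed compare-and-exchange loop whose result gates the caller's stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *refs unless it is zero, using only relaxed atomics.
 * Simplified: the kernel's refcount_t also saturates on overflow,
 * which this sketch does not model.
 */
static bool ref_inc_not_zero(atomic_uint *refs)
{
	unsigned int old = atomic_load_explicit(refs, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object is already on its way out */
		/* On failure the CAS reloads the current value into old. */
	} while (!atomic_compare_exchange_weak_explicit(refs, &old, old + 1,
							memory_order_relaxed,
							memory_order_relaxed));
	return true;
}
```

A caller would only touch the object inside "if (ref_inc_not_zero(&obj->refs)) { ... }". In the kernel, the stores inside that branch are the ones ordered by the control dependency described in the quoted reply; loads get no such ordering.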

> 
> > So, practically what we might "lose" here is any updates on the
> > object protected by this refcounter done by another CPU. But for this
> > purpose we expect the developer to take some other lock/memory barrier
> > into use, right?
> 
> Correct, object modification had better be serialized. Refcounts cannot
> (even with atomic_t) help with that.
> 
> > We only care of incrementing the refcount atomically and make sure we
> > don't do anything with object unless it is ready for us to be used.
> 
> Just so..
> 
> > If yes, then  I guess it might be a big change for the code that
> > previously relied on atomic-provided smp_mb() barriers and now instead
> > needs to take an explicit locks/barriers by itself.
> 
> Right, however such memory ordering should be explicitly documented.
> Unknown and hidden memory ordering is a straight up bug, because
> modifications to the code (be they introducing refcount_t or anything
> else) can easily break things.

Yes, this is what has been mentioned before many times, but again reality might
be different, so better be prepared here also. 
> 
> > 2) atomic_** -> refcount_add_not_zero. Fortunately these are super
> > rare and need to see per each case dependent on actual atomic function
> > substituted.
> 
> See inc_not_zero.
> 
> > 3) atomic_add() --> refcount_add(). This should not make any change
> > since both do not provide memory ordering at all, but there is a
> > comment in the refcount.c that says that refcount_add " provide a
> > control dependency and thereby orders future stores".  How is it done
> > given that refcount_add is void returning function??  I am fully
> > confused with this one.
> 
> Weird, mostly comes from being implemented using add_not_zero I suppose.

Yes, underlying is add_not_zero, but is it still correct to talk about any
control dependencies here? How would it possibly look in

Re: mptsas driver cannot detect hotplugging disk with the LSI SCSI SAS1068 controller in Ubuntu guest on VMware

2017-11-03 Thread Gavin Guo

On Fri, Nov 3, 2017 at 6:59 PM, Hannes Reinecke  wrote:
> On 11/03/2017 04:38 AM, Gavin Guo wrote:
>> On Sat, Oct 28, 2017 at 11:35 AM, Gavin Guo  wrote:
>>> On Fri, Oct 27, 2017 at 10:53 PM, Hannes Reinecke  wrote:
 On 10/27/2017 04:02 PM, Gavin Guo wrote:
> Hi Hannes,
>
> Thank you for looking into the issue. Is there anything I can do to help
> test the patch? I appreciate your help. Thank you.
>
 If you had checked linux-scsi you would have found this patch:
 '[PATCH] mptsas: Fixup device hotplug for VMWare ESXi', which I guess is
 already scheduled for inclusion in 4.14.
 Anything else I could help you with?

 Cheers,

 Hannes
 --
 Dr. Hannes Reinecke            Teamlead Storage & Networking
 h...@suse.de   +49 911 74053 688
 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
 GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
 HRB 21284 (AG Nürnberg)
>>>
>>> Really appreciate your help. I will proceed with the testing and keep you posted.
>>
>> Hello Hannes,
>>
>> I've tested the MPT SAS device controller, the patch works perfectly.
>> However, the MPT SPI still cannot work with the hotplugging. The lspci
>> of the LSI SPI device is listed:
>>
>> 00:10.0 SCSI storage controller [0100]: LSI Logic / Symbios Logic
>> 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI [1000:0030] (rev 01)
>> Subsystem: VMware LSI Logic Parallel SCSI Controller [15ad:1976]
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR- FastB2B- DisINTx-
>> Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>>> TAbort- SERR- > Latency: 64 (1500ns min, 63750ns max)
>> Interrupt: pin A routed to IRQ 17
>> Region 0: I/O ports at 1400 [size=256]
>> Region 1: Memory at feba (64-bit, non-prefetchable) [size=128K]
>> Region 3: Memory at febc (64-bit, non-prefetchable) [size=128K]
>> [virtual] Expansion ROM at 40008000 [disabled] [size=16K]
>> Kernel driver in use: mptspi
>>
>> The vendor id/device id[1000:0030] can also be found in the
>> mptspi_pci_table(drivers/message/fusion/mptspi.c).
>>
>> PCI_VENDOR_ID_LSI_LOGIC, MPI_MANUFACTPAGE_DEVID_53C1030
>>
>> Is there anything I missed in the testing? I appreciate your help.
>>
> Don't. Ever. Try.
>
> SCSI Parallel hotplugging is so loosely defined that essentially every
> vendor can (and did!) implement their own thing, and _still_ be spec
> compliant.
> Plus there is no guarantee that we even see events when a hot-add
> occurs. So any thingie we put in here has a really high chance of
> breaking other installations.
> I wouldn't even bother to attempt this; it's virtualized anyway so you
> can as well switch to the SAS emulation here.
> Or use a different SCSI parallel emulation like the Symbios/LSI 53c800 one.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke            Teamlead Storage & Networking
> h...@suse.de   +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
> HRB 21284 (AG Nürnberg)

Hello Hannes,

I totally understand your concern and appreciate your help again on
the hotplugging patch for the MPT SAS controller. :)

Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt

On Fri, 3 Nov 2017 07:21:21 -0400
Steven Rostedt  wrote:

> On Thu, 2 Nov 2017 21:09:32 -0700
> John Hubbard  wrote:
>

 > 
> > For example, if there are 3 or more threads, you can do the following:
> > 
> > thread A: holds the console lock, is printing, then moves into the
> >   console_unlock phase
> > 
> > thread B: goes into the waiter spin loop above, and (once the polarity is
> >   corrected) waits for console_waiter to become 0
> > 
> > thread A: finishing up, sets console_waiter --> 0
> > 
> > thread C: before thread B notices, thread C goes into the "else" section,
> >   sees that console_waiter == 0, and sets console_waiter --> 1.
> >   So thread C now becomes the waiter
> 
> But console_waiter only gets set to 1 if console_waiter is 0 *and*
> console_owner is not NULL and is not current. console_owner is only
> updated under a spin lock and console_waiter is only set under a spin
> lock when console_owner is not NULL.
> 
> This means this scenario can not happen.
> 
> 
> > 
> > thread B: gets *very* unlucky and never sees the 1 --> 0 --> 1 transition of
> >   console_waiter, so it continues waiting.  And now we have both B
> >   and C in the same spin loop, and this is now broken.
> > 
> > At the root, this is really due to the absence of a pre-existing
> > "hand-off this lock" mechanism. And this one here is not quite correct.
> > 
> > Solution ideas: for a true hand-off, there needs to be a bit more
> > information exchanged. Conceptually, a (lock-protected) list of waiters
> > (which would only ever have zero or one entries) is a good way to start
> > thinking about it.
> 
> As stated above, the console owner check will prevent this issue.
> 

I'll condense the patch to show what I mean:

To become a waiter, a task must do the following:

+   printk_safe_enter_irqsave(flags);
+
+   raw_spin_lock(_owner_lock);
+   owner = READ_ONCE(console_owner);
+   waiter = READ_ONCE(console_waiter);
+   if (!waiter && owner && owner != current) {
+   WRITE_ONCE(console_waiter, true);
+   spin = true;
+   }
+   raw_spin_unlock(_owner_lock);


The new waiter gets set only if there isn't already a waiter *and*
there is an owner that is not current (and with the printk_safe_enter I
don't think that is even needed).

+   while (!READ_ONCE(console_waiter))
+   cpu_relax();

The spin is outside the spin lock. But only the owner can clear it.

Now the owner is doing a loop of this (with interrupts disabled)

+   raw_spin_lock(_owner_lock);
+   console_owner = current;
+   raw_spin_unlock(_owner_lock);

Write to consoles.

+   raw_spin_lock(_owner_lock);
+   waiter = READ_ONCE(console_waiter);
+   console_owner = NULL;
+   raw_spin_unlock(_owner_lock);

+   if (waiter)
+   break;

At this moment console_owner is NULL, and no new waiters can happen.
The next owner will be the waiter that is spinning.

+   if (waiter) {
+   WRITE_ONCE(console_waiter, false);

There is no possibility of another task sneaking in and becoming a
waiter at this moment. The console_owner was cleared under spin lock,
and a waiter is only set under the same spin lock if owner is set.
There will be no new owner sneaking in because to become the owner, you
must have the console lock. Since it is never released between the time
the owner clears console_waiter and the waiter takes the console lock,
there is no race.

-- Steve
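
The argument in this mail can be checked with a small single-threaded model of the state transitions. This is only a sketch under stated assumptions: struct console_state and the four helpers are hypothetical names, the spinlock and the console lock are not modelled at all, and each helper stands for one critical section from the condensed patch quoted in the mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Models console_owner and console_waiter; the lock itself is elided. */
struct console_state {
	const void *owner;	/* NULL when nobody is printing */
	bool waiter;
};

/* The check a would-be waiter makes; true means "go and spin". */
static bool try_become_waiter(struct console_state *s, const void *me)
{
	if (!s->waiter && s->owner && s->owner != me) {
		s->waiter = true;
		return true;
	}
	return false;
}

/* The owner announces itself before writing to the consoles. */
static void owner_begin(struct console_state *s, const void *me)
{
	s->owner = me;
}

/* The owner clears itself and reports whether a waiter was seen. */
static bool owner_end(struct console_state *s)
{
	bool waiter = s->waiter;

	s->owner = NULL;
	return waiter;
}

/* The owner releases the spinning waiter. */
static void owner_handoff(struct console_state *s)
{
	s->waiter = false;
}
```

In this model a third task cannot become the waiter between owner_end() and the moment the released waiter calls owner_begin(), because try_become_waiter() requires a non-NULL owner. That matches the claim that thread C cannot sneak in during the hand-off.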

Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt

On Fri, 3 Nov 2017 07:54:04 -0400
Steven Rostedt  wrote:

> The new waiter gets set only if there isn't already a waiter *and*
> there is an owner that is not current (and with the printk_safe_enter I
> don't think that is even needed).
> 
> + while (!READ_ONCE(console_waiter))
> + cpu_relax();

I still need to fix the patch. I cut and pasted the bad version. This
should have been:

while (READ_ONCE(console_waiter))
cpu_relax();

-- Steve

Re: [PATCH] intel_ips: Convert timers to use timer_setup()

2017-11-03 Thread Andy Shevchenko

On Thu, Nov 2, 2017 at 9:55 PM, Kees Cook  wrote:
> On Thu, Oct 5, 2017 at 1:41 AM, Andy Shevchenko
>  wrote:
>> On Thu, Oct 5, 2017 at 3:54 AM, Kees Cook  wrote:
>>> In preparation for unconditionally passing the struct timer_list pointer to
>>> all timer callbacks, switch to using the new timer_setup() and from_timer()
>>> to pass the timer pointer explicitly. Moves timer structure off stack and
>>> into struct ips_driver.
>>
>> Pushed to my testing queue, thanks!
>
> Hi,
>
> I don't see this in -next yet. Should the tip tree carry this conversion?

I thought it was a result of the discussion, since we lack the patch that
brought in the core change.

However, because I merged Wolfram's immutable branch in order to apply
Hans' fix, I can carry your patch as well.

Either would be fine with me.

Going ahead, I applied it to my review and testing queue, thanks!

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH V4 11/12] boot_constraint: Add support for IMX platform

2017-11-03 Thread Sascha Hauer

On Sun, Oct 29, 2017 at 07:18:59PM +0530, Viresh Kumar wrote:
> This adds boot constraint support for IMX platforms. Currently only one
> use case is supported: earlycon. Some of the UARTs are enabled by the
> bootloader and are used for early console in the kernel. The boot
> constraint core handles them properly and removes them once the serial
> device is probed by its driver.
> 
> This gets rid of lots of hacky code in the clock drivers.
> 
> Signed-off-by: Viresh Kumar 
> ---
>  arch/arm/mach-imx/Kconfig |   1 +
>  drivers/boot_constraints/Makefile |   1 +
>  drivers/boot_constraints/imx.c| 113 ++
>  drivers/clk/imx/clk-imx25.c   |  12 
>  drivers/clk/imx/clk-imx27.c   |  13 -
>  drivers/clk/imx/clk-imx31.c   |  12 
>  drivers/clk/imx/clk-imx35.c   |  10 
>  drivers/clk/imx/clk-imx51-imx53.c |  16 --
>  drivers/clk/imx/clk-imx6q.c   |   8 ---
>  drivers/clk/imx/clk-imx6sl.c  |   8 ---
>  drivers/clk/imx/clk-imx6sx.c  |   8 ---
>  drivers/clk/imx/clk-imx7d.c   |  14 -
>  drivers/clk/imx/clk.c |  38 -
>  drivers/clk/imx/clk.h |   1 -
>  14 files changed, 115 insertions(+), 140 deletions(-)
>  create mode 100644 drivers/boot_constraints/imx.c
> 
> diff --git a/arch/arm/mach-imx/Kconfig b/arch/arm/mach-imx/Kconfig
> index 782699e67600..9ea1fe32b280 100644
> --- a/arch/arm/mach-imx/Kconfig
> +++ b/arch/arm/mach-imx/Kconfig
> @@ -4,6 +4,7 @@ menuconfig ARCH_MXC
>   select ARCH_SUPPORTS_BIG_ENDIAN
>   select CLKSRC_IMX_GPT
>   select GENERIC_IRQ_CHIP
> + select DEV_BOOT_CONSTRAINTS
>   select GPIOLIB
>   select PINCTRL
>   select PM_OPP if PM
> diff --git a/drivers/boot_constraints/Makefile 
> b/drivers/boot_constraints/Makefile
> index 43c89d2458e9..3b5a87fcf099 100644
> --- a/drivers/boot_constraints/Makefile
> +++ b/drivers/boot_constraints/Makefile
> @@ -3,3 +3,4 @@
>  obj-y := clk.o deferrable_dev.o core.o pm.o serial.o supply.o
>  
>  obj-$(CONFIG_ARCH_HISI)  += hikey.o
> +obj-$(CONFIG_ARCH_MXC)   += imx.o
> +
> +static const struct of_device_id machines[] __initconst = {
> + { .compatible = "fsl,imx25", .data = _constraints },
> + { .compatible = "fsl,imx27", .data = _constraints },
> + { .compatible = "fsl,imx31", .data = _constraints },
> + { .compatible = "fsl,imx35", .data = _constraints },
> + { .compatible = "fsl,imx50", .data = _constraints },
> + { .compatible = "fsl,imx51", .data = _constraints },
> + { .compatible = "fsl,imx53", .data = _constraints },
> + { .compatible = "fsl,imx6dl", .data = _constraints },
> + { .compatible = "fsl,imx6q", .data = _constraints },
> + { .compatible = "fsl,imx6qp", .data = _constraints },
> + { .compatible = "fsl,imx6sl", .data = _constraints },
> + { .compatible = "fsl,imx6sx", .data = _constraints },
> + { .compatible = "fsl,imx6ul", .data = _constraints },
> + { .compatible = "fsl,imx6ull", .data = _constraints },
> + { .compatible = "fsl,imx7d", .data = _constraints },
> + { .compatible = "fsl,imx7s", .data = _constraints },
> + { }
> +};
> +
> +static int __init imx_constraints_init(void)
> +{
> + const struct imx_machine_constraints *constraints;
> + const struct of_device_id *match;
> + struct device_node *np;
> +
> + if (!boot_constraint_earlycon_enabled())
> + return 0;
> +
> + np = of_find_node_by_path("/");
> + if (!np)
> + return -ENODEV;
> +
> + match = of_match_node(machines, np);
> + of_node_put(np);
> +
> + if (!match)
> + return 0;
> +
> + constraints = match->data;
> + BUG_ON(!constraints);
> +
> + dev_boot_constraint_add_deferrable_of(constraints->dev_constraints,
> +   constraints->count);
> +
> + return 0;
> +}
> +subsys_initcall(imx_constraints_init);

As said to Viresh privately: I would much prefer this code being
hooked up to some SoC specific code which already knows the SoC type rather
than looping around the compatible array for all enabled machines (which
may even not only be i.MX for multi SoC kernels).
Viresh's response was that adding more SoC specific code may not be
wanted after we've finally got rid of most of it. So basically we need
a third opinion ;)

Other than that I tested an earlier version of this patch on i.MX6 and
it worked as expected.

Sascha


-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |

[PATCH] xen/pvcalls: remove redundant check for irq >= 0

2017-11-03 Thread Colin King

From: Colin Ian King 

This is a moot point, but irq is always less than zero at the label
out_error, so the check for irq >= 0 is redundant and can be removed.

Detected by CoverityScan, CID#1460371 ("Logically dead code")

Fixes: cb1c7d9bbc87 ("xen/pvcalls: implement connect command")
Signed-off-by: Colin Ian King 
---
 drivers/xen/pvcalls-front.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index de8a470351a5..b08569998046 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -351,9 +351,7 @@ static int create_active(struct sock_mapping *map, int 
*evtchn)
return 0;
 
 out_error:
-   if (irq >= 0)
-   unbind_from_irqhandler(irq, map);
-   else if (*evtchn >= 0)
+   if (*evtchn >= 0)
xenbus_free_evtchn(pvcalls_front_dev, *evtchn);
kfree(map->active.data.in);
kfree(map->active.ring);
-- 
2.14.1

Re: [PATCH] s390/mm: fix pud table accounting

2017-11-03 Thread Kirill A. Shutemov

On Fri, Nov 03, 2017 at 09:05:51AM +, Heiko Carstens wrote:
> With "mm: account pud page tables" and "mm: consolidate page table
> accounting" pud page table accounting was introduced which now results
> in tons of warnings like this one on s390:
> 
> BUG: non-zero pgtables_bytes on freeing mm: -16384
> 
> Reason for this are our run-time folded page tables: by default new
> processes start with three page table levels where the allocated pgd
> is the same as the first pud. In this case there won't ever be a pud
> allocated and therefore mm_inc_nr_puds() will also never be called.
> 
> However when freeing the address space free_pud_range() will call
> exactly once mm_dec_nr_puds() which leads to misaccounting.
> 
> Therefore call mm_inc_nr_puds() within init_new_context() to fix
> this. This is the same as what we already do for processes that run
> with two page table levels (aka compat processes).
> 
> While at it also adjust the comment, since there is no "mm->nr_pmds"
> anymore.

Thanks for tracking it down.

Acked-by: Kirill A. Shutemov 

-- 
 Kirill A. Shutemov
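
The misaccounting described in the quoted commit message reduces to simple counter arithmetic. The sketch below is a userspace model, not s390 code: struct mm_model, the model_* helpers and lifecycle() are hypothetical names, and the 16384-byte table size is inferred from the "-16384" in the quoted warning together with the statement that the decrement happens exactly once.

```c
#include <assert.h>

/* Size of one pud table in bytes, inferred from the quoted warning. */
#define PUD_TABLE_BYTES 16384L

struct mm_model {
	long pgtables_bytes;
};

static void model_inc_nr_puds(struct mm_model *mm)
{
	mm->pgtables_bytes += PUD_TABLE_BYTES;
}

static void model_dec_nr_puds(struct mm_model *mm)
{
	mm->pgtables_bytes -= PUD_TABLE_BYTES;
}

/*
 * Run one address-space lifetime. account_at_init models whether
 * init_new_context() accounts for the folded pud up front.
 * Returns the counter value seen when the mm is freed.
 */
static long lifecycle(int account_at_init)
{
	struct mm_model mm = { 0 };

	if (account_at_init)
		model_inc_nr_puds(&mm);
	/* No pud is ever allocated: the pgd doubles as the first pud. */
	model_dec_nr_puds(&mm);	/* teardown decrements exactly once */
	return mm.pgtables_bytes;
}
```

lifecycle(0) ends at -16384, the value in the warning, and lifecycle(1) ends at 0, which is the effect of the fix as the commit message describes it.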

Re: [PATCH] Support resetting WARN*_ONCE

2017-11-03 Thread Michael Ellerman

Andi Kleen  writes:

> diff --git a/kernel/panic.c b/kernel/panic.c
> index bdd18afa19a4..b2d872fa16de 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -587,6 +588,32 @@ void warn_slowpath_null(const char *file, int line)
>  EXPORT_SYMBOL(warn_slowpath_null);
>  #endif
>  
> +#ifdef CONFIG_BUG
> +
> +/* Support resetting WARN*_ONCE state */
> +
> +static int clear_warn_once_set(void *data, u64 val)
> +{
> + memset(__start_once, 0, __end_once - __start_once);
> + return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(clear_warn_once_fops,
> + NULL,
> +clear_warn_once_set,
> + "%lld\n");
> +
> +static __init int register_warn_debugfs(void)
> +{
> + /* Don't care about failure */
> + debugfs_create_file("clear_warn_once", 0644, NULL,
   ^

Wouldn't 0200 be more appropriate if it's only writable?

Otherwise it appears readable, but cat'ing it gives you an error.

cheers
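
For reference, the difference between the two modes can be spelled out with the standard permission macros. This is a small userspace sketch; mode_allows_read() and mode_allows_write() are hypothetical helpers, not kernel or debugfs API.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* True if any of the user/group/other read bits is set. */
static bool mode_allows_read(mode_t mode)
{
	return (mode & (S_IRUSR | S_IRGRP | S_IROTH)) != 0;
}

/* True if any of the user/group/other write bits is set. */
static bool mode_allows_write(mode_t mode)
{
	return (mode & (S_IWUSR | S_IWGRP | S_IWOTH)) != 0;
}
```

0644 advertises read access for everyone even though the attribute has no read handler, while 0200 advertises only owner write access, which matches what the file can actually do.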

Re: [PATCH 07/17] coresight: tmc etr: Add transparent buffer management

2017-11-03 Thread Suzuki K Poulose


On 02/11/17 17:48, Mathieu Poirier wrote:

On Thu, Oct 19, 2017 at 06:15:43PM +0100, Suzuki K Poulose wrote:

At the moment we always use contiguous memory for TMC ETR tracing
when used from sysfs. The size of the buffer is fixed at boot time
and can only be changed by modifying the DT. With the introduction
of SG support we could support really large buffers in that mode.
This patch abstracts the buffer used for ETR to switch between a
contiguous buffer or a SG table depending on the availability of
the memory.

This also enables the sysfs mode to use the ETR in SG mode depending
on the configured trace buffer size. Also, since ETR will use the
new infrastructure to manage the buffer, we can get rid of some
of the members in the tmc_drvdata and clean up the fields a bit.

Cc: Mathieu Poirier 
Signed-off-by: Suzuki K Poulose 
---
  drivers/hwtracing/coresight/coresight-tmc-etr.c | 433 +++-
  drivers/hwtracing/coresight/coresight-tmc.h |  60 +++-
  2 files changed, 403 insertions(+), 90 deletions(-)



[..]
  

+
+static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)



+   w_offset = tmc_sg_get_data_page_offset(table, rwp);
+   if (w_offset < 0) {
+   dev_warn(table->dev, "Unable to map RWP %llx to offset\n",
+   rwp);


 dev_warn(table->dev,
          "Unable to map RWP %llx to offset\n", rwp);

It looks a little better and we respect indentation rules.  Same for r_offset.




+static inline int tmc_etr_mode_alloc_buf(int mode,
+ struct tmc_drvdata *drvdata,
+ struct etr_buf *etr_buf, int node,
+ void **pages)


static inline int
tmc_etr_mode_alloc_buf(int mode,
struct tmc_drvdata *drvdata,
struct etr_buf *etr_buf, int node,
void **pages)



+ * tmc_alloc_etr_buf: Allocate a buffer use by ETR.
+ * @drvdata: ETR device details.
+ * @size   : size of the requested buffer.
+ * @flags  : Required properties of the type of buffer.
+ * @node   : Node for memory allocations.
+ * @pages  : An optional list of pages.
+ */
+static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
+ ssize_t size, int flags,
+ int node, void **pages)


Please fix indentation.  Also @flags isn't used.



Yep, flags is only used later and I can move it to the patch where we use it.


+{
+   int rc = -ENOMEM;
+   bool has_etr_sg, has_iommu;
+   struct etr_buf *etr_buf;
+
+   has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
+   has_iommu = iommu_get_domain_for_dev(drvdata->dev);
+
+   etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
+   if (!etr_buf)
+   return ERR_PTR(-ENOMEM);
+
+   etr_buf->size = size;
+
+   /*
+* If we have to use an existing list of pages, we cannot reliably
+* use a contiguous DMA memory (even if we have an IOMMU). Otherwise,
+* we use the contiguous DMA memory if :
+*  a) The ETR cannot use Scatter-Gather.
+*  b) if not a, we have an IOMMU backup


Please rework the above sentence.


How about :
   b) if (a) is not true and we have an IOMMU connected to the ETR.

I will address the other comments on indentation.

Thanks for the detailed look

Cheers
Suzuki
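
The buffer-mode rule in the reworded comment can be written as a small predicate. This is only a sketch of conditions (a) and (b) as they appear in the quoted excerpt: use_contiguous_dma() and its parameters are hypothetical, and the rest of the comment is not quoted in this mail, so the actual patch may apply further conditions.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether to use contiguous DMA memory for the ETR buffer,
 * following only the part of the comment visible in the excerpt.
 */
static bool use_contiguous_dma(bool have_pages, bool has_etr_sg,
			       bool has_iommu)
{
	/* An existing list of pages rules out contiguous DMA memory. */
	if (have_pages)
		return false;
	/* (a) The ETR cannot use Scatter-Gather. */
	if (!has_etr_sg)
		return true;
	/* (b) (a) is not true and an IOMMU is connected to the ETR. */
	return has_iommu;
}
```

Written this way, (b) is only reached when (a) is false, which is what the proposed rewording of the comment makes explicit.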

Re: [PATCH] xen/pvcalls: remove redundant check for irq >= 0

2017-11-03 Thread Juergen Gross

On 03/11/17 10:20, Colin King wrote:
> From: Colin Ian King 
> 
> This is a moot point, but irq is always less than zero at the label
> out_error, so the check for irq >= 0 is redundant and can be removed.
> 
> Detected by CoverityScan, CID#1460371 ("Logically dead code")
> 
> Fixes: cb1c7d9bbc87 ("xen/pvcalls: implement connect command")
> Signed-off-by: Colin Ian King 

Reviewed-by: Juergen Gross 


Juergen

Re: [PATCH] scsi: hisi_sas: select CONFIG_RAS

2017-11-03 Thread John Garry


+ Shiju, who authored the original patch

Hi Arnd,

Thanks for this.

On 02/11/2017 16:50, Arnd Bergmann wrote:

The driver now uses the RAS infrastructure, and fails to link if that
is disabled:

drivers/scsi/hisi_sas/hisi_sas_v2_hw.o: In function `fatal_ecc_int_v2_hw':
hisi_sas_v2_hw.c:(.text+0xb08): undefined reference to `__tracepoint_non_standard_event'
drivers/scsi/hisi_sas/hisi_sas_v2_hw.o: In function `fatal_axi_int_v2_hw':
hisi_sas_v2_hw.c:(.text+0x1b34): undefined reference to `__tracepoint_non_standard_event'

This adds an explicit Kconfig 'select RAS' statement. I don't know if
the driver uses the interface correctly, as no other driver seems to do
it like this, but the change fixes the link error.

Fixes: dfeb5021f001 ("scsi: hisi_sas: report ECC and AXI errors in v2 hw to userspace")
Signed-off-by: Arnd Bergmann 
---
 drivers/scsi/hisi_sas/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/hisi_sas/Kconfig b/drivers/scsi/hisi_sas/Kconfig
index d42f29a5eb65..6ad8a6251d21 100644
--- a/drivers/scsi/hisi_sas/Kconfig
+++ b/drivers/scsi/hisi_sas/Kconfig
@@ -4,6 +4,7 @@ config SCSI_HISI_SAS
depends on ARM64 || COMPILE_TEST
select SCSI_SAS_LIBSAS
select BLK_DEV_INTEGRITY
+   select RAS


My impression is that we don't want this. Correction: shouldn't want this.

Do you have the .config for the broken build? I couldn't recreate this 
by turning off CONFIG_RAS.


John


depends on ATA
help
This driver supports HiSilicon's SAS HBA, including support 
based

Re: [PATCH 1/2] staging: greybus: remove unused kfifo_ts

2017-11-03 Thread Bryan O'Donoghue


On 02/11/17 14:32, Arnd Bergmann wrote:

As of commit 8e1d6c336d74 ("greybus: loopback: drop bus aggregate
calculation"), nothing ever reads from kfifo_ts, so there is no
reason to write to it or even allocate it any more.

Signed-off-by: Arnd Bergmann 
---
  drivers/staging/greybus/loopback.c | 27 +++
  1 file changed, 3 insertions(+), 24 deletions(-)

diff --git a/drivers/staging/greybus/loopback.c 
b/drivers/staging/greybus/loopback.c
index 08e255884206..85046fb16aad 100644
--- a/drivers/staging/greybus/loopback.c
+++ b/drivers/staging/greybus/loopback.c
@@ -72,7 +72,6 @@ struct gb_loopback {
  
  	struct dentry *file;

struct kfifo kfifo_lat;
-   struct kfifo kfifo_ts;
struct mutex mutex;
struct task_struct *task;
struct list_head entry;
@@ -262,7 +261,6 @@ static void gb_loopback_check_attr(struct gb_loopback *gb)
 gb->iteration_max, kfifo_depth);
}
kfifo_reset_out(>kfifo_lat);
-   kfifo_reset_out(>kfifo_ts);
  
  	switch (gb->type) {

case GB_LOOPBACK_TYPE_PING:
@@ -387,13 +385,6 @@ static u64 gb_loopback_calc_latency(struct timeval *ts, struct timeval *te)
return __gb_loopback_calc_latency(t1, t2);
  }
  
-static void gb_loopback_push_latency_ts(struct gb_loopback *gb,
-   struct timeval *ts, struct timeval *te)
-{
-   kfifo_in(&gb->kfifo_ts, (unsigned char *)ts, sizeof(*ts));
-   kfifo_in(&gb->kfifo_ts, (unsigned char *)te, sizeof(*te));
-}
-
  static int gb_loopback_operation_sync(struct gb_loopback *gb, int type,
  void *request, int request_size,
  void *response, int response_size)
@@ -433,7 +424,6 @@ static int gb_loopback_operation_sync(struct gb_loopback *gb, int type,
do_gettimeofday(&te);
  
  	/* Calculate the total time the message took */
-   gb_loopback_push_latency_ts(gb, &ts, &te);
gb->elapsed_nsecs = gb_loopback_calc_latency(&ts, &te);
  
  out_put_operation:

@@ -521,11 +511,9 @@ static void gb_loopback_async_operation_callback(struct gb_operation *operation)
err = true;
}
  
-	if (!err) {

-   gb_loopback_push_latency_ts(gb, &op_async->ts, &te);
+   if (!err)
gb->elapsed_nsecs = gb_loopback_calc_latency(&op_async->ts,
 &te);
-   }
  
  	if (op_async->pending) {

if (err)
@@ -1241,18 +1229,12 @@ static int gb_loopback_probe(struct gb_bundle *bundle,
retval = -ENOMEM;
goto out_conn;
}
-   if (kfifo_alloc(&gb->kfifo_ts, kfifo_depth * sizeof(struct timeval) * 2,
- GFP_KERNEL)) {
-   retval = -ENOMEM;
-   goto out_kfifo0;
-   }
-
/* Fork worker thread */
mutex_init(&gb->mutex);
gb->task = kthread_run(gb_loopback_fn, gb, "gb_loopback");
if (IS_ERR(gb->task)) {
retval = PTR_ERR(gb->task);
-   goto out_kfifo1;
+   goto out_kfifo;
}
  
  	spin_lock_irqsave(&gb_dev.lock, flags);

@@ -1266,9 +1248,7 @@ static int gb_loopback_probe(struct gb_bundle *bundle,
  
  	return 0;
  
-out_kfifo1:
-   kfifo_free(&gb->kfifo_ts);
-out_kfifo0:
+out_kfifo:
kfifo_free(&gb->kfifo_lat);
  out_conn:
device_unregister(dev);
@@ -1302,7 +1282,6 @@ static void gb_loopback_disconnect(struct gb_bundle *bundle)
kthread_stop(gb->task);
  
  	kfifo_free(&gb->kfifo_lat);
-   kfifo_free(&gb->kfifo_ts);
gb_connection_latency_tag_disable(gb->connection);
debugfs_remove(gb->file);
  



This looks right to me

Reviewed-by: Bryan O'Donoghue

Re: [RFT][PATCH 2/2] PM / QoS: Fix device resume latency framework

2017-11-03 Thread Rafael J. Wysocki

On Fri, Nov 3, 2017 at 8:58 AM, Rafael J. Wysocki wrote:
> On Fri, Nov 3, 2017 at 8:43 AM, Ramesh Thomas wrote:
>> On 2017-11-02 at 00:03:54 +0100, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki 
>>>
>>> The special value of 0 for device resume latency PM QoS means
>>> "no restriction", but there are two problems with that.
>>>
>>> First, device resume latency PM QoS requests with 0 as the
>>> value are always put in front of requests with positive
>>> values in the priority lists used internally by the PM QoS
>>> framework, causing 0 to be chosen as an effective constraint
>>> value.  However, that 0 is then interpreted as "no restriction"
>>> effectively overriding the other requests with specific
>>> restrictions which is incorrect.
>>>
>>> Second, the users of device resume latency PM QoS have no
>>> way to specify that *any* resume latency at all should be
>>> avoided, which is an artificial limitation in general.
>>>
>>> To address these issues, modify device resume latency PM QoS to
>>> use S32_MAX as the "no constraint" value and 0 as the "no
>>> latency at all" one and rework its users (the cpuidle menu
>>> governor, the genpd QoS governor and the runtime PM framework)
>>> to follow these changes.
>>>
>>> Also add a special "n/a" value to the corresponding user space I/F
>>> to allow user space to indicate that it cannot accept any resume
>>> latencies at all for the given device.
>>>
>>> Fixes: 85dc0b8a4019 (PM / QoS: Make it possible to expose PM QoS latency constraints)
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323
>>> Reported-by: Reinette Chatre 
>>> Signed-off-by: Rafael J. Wysocki 
>>> ---
>>>  Documentation/ABI/testing/sysfs-devices-power |    4 ++-
>>>  drivers/base/cpu.c                            |    3 +-
>>>  drivers/base/power/domain.c                   |    2 -
>>>  drivers/base/power/domain_governor.c          |   27 --
>>>  drivers/base/power/qos.c                      |    5 +++-
>>>  drivers/base/power/runtime.c                  |    2 -
>>>  drivers/base/power/sysfs.c                    |   25
>>>  drivers/cpuidle/governors/menu.c              |    4 +--
>>>  include/linux/pm_qos.h                        |   24 +++
>>>  9 files changed, 63 insertions(+), 33 deletions(-)
>>>
>>> Index: linux-pm/drivers/base/power/sysfs.c
>>> ===
>>> --- linux-pm.orig/drivers/base/power/sysfs.c
>>> +++ linux-pm/drivers/base/power/sysfs.c
>>> @@ -218,7 +218,14 @@ static ssize_t pm_qos_resume_latency_sho
>>> struct device_attribute *attr,
>>> char *buf)
>>>  {
>>> - return sprintf(buf, "%d\n", dev_pm_qos_requested_resume_latency(dev));
>>> + s32 value = dev_pm_qos_requested_resume_latency(dev);
>>> +
>>> + if (value == 0)
>>> + return sprintf(buf, "n/a\n");
>>> + else if (value == PM_QOS_RESUME_LATENCY_NO_CONSTRAINT)
>>> + value = 0;
>>> +
>>> + return sprintf(buf, "%d\n", value);
>>>  }
>>>
>>>  static ssize_t pm_qos_resume_latency_store(struct device *dev,
>>> @@ -228,11 +235,21 @@ static ssize_t pm_qos_resume_latency_sto
>>>   s32 value;
>>>   int ret;
>>>
>>> - if (kstrtos32(buf, 0, &value))
>>> - return -EINVAL;
>>> + if (!kstrtos32(buf, 0, &value)) {
>>> + /*
>>> +  * Prevent users from writing negative or "no constraint" values
>>> +  * directly.
>>> +  */
>>> + if (value < 0 || value == PM_QOS_RESUME_LATENCY_NO_CONSTRAINT)
>>> + return -EINVAL;
>>>
>>> - if (value < 0)
>>> + if (value == 0)
>>> + value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT;
>>> + } else if (!strcmp(buf, "n/a") || !strcmp(buf, "n/a\n")) {
>>> + value = 0;
>>> + } else {
>>>   return -EINVAL;
>>> + }
>>>
>>>   ret = dev_pm_qos_update_request(dev->power.qos->resume_latency_req,
>>>   value);
>>> Index: linux-pm/include/linux/pm_qos.h
>>> ===
>>> --- linux-pm.orig/include/linux/pm_qos.h
>>> +++ linux-pm/include/linux/pm_qos.h
>>> @@ -27,16 +27,17 @@ enum pm_qos_flags_status {
>>>   PM_QOS_FLAGS_ALL,
>>>  };
>>>
>>> -#define PM_QOS_DEFAULT_VALUE -1
>>> +#define PM_QOS_DEFAULT_VALUE (-1)
>>> +#define PM_QOS_LATENCY_ANY   S32_MAX
>>>
>>>  #define PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC)
>>>  #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC)
>>>  #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE  0
>>>  #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE0
>>> -#define
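
The commit message quoted above changes what two special values mean: internally 0 becomes "no latency at all" and S32_MAX becomes "no constraint", while the sysfs file keeps showing "0" for "no constraint" and gains "n/a" for "no latency at all". The following is a minimal user-space sketch of that mapping, not the kernel code. NO_CONSTRAINT stands in for PM_QOS_RESUME_LATENCY_NO_CONSTRAINT, and effective_latency(), show_latency() and parse_latency() are hypothetical helper names invented for illustration. It assumes the effective constraint is the minimum over all requests, which is how the quoted text describes the priority lists behaving.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed stand-in for PM_QOS_RESUME_LATENCY_NO_CONSTRAINT (S32_MAX). */
#define NO_CONSTRAINT INT32_MAX

/*
 * Hypothetical aggregation helper: the effective constraint is the
 * smallest request.  With NO_CONSTRAINT == INT32_MAX an "unconstrained"
 * request can never override a real one, which was the first problem
 * with using 0 for that purpose.
 */
static int32_t effective_latency(const int32_t *req, size_t n)
{
	int32_t min = NO_CONSTRAINT;
	size_t i;

	for (i = 0; i < n; i++)
		if (req[i] < min)
			min = req[i];
	return min;
}

/* Internal value -> text, following the show() hunk quoted above. */
static void show_latency(int32_t value, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (value == 0)
		snprintf(buf, len, "n/a");
	else
		snprintf(buf, len, "%d", value == NO_CONSTRAINT ? 0 : value);
}

/* Text -> internal value, following the store() hunk; 0 on success, -1 on error. */
static int parse_latency(const char *s, int32_t *out)
{
	char *end;
	long v;

	if (!strcmp(s, "n/a") || !strcmp(s, "n/a\n")) {
		*out = 0;			/* no latency tolerated at all */
		return 0;
	}

	errno = 0;
	v = strtol(s, &end, 0);
	if (errno || end == s)
		return -1;
	if (*end == '\n')			/* allow one trailing newline */
		end++;
	if (*end != '\0')
		return -1;

	/* Negative values and the raw "no constraint" value are rejected. */
	if (v < 0 || v >= NO_CONSTRAINT)
		return -1;

	*out = v ? (int32_t)v : NO_CONSTRAINT;	/* "0" means no constraint */
	return 0;
}
```

With requests { NO_CONSTRAINT, 100, 250 } the sketch reports 100, whereas under the old encoding a 0 request would have sorted first and been read back as "no restriction".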

Re: [PATCH v17 3/6] mm/balloon_compaction.c: split balloon page allocation and enqueue

2017-11-03 Thread Tetsuo Handa

Wei Wang wrote:
> Here's a detailed analysis of the deadlock by Tetsuo Handa:
> 
> In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
> serialize against fill_balloon(). But in fill_balloon(),
> alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
> called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
> implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
> is specified, this allocation attempt might indirectly depend on somebody
> else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
> __GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
> virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
> out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
> mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
> will cause OOM lockup. Thus, do not wait for vb->balloon_lock mutex if
> leak_balloon() is called from out_of_memory().

Please drop "Thus, do not wait for vb->balloon_lock mutex if leak_balloon()
is called from out_of_memory()." part. This is not what this patch will do.

> 
> Thread1                                Thread2
> fill_balloon()
>  takes a balloon_lock
>   balloon_page_enqueue()
>    alloc_page(GFP_HIGHUSER_MOVABLE)
>     direct reclaim (__GFP_FS context)   takes a fs lock
>      waits for that fs lock              alloc_page(GFP_NOFS)
>                                           __alloc_pages_may_oom()
>                                            takes the oom_lock
>                                             out_of_memory()
>                                              blocking_notifier_call_chain()
>                                               leak_balloon()
>                                                tries to take that
>                                                balloon_lock and deadlocks
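
Going by the subject line, the fix is to split page allocation from enqueueing, so that nothing is allocated while balloon_lock is held. Below is a minimal user-space sketch of that pattern, assuming this reading of the subject is right; it is not the virtio_balloon code. The pthread mutex stands in for vb->balloon_lock, malloc() stands in for alloc_page(), and struct balloon, balloon_alloc_batch(), balloon_enqueue_batch() and balloon_fill() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ballooned page. */
struct page_node {
	struct page_node *next;
};

/* Hypothetical stand-in for the balloon device state. */
struct balloon {
	pthread_mutex_t lock;		/* plays the role of vb->balloon_lock */
	struct page_node *pages;	/* pages currently held by the balloon */
	size_t nr_pages;
};

/*
 * Step 1: allocate with no lock held.  Whatever the allocator ends up
 * waiting for, it is not waiting while this thread owns the balloon lock.
 */
static struct page_node *balloon_alloc_batch(size_t want, size_t *got)
{
	struct page_node *head = NULL;
	size_t n = 0;

	while (n < want) {
		struct page_node *p = malloc(sizeof(*p));

		if (!p)
			break;
		p->next = head;
		head = p;
		n++;
	}
	*got = n;
	return head;
}

/* Step 2: hold the lock only for the list splice, which cannot block on memory. */
static void balloon_enqueue_batch(struct balloon *b, struct page_node *head,
				  size_t n)
{
	struct page_node *tail = head;

	assert(b != NULL);
	if (!head)
		return;
	while (tail->next)		/* find the tail before locking */
		tail = tail->next;

	pthread_mutex_lock(&b->lock);
	tail->next = b->pages;
	b->pages = head;
	b->nr_pages += n;
	pthread_mutex_unlock(&b->lock);
}

/* Fill path: allocate first, then enqueue under the lock. */
static size_t balloon_fill(struct balloon *b, size_t want)
{
	size_t got;
	struct page_node *batch = balloon_alloc_batch(want, &got);

	balloon_enqueue_batch(b, batch, got);
	return got;
}
```

In this arrangement a path equivalent to leak_balloon() that needs the lock waits only for a short list update, never for an allocation that may itself depend on reclaim.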
