date:20180412

Re: [PATCH v3 2/2] vfio: platform: Add generic DT reset support

2018-04-12 Thread Simon Horman

On Wed, Apr 11, 2018 at 11:15:49AM +0200, Geert Uytterhoeven wrote:
> Vfio-platform requires reset support, provided either by ACPI, or, on DT
> platforms, by a device-specific reset driver matching against the
> device's compatible value.
> 
> On many SoCs, devices are connected to an SoC-internal reset controller.
> If the reset hierarchy is described in DT using "resets" properties,
> such devices can be reset in a generic way through the reset controller
> subsystem.  Hence add support for this, avoiding the need to write
> device-specific reset drivers for each single device on affected SoCs.
> 
> Devices that do require a more complex reset procedure can still provide
> a device-specific reset driver, as that takes precedence.
> 
> Note that this functionality depends on CONFIG_RESET_CONTROLLER=y, and
> becomes a no-op (as in: "No reset function found for device") if reset
> controller support is disabled.
> 
> Signed-off-by: Geert Uytterhoeven 
> Reviewed-by: Philipp Zabel 

Reviewed-by: Simon Horman

Re: [PATCH] sched/fair: schedutil: update only with all info available

2018-04-12 Thread Vincent Guittot

Hi Joel,

On 11 April 2018 at 23:34, Joel Fernandes  wrote:
> Hi Vincent,
>
> On Wed, Apr 11, 2018 at 4:56 AM, Vincent Guittot
>  wrote:
>> On 11 April 2018 at 12:15, Patrick Bellasi  wrote:
>>> On 11-Apr 08:57, Vincent Guittot wrote:
>>>> On 10 April 2018 at 13:04, Patrick Bellasi  wrote:
>>>> > On 09-Apr 10:51, Vincent Guittot wrote:
>>>> >> On 6 April 2018 at 19:28, Patrick Bellasi  wrote:
>>>> >> Peter,
>>>> >> what was your goal with adding the condition "if
>>>> >> (rq->cfs.h_nr_running)" for the aggregation of CFS utilization
>>>> >
>>>> > The original intent was to get rid of sched class flags, used to track
>>>> > which class has tasks runnable from within schedutil. The reason was
>>>> > to solve some misalignment between scheduler class status and
>>>> > schedutil status.
>>>>
>>>> This was mainly for RT tasks but it was not the case for CFS tasks
>>>> before commit 8f111bc357aa
>>>
>>> True, but with his solution Peter has actually come up with a unified
>>> interface which is now (and can be IMO) based just on RUNNABLE
>>> counters for each class.
>>
>> But do we really want to only take care of the runnable counter for all classes?
>>
>>>
>>>> > The solution, initially suggested by Viresh, and finally proposed by
>>>> > Peter was to exploit RQ knowledge directly from within schedutil.
>>>> >
>>>> > The problem is that now schedutil updates depend on two pieces of
>>>> > information: utilization changes and number of RT and CFS runnable tasks.
>>>> >
>>>> > Thus, using cfs_rq::h_nr_running is not the problem... it's actually
>>>> > part of a much cleaner solution than the code we used to have.
>>>>
>>>> So there are 2 problems there:
>>>> - using cfs_rq::h_nr_running when aggregating cfs utilization which
>>>> generates a lot of frequency drops
>>>
>>> You mean because we now completely disregard the blocked utilization
>>> where a CPU is idle, right?
>>
>> yes
>>
>>>
>>> Given how PELT works and the recent support for IDLE CPUs updated, we
>>> should probably always add contributions for the CFS class.
>>>
>>>> - making sure that the nr-running are up-to-date when used in sched_util
>>>
>>> Right... but, if we always add the cfs_rq (to always account for
>>> blocked utilization), we don't have anymore this last dependency,
>>> isn't it?
>>
>> yes
>>
>>>
>>> We still have to account for the util_est dependency.
>>>
>>> Should I add a patch to this series to disregard cfs_rq::h_nr_running
>>> from schedutil as you suggested?
>>
>> It's probably better to have a separate patch as these are 2 different topics
>> - when updating cfs_rq::h_nr_running and when calling cpufreq_update_util
>> - should we use runnable or running utilization for CFS
>
> By runnable you don't mean sched_avg::load_avg right? I got a bit
> confused, since runnable means load_avg and running means util_avg.

Sorry for the confusion. By runnable utilization, I meant taking into
account the number of running tasks (cfs_rq::h_nr_running), as is
done by commit 8f111bc357a.


> But I didn't follow here why we are talking about load_avg for
> schedutil at all. I am guessing by "runnable" you mean h_nr_running !=
> 0.

yes

>
> Also that aside, the "running util" is what was used to drive the CFS
> util before Peter's patch (8f111bc357a). That was accounting the
> blocked and decaying utilization but that patch changed the behavior.
> It seems logical we should just use that not check for h_nr_running
> for CFS so we don't miss on the decayed  utilization. What is the use
> of checking h_nr_running or state of runqueue for CFS? I am sure to be
> missing something here. :-(

As Peter mentioned, the change in commit 8f111bc357a was to remove
the test that was arbitrarily removing the util_avg of a CPU that had
not been updated since a tick.

But with the update of blocked idle load, we don't need to handle the
case of stalled load/utilization.

>
> thanks!
>
> - Joel
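
The two options discussed in this thread can be contrasted with a small
userspace sketch in C. This is not the schedutil code: struct rq_model,
its fields and both helper names are hypothetical, and the model ignores
the RT and DL classes. It only shows that gating the CFS contribution on
h_nr_running drops the blocked (still decaying) utilization of an idle
CPU, while always adding it keeps that utilization visible.

```c
/* Hypothetical, simplified stand-in for a runqueue; not the kernel's struct rq. */
struct rq_model {
	unsigned int cfs_h_nr_running;	/* number of runnable CFS tasks */
	unsigned long cfs_util_avg;	/* utilization, including the blocked part */
	unsigned long max_cap;		/* capacity the result is clamped to */
};

/* Variant 1: count CFS utilization only while CFS tasks are runnable. */
static inline unsigned long cfs_util_gated(const struct rq_model *rq)
{
	unsigned long util = 0;

	if (rq->cfs_h_nr_running)
		util = rq->cfs_util_avg;

	return util < rq->max_cap ? util : rq->max_cap;
}

/* Variant 2: always count it, so blocked utilization still contributes. */
static inline unsigned long cfs_util_always(const struct rq_model *rq)
{
	unsigned long util = rq->cfs_util_avg;

	return util < rq->max_cap ? util : rq->max_cap;
}
```

With no runnable CFS task and a non-zero cfs_util_avg, the gated variant
reports 0 while the other variant reports the decaying value, which is
the source of the frequency drops mentioned above.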

Re: [RFC PATCH v2 1/6] sched/fair: Create util_fits_capacity()

2018-04-12 Thread Viresh Kumar

On 06-04-18, 16:36, Dietmar Eggemann wrote:
> The functionality that a given utilization fits into a given capacity
> is factored out into a separate function.
> 
> Currently it is only used in wake_cap() but will be re-used to figure
> out if a cpu or a scheduler group is over-utilized.
> 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Signed-off-by: Dietmar Eggemann 
> ---
>  kernel/sched/fair.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c58d2f..0a76ad2ef022 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6574,6 +6574,11 @@ static unsigned long cpu_util_wake(int cpu, struct task_struct *p)
>   return min_t(unsigned long, util, capacity_orig_of(cpu));
>  }
>  
> +static inline int util_fits_capacity(unsigned long util, unsigned long capacity)
> +{
> + return capacity * 1024 > util * capacity_margin;

This changes the behavior slightly compared to existing code. If that
wasn't intentional, perhaps you should use >= here.

> +}
> +
>  /*
>   * Disable WAKE_AFFINE in the case where task @p doesn't fit in the
>   * capacity of either the waking CPU @cpu or the previous CPU @prev_cpu.
> @@ -6595,7 +6600,7 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
>   /* Bring task utilization in sync with prev_cpu */
>   sync_entity_load_avg(&p->se);
>  
> - return min_cap * 1024 < task_util(p) * capacity_margin;
> + return !util_fits_capacity(task_util(p), min_cap);
>  }
>  
>  /*
> -- 
> 2.11.0

-- 
viresh
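
The behavior change pointed out above only shows up when both sides of
the comparison are equal. A minimal userspace sketch, assuming a
capacity_margin of 1280 (roughly a 20% margin; the value is an
assumption here, and the helper names other than util_fits_capacity()
are hypothetical):

```c
/* Assumed margin, for illustration only. */
static const unsigned long capacity_margin = 1280;

/* The test wake_cap() used before the patch: non-zero means "does not fit". */
static inline int old_wake_cap_misfit(unsigned long util, unsigned long min_cap)
{
	return min_cap * 1024 < util * capacity_margin;
}

/* The helper as posted in the patch (strict comparison). */
static inline int util_fits_capacity(unsigned long util, unsigned long capacity)
{
	return capacity * 1024 > util * capacity_margin;
}

/* The variant suggested in the review (inclusive comparison). */
static inline int util_fits_capacity_ge(unsigned long util, unsigned long capacity)
{
	return capacity * 1024 >= util * capacity_margin;
}
```

For util = 1024 and capacity = 1280 both products are equal: the old
test says the task fits, !util_fits_capacity() says it does not, and
!util_fits_capacity_ge() agrees with the old test again.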

Re: [PATCH] checkpatch: Add a --strict test for structs with bool member definitions

2018-04-12 Thread Julia Lawall



On Wed, 11 Apr 2018, Joe Perches wrote:

> On Thu, 2018-04-12 at 08:22 +0200, Julia Lawall wrote:
> > On Wed, 11 Apr 2018, Joe Perches wrote:
> > > On Wed, 2018-04-11 at 09:29 -0700, Andrew Morton wrote:
> > > > We already have some 500 bools-in-structs
> > >
> > > I got at least triple that only in include/
> > > so I expect there are probably an order
> > > of magnitude more than 500 in the kernel.
> > >
> > > I suppose some cocci script could count the
> > > actual number of instances.  A regex can not.
> >
> > I got 12667.
>
> Could you please post the cocci script?

Sure.

julia


Command line:
spatch.opt boolinstruct.cocci -j 40 --very-quiet --no-includes --include-headers /run/shm/linux-next --use-idutils

This was tested on:

struct foo {
  bool a;
  bool b,c;
  int r;
};

struct {
  bool a;
  bool b,c;
  int r;
} x;

--

@initialize:ocaml@
@@
let ctr = ref 0

@r@
identifier i,x;
position p;
@@

struct i {
  ...
  bool x@p;
  ...
}

@script:ocaml@
_p << r.p;
@@

ctr := !ctr + 1

@s@
identifier x;
position p;
@@

struct {
  ...
  bool x@p;
  ...
}

@script:ocaml@
_p << s.p;
@@

ctr := !ctr + 1

@finalize:ocaml@
ctrs << merge.ctr;
@@

ctr := 0;
List.iter (function c -> ctr := !c + !ctr) ctrs;
Printf.printf "%d\n" !ctr

Re: [PATCH] vfio: platform: Fix using devices in PM Domains

2018-04-12 Thread Simon Horman

On Wed, Apr 11, 2018 at 11:24:08AM +0200, Geert Uytterhoeven wrote:
> If a device is part of a PM Domain (e.g. power and/or clock domain), its
> power state is managed using Runtime PM.  Without Runtime PM, the device
> may not be powered up, causing subtle failures, crashes, or system
> lock-ups when the device is accessed by the guest.
> 
> Fix this by adding Runtime PM support, powering the device when the VFIO
> device is opened by the guest.
> 
> Note that while more fine-grained power management could be implemented
> on the guest side, if exported, this would be inherently unsafe, as
> abusing it may kill the whole system.
> 
> Signed-off-by: Geert Uytterhoeven 

Reviewed-by: Simon Horman

Re: [PATCH 8/9] mtd: nand: qcom: helper function for raw read

2018-04-12 Thread Abhishek Sahu


On 2018-04-10 15:14, Miquel Raynal wrote:

> Hi Abhishek,
>
> On Wed,  4 Apr 2018 18:12:24 +0530, Abhishek Sahu
>  wrote:
>
>> This patch does minor code reorganization for raw reads.
>> Currently the raw read is required for complete page but for
>> subsequent patches related with erased codeword bit flips
>> detection, only few CW should be read. So, this patch adds
>> helper function and introduces the read CW bitmask which
>> specifies which CW reads are required in complete page.
>
> I am not sure this is the right approach for subpage reads. If the
> controller supports it, you should just enable it in chip->options.

 Thanks Miquel.

 It is internal to this file only. This patch makes one static helper
 function which provides the support to read subpages.



Signed-off-by: Abhishek Sahu 
---
 drivers/mtd/nand/qcom_nandc.c | 186 +-
 1 file changed, 110 insertions(+), 76 deletions(-)

diff --git a/drivers/mtd/nand/qcom_nandc.c b/drivers/mtd/nand/qcom_nandc.c
index 40c790e..f5d1fa4 100644
--- a/drivers/mtd/nand/qcom_nandc.c
+++ b/drivers/mtd/nand/qcom_nandc.c
@@ -1590,6 +1590,114 @@ static int check_flash_errors(struct qcom_nand_host *host, int cw_cnt)

 }

 /*
+ * Helper to perform the page raw read operation. The read_cw_mask will be
+ * used to specify the codewords for which the data should be read. The
+ * single page contains multiple CW. Sometime, only few CW data is required
+ * in complete page. Also, start address will be determined with
+ * this CW mask to skip unnecessary data copy from NAND flash device. Then,
+ * actual data copy from NAND controller to data buffer will be done only
+ * for the CWs which have the mask set.
+ */
+static int
+nandc_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
+   u8 *data_buf, u8 *oob_buf,
+   int page, unsigned long read_cw_mask)
+{
+   struct qcom_nand_host *host = to_qcom_nand_host(chip);
+   struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
+   struct nand_ecc_ctrl *ecc = &chip->ecc;
+   int i, ret;
+   int read_loc, start_step, last_step;
+
+   nand_read_page_op(chip, page, 0, NULL, 0);
+
+   host->use_ecc = false;
+   start_step = ffs(read_cw_mask) - 1;
+   last_step = fls(read_cw_mask);
+
+   clear_bam_transaction(nandc);
+   set_address(host, host->cw_size * start_step, page);
+   update_rw_regs(host, last_step - start_step, true);
+   config_nand_page_read(nandc);
+
+   for (i = start_step; i < last_step; i++) {
+   int data_size1, data_size2, oob_size1, oob_size2;
+   int reg_off = FLASH_BUF_ACC;
+
+   data_size1 = mtd->writesize - host->cw_size * (ecc->steps - 1);
+   oob_size1 = host->bbm_size;
+
+   if (i == (ecc->steps - 1)) {
+   data_size2 = ecc->size - data_size1 -
+((ecc->steps - 1) << 2);
+   oob_size2 = (ecc->steps << 2) + host->ecc_bytes_hw +
+   host->spare_bytes;
+   } else {
+   data_size2 = host->cw_data - data_size1;
+   oob_size2 = host->ecc_bytes_hw + host->spare_bytes;
+   }
+
+   /*
+* Don't perform actual data copy from NAND controller to data
+* buffer through DMA for this codeword
+*/
+   if (!(read_cw_mask & BIT(i))) {
+   if (nandc->props->is_bam)
+   nandc_set_read_loc(nandc, 0, 0, 0, 1);
+
+   config_nand_cw_read(nandc, false);
+
+   data_buf += data_size1 + data_size2;
+   oob_buf += oob_size1 + oob_size2;
+
+   continue;
+   }
+
+   if (nandc->props->is_bam) {
+   read_loc = 0;
+   nandc_set_read_loc(nandc, 0, read_loc, data_size1, 0);
+   read_loc += data_size1;
+
+   nandc_set_read_loc(nandc, 1, read_loc, oob_size1, 0);
+   read_loc += oob_size1;
+
+   nandc_set_read_loc(nandc, 2, read_loc, data_size2, 0);
+   read_loc += data_size2;
+
+   nandc_set_read_loc(nandc, 3, read_loc, oob_size2, 1);
+   }
+
+   config_nand_cw_read(nandc, false);
+
+   read_data_dma(nandc, reg_off, data_buf, data_size1, 0);
+   reg_off += data_size1;
+   data_buf += data_size1;
+
+   read_data_dma(nandc, reg_off, oob_buf, oob_size1, 0);
+   reg_off += oob_size1;
+   oob_buf += oob_size1;
+
+   read_data_dma(nandc, reg_off, data_buf, data_size2, 0);
+   reg_off += data_size2;
+   data_buf += data_size2;
+
+   read_data_dma(nandc, reg_off, oob_buf, oob
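
The way the helper above derives its loop bounds from read_cw_mask can
be illustrated in userspace. The sketch below is an assumption-laden
model, not driver code: cw_first(), cw_end() and cw_copied() are
hypothetical names that mirror "ffs(read_cw_mask) - 1",
"fls(read_cw_mask)" and the "read_cw_mask & BIT(i)" test, written as
plain loops so that no kernel or compiler builtin is needed.

```c
/* Index of the lowest set bit, or -1 for an empty mask (like ffs() - 1). */
static inline int cw_first(unsigned long mask)
{
	int i;

	for (i = 0; i < (int)(8 * sizeof(mask)); i++)
		if (mask & (1UL << i))
			return i;
	return -1;
}

/* One past the index of the highest set bit, or 0 for an empty mask (like fls()). */
static inline int cw_end(unsigned long mask)
{
	int n = 0;

	while (mask) {
		n++;
		mask >>= 1;
	}
	return n;
}

/* Number of codewords in [cw_first, cw_end) whose data would be copied. */
static inline int cw_copied(unsigned long mask)
{
	int i, copied = 0;

	for (i = cw_first(mask); i >= 0 && i < cw_end(mask); i++)
		if (mask & (1UL << i))
			copied++;
	return copied;
}
```

For a mask of 0x9 the range covers codewords 0 to 3, but only codewords
0 and 3 are copied; codewords 1 and 2 fall inside the range and are
stepped over without a data copy, which matches the "continue" branch in
the quoted loop.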

Re: x86-tip.today (4cdf573) early instaboot

2018-04-12 Thread Ingo Molnar


* Mike Galbraith  wrote:

> On Tue, 2018-04-10 at 09:06 -0500, Tom Lendacky wrote:
> > 
> > Just out of curiosity, can you try the following patch and see if it
> > fixes your reboot issue:
> 
> Yup, all better.
> 
> > diff --git a/arch/x86/boot/compressed/kaslr.c
> > b/arch/x86/boot/compressed/kaslr.c
> > index c5196d2..a0a50b9 100644
> > --- a/arch/x86/boot/compressed/kaslr.c
> > +++ b/arch/x86/boot/compressed/kaslr.c
> > @@ -55,7 +55,7 @@
> >  extern unsigned long get_cmd_line_ptr(void);
> > 
> >  /* Used by PAGE_KERN* macros: */
> > -pteval_t __default_kernel_pte_mask __read_mostly;
> > +pteval_t __default_kernel_pte_mask __read_mostly = ~0;
> > 
> >  /* Simplified build-specific string for starting entropy. */
> >  static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"

Thanks guys!

I ended up back-merging this fix (and another fix) into:

  fb43d6cb91ef: x86/mm: Do not auto-massage page protections

I added credits as:

- printk format warning fix from: Arnd Bergmann
- boot crash fix from:            Tom Lendacky
- crash bisected by:              Mike Galbraith

...

Reported-and-fixed-by: Arnd Bergmann 
Fixed-by: Tom Lendacky 
Bisected-by: Mike Galbraith 

Thanks,

Ingo
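
One plausible reading of why the one-line "= ~0" change matters: if the
PAGE_KERNEL* macros AND the raw protection bits with
__default_kernel_pte_mask, a zero-initialized mask strips every bit,
including the present bit. The sketch below models that in userspace.
All MODEL_* names and model_page_kernel() are hypothetical; only the bit
positions (present = bit 0, RW = bit 1, global = bit 8) follow the x86
page table format.

```c
#include <stdint.h>

typedef uint64_t pteval_model_t;	/* stand-in for pteval_t */

#define MODEL_PAGE_PRESENT	(1ULL << 0)
#define MODEL_PAGE_RW		(1ULL << 1)
#define MODEL_PAGE_GLOBAL	(1ULL << 8)
#define MODEL_PAGE_KERNEL_RAW \
	(MODEL_PAGE_PRESENT | MODEL_PAGE_RW | MODEL_PAGE_GLOBAL)

/* Models a PAGE_KERNEL-style macro: raw bits filtered by a default mask. */
static inline pteval_model_t model_page_kernel(pteval_model_t default_mask)
{
	return MODEL_PAGE_KERNEL_RAW & default_mask;
}
```

With a mask of 0 the result is 0, so a mapping built from it would not
even be present. With a mask of ~0 nothing is filtered, and with
~MODEL_PAGE_GLOBAL only the global bit is removed.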

[PATCH 1/2] nohz: Align types to bool for tick_nohz_tick_stopped()

2018-04-12 Thread Ulf Hansson

The intent was likely to also make the inline version a bool, so let's
change this.

Fixes: a364298359e7 ("nohz: Convert tick_nohz_tick_stopped() to bool")
Signed-off-by: Ulf Hansson 
---

Changes in v2:
- Rebased on top of Linus' latest master to apply cleanly.

---
 include/linux/tick.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 55388ab..4132eba 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -137,7 +137,7 @@ static inline void tick_nohz_idle_stop_tick_protected(void)
 
 #else /* !CONFIG_NO_HZ_COMMON */
 #define tick_nohz_enabled (0)
-static inline int tick_nohz_tick_stopped(void) { return 0; }
+static inline bool tick_nohz_tick_stopped(void) { return false; }
 static inline int tick_nohz_tick_stopped_cpu(int cpu) { return 0; }
 static inline void tick_nohz_idle_stop_tick(void) { }
 static inline void tick_nohz_idle_retain_tick(void) { }
-- 
2.7.4

Re: [PATCH 0/3] Refactor TPM event log code

2018-04-12 Thread Jarkko Sakkinen

On Wed, Apr 11, 2018 at 02:54:58PM +0200, Thiebaud Weksteen wrote:
> This patchset implements the proposal from Jarkko Sakkinen [1]. I have
> included the feedback from Nayna Jain about the function naming.
> 
> [1] https://lkml.kernel.org/r/20171024222148.gwnkj5vqsyj43...@linux.intel.com

You could add suggested-by to these commits.

/Jarkko

[PATCH 2/2] nohz: Align types to bool for tick_nohz_tick_stopped_cpu()

2018-04-12 Thread Ulf Hansson

The intent was likely to also make the inline version a bool, so let's
change this.

Fixes: 22ab8bc02a5f ("nohz: Allow to check if remote CPU tick is stopped")
Signed-off-by: Ulf Hansson 
---

Changes in v2:
- Rebased on top of Linus' latest master to apply cleanly.

---
 include/linux/tick.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 4132eba..389aa25 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -138,7 +138,7 @@ static inline void tick_nohz_idle_stop_tick_protected(void)
 #else /* !CONFIG_NO_HZ_COMMON */
 #define tick_nohz_enabled (0)
 static inline bool tick_nohz_tick_stopped(void) { return false; }
-static inline int tick_nohz_tick_stopped_cpu(int cpu) { return 0; }
+static inline bool tick_nohz_tick_stopped_cpu(int cpu) { return false; }
 static inline void tick_nohz_idle_stop_tick(void) { }
 static inline void tick_nohz_idle_retain_tick(void) { }
 static inline void tick_nohz_idle_restart_tick(void) { }
-- 
2.7.4

Re: [PATCH 5/9] mtd: nand: qcom: parse read errors for read oob also

2018-04-12 Thread Abhishek Sahu


On 2018-04-10 15:33, Miquel Raynal wrote:

> Hi Abhishek,
>
> On Wed,  4 Apr 2018 18:12:21 +0530, Abhishek Sahu
>  wrote:
>
>> read_page and read_oob both call the read_page_ecc function.
>> The QCOM NAND controller protects the OOB available bytes with
>> ECC so read errors should be checked for read_oob also. Now
>> this patch moves the error checking code inside read_page_ecc
>> so the caller does not have to check explicitly for read errors.
>>
>> Signed-off-by: Abhishek Sahu
>
> Nitpick: the prefix should be "mtd: rawnand: qcom: " now as this driver
> has been moved to drivers/mtd/nand/raw/.
>
> Otherwise:
> Reviewed-by: Miquel Raynal

 Thanks Miquel for your review.

 I will update the prefix in this patch and the other patches,
 and rebase my patches over 4.17-rc1 once it's available.

 Thanks,
 Abhishek

Re: [PATCH 3/3] tpm: Move eventlog declarations to its own header

2018-04-12 Thread Jarkko Sakkinen

On Wed, Apr 11, 2018 at 02:55:01PM +0200, Thiebaud Weksteen wrote:
> Reduce the size of tpm.h by moving eventlog declarations to a separate
> header.
> 
> Signed-off-by: Thiebaud Weksteen 

Will be fine with suggested-by added.

/Jarkko

[tip:x86/pti] x86/mm: Do not auto-massage page protections

2018-04-12 Thread tip-bot for Dave Hansen

Commit-ID:  fb43d6cb91ef57d9e58d5f69b423784ff4a4c374
Gitweb: https://git.kernel.org/tip/fb43d6cb91ef57d9e58d5f69b423784ff4a4c374
Author: Dave Hansen 
AuthorDate: Fri, 6 Apr 2018 13:55:09 -0700
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:04:22 +0200

x86/mm: Do not auto-massage page protections

A PTE is constructed from a physical address and a pgprotval_t.
__PAGE_KERNEL, for instance, is a pgprot_t and must be converted
into a pgprotval_t before it can be used to create a PTE.  This is
done implicitly within functions like pfn_pte() by massage_pgprot().

However, this makes it very challenging to set bits (and keep them
set) if your bit is being filtered out by massage_pgprot().

This moves the bit filtering out of pfn_pte() and friends.  For
users of PAGE_KERNEL*, filtering will be done automatically inside
those macros but for users of __PAGE_KERNEL*, they need to do their
own filtering now.

Note that we also just move pfn_pte/pmd/pud() over to check_pgprot()
instead of massage_pgprot().  This way, we still *look* for
unsupported bits and properly warn about them if we find them.  This
might happen if an unfiltered __PAGE_KERNEL* value was passed in,
for instance.

- printk format warning fix from: Arnd Bergmann
- boot crash fix from:            Tom Lendacky
- crash bisected by:              Mike Galbraith

Signed-off-by: Dave Hansen 
Reported-and-fixed-by: Arnd Bergmann 
Fixed-by: Tom Lendacky 
Bisected-by: Mike Galbraith 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Arjan van de Ven 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Kees Cook 
Cc: Linus Torvalds 
Cc: Nadav Amit 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180406205509.77e1d...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/boot/compressed/kaslr.c |  3 +++
 arch/x86/include/asm/pgtable.h   | 27 ++-
 arch/x86/kernel/head64.c |  2 ++
 arch/x86/kernel/ldt.c|  6 +-
 arch/x86/mm/ident_map.c  |  3 +++
 arch/x86/mm/iomap_32.c   |  6 ++
 arch/x86/mm/ioremap.c|  3 +++
 arch/x86/mm/kasan_init_64.c  | 14 +-
 arch/x86/mm/pgtable.c|  3 +++
 arch/x86/power/hibernate_64.c| 20 +++-
 10 files changed, 75 insertions(+), 12 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 66e42a098d70..a0a50b91ecef 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -54,6 +54,9 @@ unsigned int ptrs_per_p4d __ro_after_init = 1;
 
 extern unsigned long get_cmd_line_ptr(void);
 
+/* Used by PAGE_KERN* macros: */
+pteval_t __default_kernel_pte_mask __read_mostly = ~0;
+
 /* Simplified build-specific string for starting entropy. */
 static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 89d5c8886c85..5f49b4ff0c24 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -526,22 +526,39 @@ static inline pgprotval_t massage_pgprot(pgprot_t pgprot)
return protval;
 }
 
+static inline pgprotval_t check_pgprot(pgprot_t pgprot)
+{
+   pgprotval_t massaged_val = massage_pgprot(pgprot);
+
+   /* mmdebug.h can not be included here because of dependencies */
+#ifdef CONFIG_DEBUG_VM
+   WARN_ONCE(pgprot_val(pgprot) != massaged_val,
+ "attempted to set unsupported pgprot: %016llx "
+ "bits: %016llx supported: %016llx\n",
+ (u64)pgprot_val(pgprot),
+ (u64)pgprot_val(pgprot) ^ massaged_val,
+ (u64)__supported_pte_mask);
+#endif
+
+   return massaged_val;
+}
+
 static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 {
return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
-massage_pgprot(pgprot));
+check_pgprot(pgprot));
 }
 
 static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
 {
return __pmd(((phys_addr_t)page_nr << PAGE_SHIFT) |
-massage_pgprot(pgprot));
+check_pgprot(pgprot));
 }
 
 static inline pud_t pfn_pud(unsigned long page_nr, pgprot_t pgprot)
 {
return __pud(((phys_addr_t)page_nr << PAGE_SHIFT) |
-massage_pgprot(pgprot));
+check_pgprot(pgprot));
 }
 
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
@@ -553,7 +570,7 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 * the newprot (if present):
 */
val &= _PAGE_CHG_MASK;
-   val |= massage_pgprot(newprot) & ~_PAGE_CHG_MASK;
+   val |= check_pgprot(newprot) & ~_PAGE_CHG_MASK;
 
return __pte(val);
 }
@@ -563,7

[tip:x86/pti] x86/mm: Remove extra filtering in pageattr code

2018-04-12 Thread tip-bot for Dave Hansen

Commit-ID:  1a54420aeb4da1ba5b28283aa5696898220c9a27
Gitweb: https://git.kernel.org/tip/1a54420aeb4da1ba5b28283aa5696898220c9a27
Author: Dave Hansen 
AuthorDate: Fri, 6 Apr 2018 13:55:11 -0700
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:05:58 +0200

x86/mm: Remove extra filtering in pageattr code

The pageattr code has a mode where it can set or clear PTE bits in
existing PTEs, so the page protections of the *new* PTEs come from
one of two places:

  1. The set/clear masks: cpa->mask_clr / cpa->mask_set
  2. The existing PTE

We filter ->mask_set/clr for supported PTE bits at entry to
__change_page_attr() so we never need to filter them again.

The only other place permissions can come from is an existing PTE
and those already presumably have good bits.  We do not need to filter
them again.
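
The argument above can be sketched in userspace C. This is a simplified
model under stated assumptions, not the pageattr code: struct cpa_model,
cpa_filter_masks() and cpa_new_prot() are hypothetical names, and
"supported" stands in for the supported PTE bit mask. If the set/clear
masks are filtered once and the existing PTE contains only supported
bits, the combined result cannot contain an unsupported bit.

```c
#include <stdint.h>

/* Hypothetical stand-in for the mask_set/mask_clr pair in struct cpa_data. */
struct cpa_model {
	uint64_t mask_set;
	uint64_t mask_clr;
};

/* Filter both masks once, at entry. */
static inline void cpa_filter_masks(struct cpa_model *cpa, uint64_t supported)
{
	cpa->mask_set &= supported;
	cpa->mask_clr &= supported;
}

/* New protections: existing bits, minus cleared bits, plus set bits. */
static inline uint64_t cpa_new_prot(uint64_t old_prot, const struct cpa_model *cpa)
{
	return (old_prot & ~cpa->mask_clr) | cpa->mask_set;
}
```

For example, with bit 8 unsupported, a request to set bits 1 and 8 on an
existing value of 0x1 yields 0x3 after filtering, so no second filtering
pass is needed on the result.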

Signed-off-by: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Arjan van de Ven 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Kees Cook 
Cc: Linus Torvalds 
Cc: Nadav Amit 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180406205511.bc072...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/mm/pageattr.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index d3442dfdfced..968f51a2e39b 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -598,7 +598,6 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
req_prot = pgprot_clear_protnone_bits(req_prot);
if (pgprot_val(req_prot) & _PAGE_PRESENT)
pgprot_val(req_prot) |= _PAGE_PSE;
-   req_prot = canon_pgprot(req_prot);
 
/*
 * old_pfn points to the large page base pfn. So we need
@@ -718,7 +717,7 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
 */
pfn = ref_pfn;
for (i = 0; i < PTRS_PER_PTE; i++, pfn += pfninc)
-   set_pte(&pbase[i], pfn_pte(pfn, canon_pgprot(ref_prot)));
+   set_pte(&pbase[i], pfn_pte(pfn, ref_prot));
 
if (virt_addr_valid(address)) {
unsigned long pfn = PFN_DOWN(__pa(address));
@@ -935,7 +934,6 @@ static void populate_pte(struct cpa_data *cpa,
pte = pte_offset_kernel(pmd, start);
 
pgprot = pgprot_clear_protnone_bits(pgprot);
-   pgprot = canon_pgprot(pgprot);
 
while (num_pages-- && start < end) {
set_pte(pte, pfn_pte(cpa->pfn, pgprot));
@@ -1234,7 +1232,7 @@ repeat:
 * after all we're only going to change it's attributes
 * not the memory it points to
 */
-   new_pte = pfn_pte(pfn, canon_pgprot(new_prot));
+   new_pte = pfn_pte(pfn, new_prot);
cpa->pfn = pfn;
/*
 * Do we really change anything ?

Re: [GIT PULL] asm-generic fixes for v4.17-rc1

2018-04-12 Thread Arnd Bergmann

On Thu, Apr 12, 2018 at 1:16 AM, Linus Torvalds
 wrote:
> On Wed, Apr 11, 2018 at 8:40 AM, Arnd Bergmann  wrote:
>>
>> are available in the git repository at:
>>
>>   
>> git+ssh://gitol...@ra.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git
>> tags/asm-generic
>
> Yeah, no they aren't available there at all.
>
> That tag is some old tag from two years ago that just contains your
> ancient "asm-generic: use compat version for preadv2 and pwritev2".
>
> Forgot to push out? Or forgot to use "-f" to overwrite the old tag?

Yes, something like that, I first tagged the local branch in the wrong
tree which had an old branch of the same name, noticed my mistake
as I pushed it, but then screwed up again when I tried to correct it:
I force-pushed the correct branch again, but not the tag.

Pushed the right tag now, please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git
tags/asm-generic

Arnd

Re: [PATCH 3/3] tpm: Move eventlog declarations to its own header

2018-04-12 Thread Jarkko Sakkinen

On Thu, Apr 12, 2018 at 10:12:30AM +0300, Jarkko Sakkinen wrote:
> On Wed, Apr 11, 2018 at 02:55:01PM +0200, Thiebaud Weksteen wrote:
> > Reduce the size of tpm.h by moving eventlog declarations to a separate
> > header.
> > 
> > Signed-off-by: Thiebaud Weksteen 
> 
> Will be fine with suggested-by added.

I don't see anything else to complain about in the subsequent patches.

Fix the kbuild issue and add the suggested-by tags and I can move on to
testing. Thanks!

/Jarkko

[tip:x86/pti] x86/mm: Comment _PAGE_GLOBAL mystery

2018-04-12 Thread tip-bot for Dave Hansen

Commit-ID:  430d4005b8b41c19966dd3bfdb33004bdb2de01c
Gitweb: https://git.kernel.org/tip/430d4005b8b41c19966dd3bfdb33004bdb2de01c
Author: Dave Hansen 
AuthorDate: Fri, 6 Apr 2018 13:55:13 -0700
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:05:58 +0200

x86/mm: Comment _PAGE_GLOBAL mystery

I was mystified as to where the _PAGE_GLOBAL in the kernel page tables
for kernel text came from.  I audited all the places I could find, but
I missed one: head_64.S.

The page tables that we create in here live for a long time, and they
also have _PAGE_GLOBAL set, despite whether the processor supports it
or not.  It's harmless, and we got *lucky* that the pageattr code
accidentally clears it when we wipe it out of __supported_pte_mask and
then later try to mark kernel text read-only.

Comment some of these properties to make it easier to find and
understand in the future.

Signed-off-by: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Arjan van de Ven 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Kees Cook 
Cc: Linus Torvalds 
Cc: Nadav Amit 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180406205513.079bb...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/head_64.S | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 48385c1074a5..8344dd2f310a 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -399,8 +399,13 @@ NEXT_PAGE(level3_ident_pgt)
.quad   level2_ident_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC
.fill   511, 8, 0
 NEXT_PAGE(level2_ident_pgt)
-   /* Since I easily can, map the first 1G.
+   /*
+* Since I easily can, map the first 1G.
 * Don't set NX because code runs from these pages.
+*
+* Note: This sets _PAGE_GLOBAL despite whether
+* the CPU supports it or it is enabled.  But,
+* the CPU should ignore the bit.
 */
PMDS(0, __PAGE_KERNEL_IDENT_LARGE_EXEC, PTRS_PER_PMD)
 #else
@@ -431,6 +436,10 @@ NEXT_PAGE(level2_kernel_pgt)
 * (NOTE: at +512MB starts the module area, see MODULES_VADDR.
 *  If you want to increase this then increase MODULES_VADDR
 *  too.)
+*
+*  This table is eventually used by the kernel during normal
+*  runtime.  Care must be taken to clear out undesired bits
+*  later, like _PAGE_RW or _PAGE_GLOBAL in some cases.
 */
PMDS(0, __PAGE_KERNEL_LARGE_EXEC,
KERNEL_IMAGE_SIZE/PMD_SIZE)

[tip:x86/pti] x86/mm: Do not forbid _PAGE_RW before init for __ro_after_init

2018-04-12 Thread tip-bot for Dave Hansen

Commit-ID:  639d6aafe437a7464399d2a77d006049053df06f
Gitweb: https://git.kernel.org/tip/639d6aafe437a7464399d2a77d006049053df06f
Author: Dave Hansen 
AuthorDate: Fri, 6 Apr 2018 13:55:14 -0700
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:05:59 +0200

x86/mm: Do not forbid _PAGE_RW before init for __ro_after_init

__ro_after_init data gets stuck in the .rodata section.  That's normally
fine because the kernel itself manages the R/W properties.

But, if we run __change_page_attr() on an area which is __ro_after_init,
the .rodata checks will trigger and force the area to be immediately
read-only, even if it is early-ish in boot.  This caused problems when
trying to clear the _PAGE_GLOBAL bit for these areas in the PTI code:
it cleared _PAGE_GLOBAL like I asked, but also took it upon itself
to clear _PAGE_RW.  The kernel then oopsed the next time it wrote to
a __ro_after_init data structure.

To fix this, add the kernel_set_to_readonly check, just like we have
for kernel text, just a few lines below in this function.
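
The effect of the added check can be modelled in userspace. The sketch
below is only an illustration of the logic described above: the function
names, the pfn range parameters and MODEL_PAGE_RW are hypothetical, and
only the position of the RW bit (bit 1) follows the x86 page table
format.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_RW	(1ULL << 1)

/* Half-open range test: start <= pfn < end. */
static inline bool within_model(unsigned long pfn, unsigned long start,
				unsigned long end)
{
	return pfn >= start && pfn < end;
}

/* Bits that must not be set for this pfn, given the current boot state. */
static inline uint64_t forbidden_bits(unsigned long pfn,
				      unsigned long rodata_start_pfn,
				      unsigned long rodata_end_pfn,
				      bool kernel_set_to_readonly)
{
	uint64_t forbidden = 0;

	/* .rodata (including __ro_after_init) stays writable until marked read-only. */
	if (kernel_set_to_readonly &&
	    within_model(pfn, rodata_start_pfn, rodata_end_pfn))
		forbidden |= MODEL_PAGE_RW;

	return forbidden;
}

/* Strip the forbidden bits from a requested protection value. */
static inline uint64_t apply_static_protections(uint64_t prot, uint64_t forbidden)
{
	return prot & ~forbidden;
}
```

Before the kernel has been marked read-only, a pfn inside the .rodata
range keeps its RW bit; afterwards the same request has the RW bit
removed.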

Signed-off-by: Dave Hansen 
Acked-by: Kees Cook 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Arjan van de Ven 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Linus Torvalds 
Cc: Nadav Amit 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180406205514.8d898...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/mm/pageattr.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 968f51a2e39b..a7324045d87d 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -298,9 +298,11 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
 
/*
 * The .rodata section needs to be read-only. Using the pfn
-* catches all aliases.
+* catches all aliases.  This also includes __ro_after_init,
+* so do not enforce until kernel_set_to_readonly is true.
 */
-   if (within(pfn, __pa_symbol(__start_rodata) >> PAGE_SHIFT,
+   if (kernel_set_to_readonly &&
+   within(pfn, __pa_symbol(__start_rodata) >> PAGE_SHIFT,
   __pa_symbol(__end_rodata) >> PAGE_SHIFT))
pgprot_val(forbidden) |= _PAGE_RW;
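
The effect of the added condition can be modelled outside the kernel. The following is a minimal user-space sketch, not the kernel code: the `sketch_*` names, the range struct, and the function signature are assumptions made for illustration. Only the half-open range test and the fact that the x86 R/W bit is bit 1 are taken as given.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the x86 R/W page-table bit (bit 1). */
#define SKETCH_PAGE_RW 0x2ULL

/* Half-open pfn range [start_pfn, end_pfn), standing in for .rodata. */
struct sketch_range {
	uint64_t start_pfn;
	uint64_t end_pfn;
};

static bool sketch_within(uint64_t pfn, struct sketch_range r)
{
	return pfn >= r.start_pfn && pfn < r.end_pfn;
}

/*
 * Return the bits that may not be set on a mapping of this pfn.
 * Until set_to_readonly becomes true, .rodata (and therefore
 * __ro_after_init data) is not forced read-only.
 */
static uint64_t sketch_forbidden_bits(uint64_t pfn, struct sketch_range rodata,
				      bool set_to_readonly)
{
	uint64_t forbidden = 0;

	if (set_to_readonly && sketch_within(pfn, rodata))
		forbidden |= SKETCH_PAGE_RW;

	return forbidden;
}
```

With the flag false, a pfn inside the range yields no forbidden bits, which corresponds to the early-boot case the commit message describes.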

[tip:x86/pti] x86/pti: Enable global pages for shared areas

2018-04-12 Thread tip-bot for Dave Hansen

Commit-ID:  0f561fce4d6979a50415616896512f87a6d1d5c8
Gitweb: https://git.kernel.org/tip/0f561fce4d6979a50415616896512f87a6d1d5c8
Author: Dave Hansen 
AuthorDate: Fri, 6 Apr 2018 13:55:15 -0700
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:05:59 +0200

x86/pti: Enable global pages for shared areas

The entry/exit text and cpu_entry_area are mapped into userspace and
the kernel.  But, they are not _PAGE_GLOBAL.  This creates unnecessary
TLB misses.

Add the _PAGE_GLOBAL flag for these areas.

Signed-off-by: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Arjan van de Ven 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Kees Cook 
Cc: Linus Torvalds 
Cc: Nadav Amit 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180406205515.2977e...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/mm/cpu_entry_area.c | 14 +-
 arch/x86/mm/pti.c| 23 ++-
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
index 476d810639a8..b45f5aaefd74 100644
--- a/arch/x86/mm/cpu_entry_area.c
+++ b/arch/x86/mm/cpu_entry_area.c
@@ -27,8 +27,20 @@ EXPORT_SYMBOL(get_cpu_entry_area);
 void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags)
 {
unsigned long va = (unsigned long) cea_vaddr;
+   pte_t pte = pfn_pte(pa >> PAGE_SHIFT, flags);
 
-   set_pte_vaddr(va, pfn_pte(pa >> PAGE_SHIFT, flags));
+   /*
+* The cpu_entry_area is shared between the user and kernel
+* page tables.  All of its ptes can safely be global.
+* _PAGE_GLOBAL gets reused to help indicate PROT_NONE for
+* non-present PTEs, so be careful not to set it in that
+* case to avoid confusion.
+*/
+   if (boot_cpu_has(X86_FEATURE_PGE) &&
+   (pgprot_val(flags) & _PAGE_PRESENT))
+   pte = pte_set_flags(pte, _PAGE_GLOBAL);
+
+   set_pte_vaddr(va, pte);
 }
 
 static void __init
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 631507f0c198..8082f8b0c10e 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -299,6 +299,27 @@ pti_clone_pmds(unsigned long start, unsigned long end, pmdval_t clear)
if (WARN_ON(!target_pmd))
return;
 
+   /*
+* Only clone present PMDs.  This ensures only setting
+* _PAGE_GLOBAL on present PMDs.  This should only be
+* called on well-known addresses anyway, so a non-
+* present PMD would be a surprise.
+*/
+   if (WARN_ON(!(pmd_flags(*pmd) & _PAGE_PRESENT)))
+   return;
+
+   /*
+* Setting 'target_pmd' below creates a mapping in both
+* the user and kernel page tables.  It is effectively
+* global, so set it as global in both copies.  Note:
+* the X86_FEATURE_PGE check is not _required_ because
+* the CPU ignores _PAGE_GLOBAL when PGE is not
+* supported.  The check keeps consistency with
+* code that only sets this bit when supported.
+*/
+   if (boot_cpu_has(X86_FEATURE_PGE))
+   *pmd = pmd_set_flags(*pmd, _PAGE_GLOBAL);
+
/*
 * Copy the PMD.  That is, the kernelmode and usermode
 * tables will share the last-level page tables of this
@@ -348,7 +369,7 @@ static void __init pti_clone_entry_text(void)
 {
pti_clone_pmds((unsigned long) __entry_text_start,
(unsigned long) __irqentry_text_end,
-  _PAGE_RW | _PAGE_GLOBAL);
+  _PAGE_RW);
 }
 
 /*
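
The rule applied in cea_set_pte() above (set the Global bit only when the CPU supports PGE and the entry is present) can be sketched in isolation. This is a user-space illustration under stated assumptions: the `sketch_*` names are hypothetical, and the entry is treated as a plain 64-bit value rather than a `pte_t`. The bit positions (Present is bit 0, Global is bit 8) follow the x86 page-table format.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_PRESENT 0x001ULL	/* bit 0 */
#define SKETCH_PAGE_GLOBAL  0x100ULL	/* bit 8 */

/*
 * Mark a shared entry Global only if it is present. On non-present
 * entries the same bit position is reused for other purposes, so it
 * must be left alone there.
 */
static uint64_t sketch_make_shared_pte(uint64_t pte, bool cpu_has_pge)
{
	if (cpu_has_pge && (pte & SKETCH_PAGE_PRESENT))
		pte |= SKETCH_PAGE_GLOBAL;

	return pte;
}
```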

[tip:x86/pti] x86/pti: Leave kernel text global for !PCID

2018-04-12 Thread tip-bot for Dave Hansen

Commit-ID:  8c06c7740d191b9055cb9be920579d5ecdd26303
Gitweb: https://git.kernel.org/tip/8c06c7740d191b9055cb9be920579d5ecdd26303
Author: Dave Hansen 
AuthorDate: Fri, 6 Apr 2018 13:55:18 -0700
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:06:00 +0200

x86/pti: Leave kernel text global for !PCID

Global pages are bad for hardening because they potentially let an
exploit read the kernel image via a Meltdown-style attack which
makes it easier to find gadgets.

But, global pages are good for performance because they reduce TLB
misses when making user/kernel transitions, especially when PCIDs
are not available, such as on older hardware, or where a hypervisor
has disabled them for some reason.

This patch implements a basic, sane policy: If you have PCIDs, you
only map a minimal amount of kernel text global.  If you do not have
PCIDs, you map all kernel text global.

This policy effectively makes PCIDs something that not only adds
performance but a little bit of hardening as well.

I ran a simple "lseek" microbenchmark[1] to test the benefit on
a modern Atom microserver.  Most of the benefit comes from applying
the series before this patch ("entry only"), but there is still a
significant benefit from this patch.

  No Global Lines (baseline  ): 6077741 lseeks/sec
  88 Global Lines (entry only): 7528609 lseeks/sec (+23.9%)
  94 Global Lines (this patch): 8433111 lseeks/sec (+38.8%)

[1.] https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c
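
The percentages in the table above follow from the raw lseeks/sec figures. A small sketch of that arithmetic, where the helper name is an assumption made for illustration:

```c
#include <stdio.h>

/* Relative change of 'measured' against 'baseline', in percent. */
static double sketch_gain_percent(double baseline, double measured)
{
	return (measured - baseline) / baseline * 100.0;
}

/* Print both rows of the table relative to the 6077741 baseline. */
static void sketch_print_gains(void)
{
	const double baseline = 6077741.0;

	printf("entry only: %+.1f%%\n", sketch_gain_percent(baseline, 7528609.0));
	printf("this patch: %+.1f%%\n", sketch_gain_percent(baseline, 8433111.0));
}
```

Both gains are measured against the same baseline, so the step from "entry only" to "this patch" is the difference between the two rows rather than a further 38.8%.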

Signed-off-by: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Arjan van de Ven 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Kees Cook 
Cc: Linus Torvalds 
Cc: Nadav Amit 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180406205518.e3d98...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/pti.h |  2 ++
 arch/x86/mm/init_64.c  |  6 
 arch/x86/mm/pti.c  | 78 +++---
 3 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/pti.h b/arch/x86/include/asm/pti.h
index 0b5ef05b2d2d..38a17f1d5c9d 100644
--- a/arch/x86/include/asm/pti.h
+++ b/arch/x86/include/asm/pti.h
@@ -6,8 +6,10 @@
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
 extern void pti_init(void);
 extern void pti_check_boottime_disable(void);
+extern void pti_clone_kernel_text(void);
 #else
 static inline void pti_check_boottime_disable(void) { }
+static inline void pti_clone_kernel_text(void) { }
 #endif
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index e6c52dbbf649..6d1ff39c2438 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1290,6 +1290,12 @@ void mark_rodata_ro(void)
(unsigned long) __va(__pa_symbol(_sdata)));
 
debug_checkwx();
+
+   /*
+* Do this after all of the manipulation of the
+* kernel text page tables are complete.
+*/
+   pti_clone_kernel_text();
 }
 
 int kern_addr_valid(unsigned long addr)
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 1470b173963f..f1fd52f449e0 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -66,12 +66,22 @@ static void __init pti_print_if_secure(const char *reason)
pr_info("%s\n", reason);
 }
 
+enum pti_mode {
+   PTI_AUTO = 0,
+   PTI_FORCE_OFF,
+   PTI_FORCE_ON
+} pti_mode;
+
 void __init pti_check_boottime_disable(void)
 {
char arg[5];
int ret;
 
+   /* Assume mode is auto unless overridden. */
+   pti_mode = PTI_AUTO;
+
if (hypervisor_is_type(X86_HYPER_XEN_PV)) {
+   pti_mode = PTI_FORCE_OFF;
pti_print_if_insecure("disabled on XEN PV.");
return;
}
@@ -79,18 +89,23 @@ void __init pti_check_boottime_disable(void)
ret = cmdline_find_option(boot_command_line, "pti", arg, sizeof(arg));
if (ret > 0)  {
if (ret == 3 && !strncmp(arg, "off", 3)) {
+   pti_mode = PTI_FORCE_OFF;
pti_print_if_insecure("disabled on command line.");
return;
}
if (ret == 2 && !strncmp(arg, "on", 2)) {
+   pti_mode = PTI_FORCE_ON;
pti_print_if_secure("force enabled on command line.");
goto enable;
}
-   if (ret == 4 && !strncmp(arg, "auto", 4))
+   if (ret == 4 && !strncmp(arg, "auto", 4)) {
+   pti_mode = PTI_AUTO;
goto autosel;
+   }
}
 
if (cmdline_find_option_bool(boot_command_line, "nopti")) {
+   pti_mode = PTI_FORCE_OFF;
pti_print_if_insecure("disabled on command line.");
return;
}
@@ -149,7 +164,7 @@
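
The command-line handling in the hunks above reduces to a small decision table. The sketch below models it in user space, assuming the caller has already extracted the value of "pti=" (or NULL if absent) and whether "nopti" was given. The enum and function names are hypothetical, and the Xen PV case and the later automatic selection are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum sketch_pti_mode {
	SKETCH_PTI_AUTO = 0,
	SKETCH_PTI_FORCE_OFF,
	SKETCH_PTI_FORCE_ON
};

/*
 * pti_arg: value of the "pti=" option, or NULL if it was not given.
 * nopti:   true if the bare "nopti" option was given.
 * A recognised "pti=" value wins over "nopti", matching the order of
 * the checks in the patch.
 */
static enum sketch_pti_mode sketch_pti_mode_from_cmdline(const char *pti_arg,
							  bool nopti)
{
	if (pti_arg != NULL) {
		if (strcmp(pti_arg, "off") == 0)
			return SKETCH_PTI_FORCE_OFF;
		if (strcmp(pti_arg, "on") == 0)
			return SKETCH_PTI_FORCE_ON;
		if (strcmp(pti_arg, "auto") == 0)
			return SKETCH_PTI_AUTO;
	}

	if (nopti)
		return SKETCH_PTI_FORCE_OFF;

	return SKETCH_PTI_AUTO;
}
```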

Re: [PATCH] mtd: nand: raw: atmel: add module param to avoid using dma

2018-04-12 Thread Peter Rosin

On 2018-04-11 17:34, Nicolas Ferre wrote:
> On 11/04/2018 at 16:44, Peter Rosin wrote:
>> Hi Nicolas,
>>
>> Boris asked for your input on this (the datasheet difference appears to
>> have no bearing on the issue) elsewhere in the tree of messages. It's
>> now been a week or so and I'm starting to wonder if you missed this
>> altogether or if you are simply out of office or something?
> 
> Hi Peter,
> 
> I didn't dig into this issue with matrix datasheet reset values and your 
> findings below. I'll try to move forward with your detailed explanation 
> and with my contacts within the "product" team internally.

Thanks, I feel that experimenting with the matrix with bogus documentation
is harder than needed, so I'm waiting for that information.

> However, I have the feeling that this sounds a little bit familiar to me 
> and that the pins drive strength for the NAND Flash *and* LCD must be 
> positioned to "Medium drive" at least for these interfaces (register 
> PIO_CFGR).
> 
> We use this particular setting for our own vendor branch and found that 
> the LCD and NAND Flash were far more stable on *some HW boards*. Here is 
> an example for NAND but you can find the same for LCD:
> https://github.com/linux4sam/linux-at91/commit/99d9e4c8848a2f16cc5b34bb27e588ca7504b695
> Obviously the "drive strength" property added by Ludovic has been 
> proposed but is not accepted yet in Mainline and this is why you don't 
> see it positioned here.

Ok, translating this from SAMA5D2 to SAMA5D3 (which is what I have), I
assume this is PIO_DRIVER1 and PIO_DRIVER2 instead. Peeking at those
registers they all contain 0x, so I guess all pins are already
"Medium drive" on my board. Also, looking at that sama5d2-ptc-ek patch
it seems possible to adjust the drive strength of the nand D lines,
and I don't think that's possible on the SAMA5D3? The NAND uses D0-D15
on our board, but there is no alternative use for those pins.

So I can change the drive-strength for these LCD pins: LCDDAT0-15 (only
using rgb565), LCDVSYNC, LCDHSYNC, LCDPCK and LCDDEN (LCDDISP is not
used). And for the NAND I can fiddle with NANDALE and NANDCLE.

I.e. PA0-15, PA26-29 and PE21-22

I tried to change the drive strength of these pins to both "Low Drive"
and "High Drive" and it didn't have any visible effect on the display
artifacts during NAND accesses.

> If it feels like an issue with "crosstalk" it might be the reason why. 
> For overruns or underruns, it's true that I would say that it's not 
> related and that dealing with the matrix is the way to go.

It does feel like underruns and that the LCD controller isn't able to
keep up with the needed tempo of the display output. At least that is
consistent with how the artifacts look.

> You can simply test this using devmem2 and see if it's better.

See above.

> Hope that it helps.

Sorry, but no disco. Thanks though!

Cheers,
Peter

> Best regards,
>Nicolas
> 
>> On 2018-04-03 09:18, Boris Brezillon wrote:
>>> On Tue, 3 Apr 2018 08:11:30 +0200
>>> Peter Rosin  wrote:
>>>
 On 2018-04-02 22:20, Boris Brezillon wrote:
> On Mon, 2 Apr 2018 21:28:43 +0200
> Boris Brezillon  wrote:
>
>> On Mon, 2 Apr 2018 19:59:39 +0200
>> Peter Rosin  wrote:
>>   
>>> On 2018-04-02 14:22, Boris Brezillon wrote:
 On Thu, 29 Mar 2018 16:27:12 +0200
 Peter Rosin  wrote:

> On 2018-03-29 15:44, Boris Brezillon wrote:
>> On Thu, 29 Mar 2018 15:37:43 +0200
>> Peter Rosin  wrote:
>>  
>>> On 2018-03-29 15:33, Boris Brezillon wrote:
 On Thu, 29 Mar 2018 15:10:54 +0200
 Peter Rosin  wrote:

> On a sama5d31 with a Full-HD dual LVDS panel (132MHz pixel clock) 
> NAND
> flash accesses have a tendency to cause display disturbances. Add 
> a
> module param to disable DMA from the NAND controller, since that 
> fixes
> the display problem for me.
>
> Signed-off-by: Peter Rosin 
> ---
>   drivers/mtd/nand/raw/atmel/nand-controller.c | 7 ++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mtd/nand/raw/atmel/nand-controller.c 
> b/drivers/mtd/nand/raw/atmel/nand-controller.c
> index b2f00b398490..2ff7a77c7b8e 100644
> --- a/drivers/mtd/nand/raw/atmel/nand-controller.c
> +++ b/drivers/mtd/nand/raw/atmel/nand-controller.c
> @@ -129,6 +129,11 @@
>   #define DEFAULT_TIMEOUT_MS  1000
>   #define MIN_DMA_LEN 128
>   
> +static bool atmel_nand_avoid_dma __read_mostly;
> +
> +MODULE_PARM_DESC(avoiddma, "Avoid using DMA");
> +module_param_named(avoiddma, atmel_nand_avoid_dma, bool, 0400);
>>

RE: [PATCH][v3] tools/power turbostat: if --max_loop, print for specific time of loops

2018-04-12 Thread Doug Smythies

On 2018.04.11 03:31 Yu Chen wrote: 

> From: Chen Yu 
>
> There's a use case during test to only print specific round of loops
> if --iterations is specified, for example, with this patch applied:
>
> turbostat -i 5 -t 4
> will capture 4 samples with 5 seconds interval.

Hi Yu,

This would be a very useful addition to turbostat.
Please also update the turbostat man page.

tools/power/x86/turbostat/turbostat.8

... Doug

[tip:x86/pti] x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image

2018-04-12 Thread tip-bot for Dave Hansen

Commit-ID:  39114b7a743e6759bab4d96b7d9651d44d17e3f9
Gitweb: https://git.kernel.org/tip/39114b7a743e6759bab4d96b7d9651d44d17e3f9
Author: Dave Hansen 
AuthorDate: Fri, 6 Apr 2018 13:55:17 -0700
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:06:00 +0200

x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image

Summary:

In current kernels, with PTI enabled, no pages are marked Global. This
potentially increases TLB misses.  But, the mechanism by which the Global
bit is set and cleared is rather haphazard.  This patch makes the process
more explicit.  In the end, it leaves us with Global entries in the page
tables for the areas truly shared by userspace and kernel and increases
TLB hit rates.

The place this patch really shines is on systems without PCIDs.  In this
case, we are using an lseek microbenchmark[1] to see how a reasonably
non-trivial syscall behaves.  Higher is better:

  No Global pages (baseline): 6077741 lseeks/sec
  88 Global Pages (this set): 7528609 lseeks/sec (+23.9%)

On a modern Skylake desktop with PCIDs, the benefits are tangible, but not
huge for a kernel compile (lower is better):

  No Global pages (baseline): 186.951 seconds time elapsed  ( +-  0.35% )
  28 Global pages (this set): 185.756 seconds time elapsed  ( +-  0.09% )
   -1.195 seconds (-0.64%)

I also re-checked everything using the lseek1 test[1]:

  No Global pages (baseline): 15783951 lseeks/sec
  28 Global pages (this set): 16054688 lseeks/sec
 +270737 lseeks/sec (+1.71%)

The effect is more visible, but still modest.

Details:

The kernel page tables are inherited from head_64.S which rudely marks
them as _PAGE_GLOBAL.  For PTI, we have been relying on the grace of
$DEITY and some insane behavior in pageattr.c to clear _PAGE_GLOBAL.
This patch tries to do better.

First, stop filtering out "unsupported" bits from being cleared in the
pageattr code.  It's fine to filter out *setting* these bits but it
is insane to keep us from clearing them.

Then, *explicitly* go clear _PAGE_GLOBAL from the kernel identity map.
Do not rely on pageattr to do it magically.

After this patch, we can see that "GLB" shows up in each copy of the
page tables, that we have the same number of global entries in each
and that they are the *same* entries.

  /sys/kernel/debug/page_tables/current_kernel:11
  /sys/kernel/debug/page_tables/current_user:11
  /sys/kernel/debug/page_tables/kernel:11

  9caae8ad6a1fb53aca2407ec037f612d  current_kernel.GLB
  9caae8ad6a1fb53aca2407ec037f612d  current_user.GLB
  9caae8ad6a1fb53aca2407ec037f612d  kernel.GLB

A quick visual audit also shows that all the entries make sense.
0xfe00 is the cpu_entry_area and 0x81c0
is the entry/exit text:

  0xfe00-0xfe002000   8K ro GLB NX pte
  0xfe002000-0xfe003000   4K RW GLB NX pte
  0xfe003000-0xfe006000  12K ro GLB NX pte
  0xfe006000-0xfe007000   4K ro GLB x  pte
  0xfe007000-0xfe00d000  24K RW GLB NX pte
  0xfe02d000-0xfe02e000   4K ro GLB NX pte
  0xfe02e000-0xfe02f000   4K RW GLB NX pte
  0xfe02f000-0xfe032000  12K ro GLB NX pte
  0xfe032000-0xfe033000   4K ro GLB x  pte
  0xfe033000-0xfe039000  24K RW GLB NX pte
  0x81c0-0x81e0   2M ro PSE GLB x  pmd

[1.] https://github.com/antonblanchard/will-it-scale/blob/master/tests/lseek1.c

Signed-off-by: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Arjan van de Ven 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: David Woodhouse 
Cc: Greg Kroah-Hartman 
Cc: Hugh Dickins 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Kees Cook 
Cc: Linus Torvalds 
Cc: Nadav Amit 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180406205517.c80fb...@viggo.jf.intel.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/mm/init.c |  8 +---
 arch/x86/mm/pageattr.c | 12 +---
 arch/x86/mm/pti.c  | 25 +
 3 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 583a88c8a6ee..fec82b577c18 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -161,12 +161,6 @@ struct map_range {
 
 static int page_size_mask;
 
-static void enable_global_pages(void)
-{
-   if (!static_cpu_has(X86_FEATURE_PTI))
-   __supported_pte_mask |= _PAGE_GLOBAL;
-}
-
 static void __init probe_page_size_mask(void)
 {
/*
@@ -187,7 +181,7 @@ static void __init probe_page_size_mask(void)
__supported_pte_mask &=

Re: [PATCH v2] usb: typec: ucsi: fix tracepoint related build error

2018-04-12 Thread Heikki Krogerus

On Tue, Apr 10, 2018 at 10:38:06AM +0200, Tobias Regnery wrote:
> There is the following build error with CONFIG_TYPEC_UCSI=m, CONFIG_FTRACE=y
> and CONFIG_TRACING=n:
> 
> ERROR: "__tracepoint_ucsi_command" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
> ERROR: "__tracepoint_ucsi_register_port" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
> ERROR: "__tracepoint_ucsi_notify" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
> ERROR: "__tracepoint_ucsi_reset_ppm" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
> ERROR: "__tracepoint_ucsi_run_command" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
> ERROR: "__tracepoint_ucsi_ack" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
> ERROR: "__tracepoint_ucsi_connector_change" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
> 
> This combination is quite hard to create because CONFIG_TRACING gets selected
> only in rare cases without CONFIG_FTRACE.
> 
> The build failure is caused by conditionally compiling trace.c depending on
> the wrong option CONFIG_FTRACE. Change this to depend on CONFIG_TRACING like
> other users of tracepoints do.
> 
> Fixes: c1b0bc2dabfa ("usb: typec: Add support for UCSI interface")
> Signed-off-by: Tobias Regnery 

Acked-by: Heikki Krogerus 


Thanks,

-- 
heikki

Re: [PATCH][v3] tools/power turbostat: if --max_loop, print for specific time of loops

2018-04-12 Thread Yu Chen

Hi Doug,
On Thu, Apr 12, 2018 at 12:18:44AM -0700, Doug Smythies wrote:
> On 2018.04.11 03:31 Yu Chen wrote: 
> 
> > From: Chen Yu 
> >
> > There's a use case during test to only print specific round of loops
> > if --iterations is specified, for example, with this patch applied:
> >
> > turbostat -i 5 -t 4
> > will capture 4 samples with 5 seconds interval.
> 
> Hi Yu,
> 
> This would be a very useful addition to turbostat.
> Please also update the turbostat man page.
> 
> tools/power/x86/turbostat/turbostat.8
> 
OK, I will do this.
Thanks,
Yu
> ... Doug
> 
>

Re: [Xen-devel] [PATCH] x86/xen: zero MSR_IA32_SPEC_CTRL before suspend

2018-04-12 Thread Ingo Molnar


* Jan Beulich  wrote:

> >>> On 11.04.18 at 13:53,  wrote:
> > * Jan Beulich  wrote:
> > 
> >> Additionally, x86 maintainers: is there a particular reason this (or
> >> any functionally equivalent patch) isn't upstream yet? As indicated
> >> before, I had not been able to find any discussion, and hence I
> >> see no reason why this is a patch we effectively carry privately in
> >> our distro branches (and likely other distros do so too).
> > 
> > The patch was merged 6 weeks ago and is now upstream:
> > 
> >   71c208dd54ab: x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
> 
> I'm sorry, but no, this isn't the patch I was inquiring about.
> Instead I'm wondering of the disposition of the patch disabling
> IBRS around a CPU going idle.

Got any specific link or subject line for that submission?

Thanks,

Ingo

Re: [PATCH 7/9] mtd: nand: qcom: check for operation errors in case of raw read

2018-04-12 Thread Abhishek Sahu


On 2018-04-10 15:42, Miquel Raynal wrote:

Hi Abhishek,

On Wed,  4 Apr 2018 18:12:23 +0530, Abhishek Sahu
 wrote:


Currently there is no error checking for raw read. For raw
reads, there won’t be any ECC failure but the operational
failures are possible so schedule the NAND_FLASH_STATUS read
after each codeword.

Signed-off-by: Abhishek Sahu 
---
 drivers/mtd/nand/qcom_nandc.c | 56 
+++

 1 file changed, 46 insertions(+), 10 deletions(-)

diff --git a/drivers/mtd/nand/qcom_nandc.c 
b/drivers/mtd/nand/qcom_nandc.c

index dce97e8..40c790e 100644
--- a/drivers/mtd/nand/qcom_nandc.c
+++ b/drivers/mtd/nand/qcom_nandc.c
@@ -1099,7 +1099,8 @@ static void config_nand_page_read(struct 
qcom_nand_controller *nandc)

  * Helper to prepare DMA descriptors for configuring registers
  * before reading each codeword in NAND page.
  */
-static void config_nand_cw_read(struct qcom_nand_controller *nandc)
+static void
+config_nand_cw_read(struct qcom_nand_controller *nandc, bool use_ecc)
 {
if (nandc->props->is_bam)
write_reg_dma(nandc, NAND_READ_LOCATION_0, 4,
@@ -1108,19 +1109,25 @@ static void config_nand_cw_read(struct 
qcom_nand_controller *nandc)

write_reg_dma(nandc, NAND_FLASH_CMD, 1, NAND_BAM_NEXT_SGL);
write_reg_dma(nandc, NAND_EXEC_CMD, 1, NAND_BAM_NEXT_SGL);

-   read_reg_dma(nandc, NAND_FLASH_STATUS, 2, 0);
-   read_reg_dma(nandc, NAND_ERASED_CW_DETECT_STATUS, 1,
-NAND_BAM_NEXT_SGL);
+   if (use_ecc) {
+   read_reg_dma(nandc, NAND_FLASH_STATUS, 2, 0);
+   read_reg_dma(nandc, NAND_ERASED_CW_DETECT_STATUS, 1,
+NAND_BAM_NEXT_SGL);
+   } else {
+   read_reg_dma(nandc, NAND_FLASH_STATUS, 1, NAND_BAM_NEXT_SGL);
+   }
 }

 /*
  * Helper to prepare dma descriptors to configure registers needed 
for reading a

  * single codeword in page
  */
-static void config_nand_single_cw_page_read(struct 
qcom_nand_controller *nandc)

+static void
+config_nand_single_cw_page_read(struct qcom_nand_controller *nandc,
+   bool use_ecc)
 {
config_nand_page_read(nandc);
-   config_nand_cw_read(nandc);
+   config_nand_cw_read(nandc, use_ecc);
 }

 /*
@@ -1201,7 +1208,7 @@ static int nandc_param(struct qcom_nand_host 
*host)

nandc->buf_count = 512;
memset(nandc->data_buffer, 0xff, nandc->buf_count);

-   config_nand_single_cw_page_read(nandc);
+   config_nand_single_cw_page_read(nandc, false);

read_data_dma(nandc, FLASH_BUF_ACC, nandc->data_buffer,
  nandc->buf_count, 0);
@@ -1565,6 +1572,23 @@ struct read_stats {
__le32 erased_cw;
 };

+/* reads back FLASH_STATUS register set by the controller */
+static int check_flash_errors(struct qcom_nand_host *host, int 
cw_cnt)

+{
+   struct nand_chip *chip = &host->chip;
+   struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
+   int i;
+
+   for (i = 0; i < cw_cnt; i++) {
+   u32 flash = le32_to_cpu(nandc->reg_read_buf[i]);
+
+   if (flash & (FS_OP_ERR | FS_MPU_ERR))
+   return -EIO;


This is already checked in parse_read_error(), maybe it would be
preferable to have different path inside this function depending on the
'raw' nature of the operation?



 Thanks Miquel,

 The parse_read_error will be called only for reads with ECC enabled
 which uses 3 status registers. It has other code also related with
 erased page detection and more code will be added in last patch
 for bitflip detection.

 For all others cases, only one status register FLASH_STATUS needs
 to be checked and this check_flash_errors does the same.


+   }
+
+   return 0;
+}
+
 /*
  * reads back status registers set by the controller to notify page 
read

  * errors. this is equivalent to what 'ecc->correct()' would do.
@@ -1707,7 +1731,7 @@ static int read_page_ecc(struct qcom_nand_host 
*host, u8 *data_buf,

}
}

-   config_nand_cw_read(nandc);
+   config_nand_cw_read(nandc, true);

if (data_buf)
read_data_dma(nandc, FLASH_BUF_ACC, data_buf,
@@ -1771,7 +1795,7 @@ static int copy_last_cw(struct qcom_nand_host 
*host, int page)

set_address(host, host->cw_size * (ecc->steps - 1), page);
update_rw_regs(host, 1, true);

-   config_nand_single_cw_page_read(nandc);
+   config_nand_single_cw_page_read(nandc, host->use_ecc);

read_data_dma(nandc, FLASH_BUF_ACC, nandc->data_buffer, size, 0);

@@ -1781,6 +1805,15 @@ static int copy_last_cw(struct qcom_nand_host 
*host, int page)


free_descs(nandc);

+   if (!ret) {
+   if (host->use_ecc)
+   ret = parse_read_errors(host, nandc->data_buffer,
+   nandc->data_buffer + size,
+

Re: [PATCH v1 0/2] mm: migrate: vm event counter for hugepage migration

2018-04-12 Thread Naoya Horiguchi

On Thu, Apr 12, 2018 at 08:18:59AM +0200, Michal Hocko wrote:
> On Wed 11-04-18 17:09:25, Naoya Horiguchi wrote:
> > Hi everyone,
> > 
> > I wrote patches introducing separate vm event counters for hugepage
> > migration (both for hugetlb and thp).
> > Hugepage migration is different from normal page migration in event
> > frequency and/or how likely it succeeds, so maintaining statistics for
> > them in mixed counters might not be helpful for either developers or users.
> 
> This is quite a lot of code to be added so we should better document
> what it is intended for. Sure I understand your reasoning about huge
> pages are more likely to fail but is this really worth a separate
> counter? Do you have an example of how this would be useful?

Our customers periodically collect some log info to understand what
happened after system failures happen.  Then if we have separate counters
for hugepage migration and the values show some anomaly, that might
help admins and developers understand the issue more quickly.
We have other ways to get this info like checking /proc/pid/pagemap and
/proc/kpageflags, but they are costly and most users decide not to
collect them in periodical logging.

> 
> If we are there then what about different huge page sizes (for hugetlb)?
> Do we need per-hstate stats?

Yes, per-hstate counters are better. And existing hugetlb counters
htlb_buddy_alloc_* are also affected by this point.

> 
> In other words, is this really worth it?

Actually, I'm not sure at this point.

Thanks,
Naoya Horiguchi

> 
> >  include/linux/vm_event_item.h |   7 +++
> >  mm/migrate.c  | 103 
> > +++---
> >  mm/vmstat.c   |   8 
> >  3 files changed, 102 insertions(+), 16 deletions(-)
> 
> -- 
> Michal Hocko
> SUSE Labs
>

Re: [PATCH] staging: wilc1000: Remove unnecessary braces {} around single statement block

2018-04-12 Thread Claudiu Beznea



On 10.04.2018 17:49, Eyal Ilsar wrote:
> Remove unnecessary braces {} around an 'if' statement block with a single 
> statement. Issue found by checkpatch.

You should add an empty line before "Signed-off" line as stated in [1]. I
would also add a space b/w your name and your email in Signed-off line as
is exemplified in [2].

[1]
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#the-canonical-patch-format
[2] https://www.kernel.org/doc/html/latest/process/submitting-patches.html#developer-s-certificate-of-origin-1-1

> Signed-off-by: Eyal Ilsar
> ---
> This is part of my take on the Eudyptula challenge
> 
>  drivers/staging/wilc1000/wilc_wfi_cfgoperations.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c 
> b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
> index 205304c..325afe1 100644
> --- a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
> +++ b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
> @@ -284,9 +284,8 @@ static void remove_network_from_shadow(struct timer_list 
> *unused)
>   }
>   }
>  
> - if (last_scanned_cnt != 0) {
> + if (last_scanned_cnt != 0)
>   mod_timer(&hAgingTimer, jiffies + msecs_to_jiffies(AGING_TIME));
> - }
>  }
>  
>  static void clear_duringIP(struct timer_list *unused)
>

Re: [PATCH] checkpatch: Add a --strict test for structs with bool member definitions

2018-04-12 Thread Ingo Molnar

* Peter Zijlstra  wrote:

> I still have room in my /dev/null mailbox for pure checkpatch patches.
> 
> > (ooh, https://lkml.org/lkml/2017/11/21/384 is working this morning)
> 
> Yes, we really should not use lkml.org for references. Sadly google
> displays it very prominently when you search for something.

lkml.org is nice in emails that have a short expected life time and relevance - 
but it probably shouldn't be used for permanent references such as kernel 
messages, code comments and Git log entries.

Thanks,

Ingo

Re: [PATCH v1 0/2] mm: migrate: vm event counter for hugepage migration

2018-04-12 Thread Michal Hocko

On Thu 12-04-18 07:40:41, Naoya Horiguchi wrote:
> On Thu, Apr 12, 2018 at 08:18:59AM +0200, Michal Hocko wrote:
> > On Wed 11-04-18 17:09:25, Naoya Horiguchi wrote:
> > > Hi everyone,
> > > 
> > > I wrote patches introducing separate vm event counters for hugepage
> > > migration (both for hugetlb and thp).
> > > Hugepage migration is different from normal page migration in event
> > > frequency and/or how likely it succeeds, so maintaining statistics for
> > > them in mixed counters might not be helpful for either developers or users.
> > 
> > This is quite a lot of code to be added so we should better document
> > what it is intended for. Sure I understand your reasoning about huge
> > pages are more likely to fail but is this really worth a separate
> > counter? Do you have an example of how this would be useful?
> 
> Our customers periodically collect some log info to understand what
> happened after system failures happen.  Then if we have separate counters
> for hugepage migration and the values show some anomaly, that might
> help admins and developers understand the issue more quickly.
> We have other ways to get this info like checking /proc/pid/pagemap and
> /proc/kpageflags, but they are costly and most users decide not to
> collect them in periodical logging.

Wouldn't tracepoints be more suitable for that purpose? They can collect
more valuable information.

> > If we are there then what about different huge page sizes (for hugetlb)?
> > Do we need per-hstate stats?
> 
> Yes, per-hstate counters are better. And existing hugetlb counters
> htlb_buddy_alloc_* are also affected by this point.

The thing is that this would bloat the code and the vmstat output even more.
I am not really convinced this is a great idea for something that
tracepoints would handle as well.
-- 
Michal Hocko
SUSE Labs

cifs buffer overflow in kernels before 3.7

2018-04-12 Thread Vlastimil Babka

Hi,

we have tracked down (in our 3.0-based kernel) a nasty overflow from
cifs_build_path_to_root() calling convert_delimiter() on a kmalloced
buffer that's not guaranteed to be null-terminated. AFAIU this happens
during mount of a share's subdirectory (vol->prepath is non-zero). This
was fixed (unknowingly) in 839db3d10a5b ("cifs: fix up handling of
prefixpath= option") in 3.7, so I believe at least the 3.2 LTS is
affected. You could either backport the commit with followup fixes, or
do something like we did (below). Current mainline could also use
kzalloc() and move (or remove) the manual trailing null setting, as now
it may give false assurance to whoever will be modifying the code.

Vlastimil

-8<-
From: Vlastimil Babka 
Subject: [PATCH] cifs: fix buffer overflow in cifs_build_path_to_root()

After the strncpy() in cifs_build_path_to_root(), there is no guarantee of a
trailing null, because pplen is initialized by strlen() which doesn't include
it. Then convert_delimiter() is called before the trailing null is added, which
means it can overflow the kmalloced object and corrupt unrelated memory until
it hits a null byte.

Make sure pplen includes the trailing null in vol->prepath. Also use kzalloc()
and add the trailing null (now redundant) before convert_delimiter().

Reviewed-by: Aurelien Aptel 
Signed-off-by: Vlastimil Babka 
---
 fs/cifs/inode.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -792,7 +792,7 @@ static const struct inode_operations cif
 char *cifs_build_path_to_root(struct smb_vol *vol, struct cifs_sb_info *cifs_sb,
  struct cifs_tcon *tcon, int add_treename)
 {
-   int pplen = vol->prepath ? strlen(vol->prepath) : 0;
+   int pplen = vol->prepath ? strlen(vol->prepath) + 1: 0;
int dfsplen;
char *full_path = NULL;
 
@@ -809,15 +809,15 @@ char *cifs_build_path_to_root(struct smb
else
dfsplen = 0;
 
-   full_path = kmalloc(dfsplen + pplen + 1, GFP_KERNEL);
+   full_path = kzalloc(dfsplen + pplen + 1, GFP_KERNEL);
if (full_path == NULL)
return full_path;
 
if (dfsplen)
strncpy(full_path, tcon->treeName, dfsplen);
strncpy(full_path + dfsplen, vol->prepath, pplen);
-   convert_delimiter(full_path, CIFS_DIR_SEP(cifs_sb));
full_path[dfsplen + pplen] = 0; /* add trailing null */
+   convert_delimiter(full_path, CIFS_DIR_SEP(cifs_sb));
return full_path;
 }
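
For readers who want to see the ordering issue outside the kernel, here is a
minimal userspace sketch. It assumes nothing about the real cifs code beyond
what the patch shows: build_path() and convert_delim() are hypothetical
stand-ins, and calloc() plays the role of kzalloc(). The point is only that
the buffer is NUL-terminated before anything walks it as a string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for convert_delimiter(): walks the string until it meets a NUL. */
static void convert_delim(char *path, char delim)
{
	assert(path != NULL);
	for (char *p = path; *p; p++)
		if (*p == '/' || *p == '\\')
			*p = delim;
}

/*
 * Hypothetical model of the fixed path builder. The buffer is zero-filled
 * and explicitly terminated *before* the delimiter walk, so the walk can
 * never run past the allocation. Returns a heap string or NULL.
 */
static char *build_path(const char *tree, const char *prepath, char delim)
{
	size_t treelen = tree ? strlen(tree) : 0;
	size_t pplen = prepath ? strlen(prepath) : 0;
	char *full = calloc(treelen + pplen + 1, 1);

	if (!full)
		return NULL;
	if (treelen)
		memcpy(full, tree, treelen);
	if (pplen)
		memcpy(full + treelen, prepath, pplen);
	full[treelen + pplen] = '\0';	/* redundant after calloc, kept explicit */
	convert_delim(full, delim);
	return full;
}
```

With malloc() instead of calloc() and the terminator written after the walk,
convert_delim() would read (and possibly write) past the end of the buffer
whenever the heap memory behind it happens not to contain a zero byte.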

Re: perf: fuzzer leads to trace_kprobe: Could not insert message flood

2018-04-12 Thread Ingo Molnar


* Song Liu  wrote:

> > spamming the kernel log ...
> 
> Yeah, the new API allows non-root user to trigger this message. We should
> only allow root to create kprobe with perf_event_open().
> 
> On the other hand, do we need to fix this for root? In fact, a simple bash
> loop can create something similar through the text interface (with root):
> 
> root@virt-test:~# for x in {0..5} ; do echo p:xx xx+$x >> /sys/kernel/debug/tracing/kprobe_events ; done
> -bash: echo: write error: No such file or directory
> -bash: echo: write error: No such file or directory
> -bash: echo: write error: No such file or directory
> -bash: echo: write error: No such file or directory
> -bash: echo: write error: No such file or directory
> -bash: echo: write error: No such file or directory
> root@virt-test:~# dmesg | tail -n 5
> [  664.208374] trace_kprobe: Could not insert probe at xx+1: -2
> [  664.237882] trace_kprobe: Could not insert probe at xx+2: -2
> [  664.268067] trace_kprobe: Could not insert probe at xx+3: -2
> [  664.297395] trace_kprobe: Could not insert probe at xx+4: -2
> [  664.327614] trace_kprobe: Could not insert probe at xx+5: -2
> 
> This already happened before the new API was introduced.
> 
> The following patch does capable(CAP_SYS_ADMIN) for perf_kprobe and 
> perf_uprobe at an earlier stage, so non-root user cannot trigger 
> this error message. Please let me know whether we need to fix this 
> for root. 

That's two bugs then, and yes, I think we should fix the log spamming: what's
the point? We already get an error code from the write.

I'll apply your CAP_SYS_ADMIN fix.

Thanks,

Ingo

Re: perf: fuzzer leads to trace_kprobe: Could not insert message flood

2018-04-12 Thread Ingo Molnar


* Song Liu  wrote:

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index d7af828..2d5fe26 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8400,6 +8400,10 @@ static int perf_kprobe_event_init(struct perf_event *event)
> 
> if (event->attr.type != perf_kprobe.type)
> return -ENOENT;
> +
> +   if (!capable(CAP_SYS_ADMIN))
> +   return -EACCES;
> +
> /*
>  * no branch sampling for probe events
>  */
> @@ -8437,6 +8441,10 @@ static int perf_uprobe_event_init(struct perf_event *event)
> 
> if (event->attr.type != perf_uprobe.type)
> return -ENOENT;
> +
> +   if (!capable(CAP_SYS_ADMIN))
> +   return -EACCES;

This is seriously whitespace damaged: all tabs are spaces ...

Thanks,

Ingo

Re: [Xen-devel] [PATCH] x86/xen: zero MSR_IA32_SPEC_CTRL before suspend

2018-04-12 Thread Jan Beulich

>>> On 12.04.18 at 09:32,  wrote:

> * Jan Beulich  wrote:
> 
>> >>> On 11.04.18 at 13:53,  wrote:
>> > * Jan Beulich  wrote:
>> > 
>> >> Additionally, x86 maintainers: is there a particular reason this (or
>> >> any functionally equivalent patch) isn't upstream yet? As indicated
>> >> before, I had not been able to find any discussion, and hence I
>> >> see no reason why this is a patch we effectively carry privately in
>> >> our distro branches (and likely other distros do so too).
>> > 
>> > The patch was merged 6 weeks ago and is now upstream:
>> > 
>> >   71c208dd54ab: x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
>> 
>> I'm sorry, but no, this isn't the patch I was inquiring about.
>> Instead I'm wondering of the disposition of the patch disabling
>> IBRS around a CPU going idle.
> 
> Got any specific link or subject line for that submission?

Sure, as written in the original response to Jürgen's patch:
https://patchwork.kernel.org/patch/10153843/

Jan

[tip:x86/pti] x86/pgtable: Don't set huge PUD/PMD on non-leaf entries

2018-04-12 Thread tip-bot for Joerg Roedel

Commit-ID:  e3e288121408c3abeed5af60b87b95c847143845
Gitweb: https://git.kernel.org/tip/e3e288121408c3abeed5af60b87b95c847143845
Author: Joerg Roedel 
AuthorDate: Wed, 11 Apr 2018 17:24:38 +0200
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:41:41 +0200

x86/pgtable: Don't set huge PUD/PMD on non-leaf entries

The pmd_set_huge() and pud_set_huge() functions are used from
the generic ioremap() code to establish large mappings where this
is possible.

But the generic ioremap() code does not check whether the
PMD/PUD entries are already populated with a non-leaf entry,
so that any page-table pages these entries point to will be
lost.

Further, on x86-32 with SHARED_KERNEL_PMD=0, this causes a
BUG_ON() in vmalloc_sync_one() when PMD entries are synced
from swapper_pg_dir to the current page-table. This happens
because the PMD entry from swapper_pg_dir was promoted to a
huge-page entry while the current PGD still contains the
non-leaf entry. Because both entries are present and point
to a different page, the BUG_ON() triggers.

This was actually triggered with pti-x32 enabled in a KVM
virtual machine by the graphics driver.

A real and better fix for that would be to improve the
page-table handling in the generic ioremap() code. But that is
out-of-scope for this patch-set and left for later work.

Reported-by: David H. Gutteridge 
Signed-off-by: Joerg Roedel 
Reviewed-by: Thomas Gleixner 
Cc: Andrea Arcangeli 
Cc: Andy Lutomirski 
Cc: Boris Ostrovsky 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Dave Hansen 
Cc: David Laight 
Cc: Denys Vlasenko 
Cc: Eduardo Valentin 
Cc: Greg KH 
Cc: Jiri Kosina 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Linus Torvalds 
Cc: Pavel Machek 
Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Will Deacon 
Cc: aligu...@amazon.com
Cc: daniel.gr...@iaik.tugraz.at
Cc: hu...@google.com
Cc: keesc...@google.com
Cc: linux...@kvack.org
Link: http://lkml.kernel.org/r/20180411152437.gc15...@8bytes.org
Signed-off-by: Ingo Molnar 
---
 arch/x86/mm/pgtable.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index d10a40aceeaa..ffc8c13c50e4 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -639,6 +640,10 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
(mtrr != MTRR_TYPE_WRBACK))
return 0;
 
+   /* Bail out if we are on a populated non-leaf entry: */
+   if (pud_present(*pud) && !pud_huge(*pud))
+   return 0;
+
prot = pgprot_4k_2_large(prot);
 
set_pte((pte_t *)pud, pfn_pte(
@@ -667,6 +672,10 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
return 0;
}
 
+   /* Bail out if we are on a populated non-leaf entry: */
+   if (pmd_present(*pmd) && !pmd_huge(*pmd))
+   return 0;
+
prot = pgprot_4k_2_large(prot);
 
set_pte((pte_t *)pmd, pfn_pte(
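
The check added by this patch can be illustrated with a small userspace toy
model. This is only a sketch: entry_t, set_huge() and the helper names are
hypothetical, and only the two bit positions (present is bit 0, PSE is bit 7)
are taken from the x86 page-table format. It is not the kernel
implementation.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT (1ULL << 0)	/* x86 _PAGE_PRESENT */
#define PTE_PSE     (1ULL << 7)	/* x86 _PAGE_PSE: entry maps a huge page */

typedef uint64_t entry_t;	/* toy stand-in for pmd_t / pud_t */

static int entry_present(entry_t e) { return (e & PTE_PRESENT) != 0; }
static int entry_huge(entry_t e)    { return (e & PTE_PSE) != 0; }

/*
 * Toy version of pmd_set_huge()/pud_set_huge(): returns 1 if the huge
 * mapping was installed, 0 if the caller has to fall back to smaller
 * pages. An entry that is present but not huge points to a lower-level
 * page table, so overwriting it would lose that table.
 */
static int set_huge(entry_t *slot, uint64_t phys)
{
	assert(slot != 0);
	if (entry_present(*slot) && !entry_huge(*slot))
		return 0;	/* populated non-leaf entry: bail out */

	*slot = (phys & ~0xfffULL) | PTE_PRESENT | PTE_PSE;
	return 1;
}
```

In this model an empty slot or an existing huge entry is simply overwritten,
while a slot that already points to a page table is left untouched, which is
the behaviour the two added hunks aim for.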

[PATCH] timers: remove tvec_base struct declaration

2018-04-12 Thread Liu, Changcheng

The timer wheel is implemented by timer_base. tvec_base is obsolete,
so remove the type declaration.

Signed-off-by: Liu Changcheng 

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 2448f9c..7b066fd 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -8,8 +8,6 @@
 #include 
 #include 
 
-struct tvec_base;
-
 struct timer_list {
/*
 * All fields that change during normal runtime grouped to the
-- 
2.7.4

[PATCH v2] staging: wilc1000: Remove unnecessary braces {} around single statement block

2018-04-12 Thread Eyal Ilsar

Remove unnecessary braces {} around an 'if' statement block with a single 
statement. Issue found by checkpatch.

Signed-off-by: Eyal Ilsar 
---
Added an empty line before the 'Signed-off-by' line and a space between the
name and e-mail address within that line.

 drivers/staging/wilc1000/wilc_wfi_cfgoperations.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
index 205304c..325afe1 100644
--- a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
+++ b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
@@ -284,9 +284,8 @@ static void remove_network_from_shadow(struct timer_list *unused)
}
}
 
-   if (last_scanned_cnt != 0) {
+   if (last_scanned_cnt != 0)
mod_timer(&hAgingTimer, jiffies + msecs_to_jiffies(AGING_TIME));
-   }
 }
 
 static void clear_duringIP(struct timer_list *unused)
-- 
2.7.4

Re: [PATCH v4 1/8] arm: shmobile: Add the RZ/N1 arch to the shmobile Kconfig

2018-04-12 Thread Simon Horman

On Tue, Apr 10, 2018 at 09:30:01AM +0100, Michel Pollet wrote:
> Add the RZ/N1 Family (Part #R9A06G0xx) ARCH config to the rest of
> the Renesas SoC collection.
> 
> Signed-off-by: Michel Pollet 
> Reviewed-by: Geert Uytterhoeven 

This change has already been accepted for v4.18.

[GIT PULL] arch/microblaze patches for 4.17-rc1

2018-04-12 Thread Michal Simek

Hi,

please pull the following patches to your tree.

Thanks,
Michal


The following changes since commit c698ca5278934c0ae32297a8725ced2e27585d7f:

  Linux 4.16-rc6 (2018-03-18 17:48:42 -0700)

are available in the Git repository at:

  git://git.monstr.eu/linux-2.6-microblaze.git tags/microblaze-4.17-rc1

for you to fetch changes up to 70f6283a372bef685fd64564646a3b49a55be1ea:

  microblaze: Use generic pci_mmap_resource_range() (2018-03-19 15:33:05 +0100)


Microblaze patches for 4.17-rc1

- Use generic pci_mmap_resource_range()


David Woodhouse (1):
  microblaze: Use generic pci_mmap_resource_range()

Michal Simek (1):
  microblaze: Provide pgprot_device/writecombine macros for nommu

 arch/microblaze/include/asm/pci.h |  7 ---
 arch/microblaze/include/asm/pgtable.h |  2 ++
 arch/microblaze/pci/pci-common.c  | 99 +--
 3 files changed, 15 insertions(+), 93 deletions(-)


-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP SoCs





Re: [PATCH 9/9] mtd: nand: qcom: erased page bitflips detection

2018-04-12 Thread Abhishek Sahu


On 2018-04-10 16:00, Miquel Raynal wrote:
> Hi Abhishek,
>
> On Wed,  4 Apr 2018 18:12:25 +0530, Abhishek Sahu
>  wrote:
>
>> Some of the newer nand parts can have bit flips in an erased
>> page due to the process technology used. In this case, qpic
>
> AFAIK, this has always been possible, it was just rare.

 Yes Miquel. It was rare earlier.
 Now we are observing it more often with newer parts.

>> nand controller is not able to identify that page as an erased
>> page. Currently the driver calls nand_check_erased_ecc_chunk for
>> identifying the erased pages but this won’t work always since the
>> checking is being done with ECC engine returned data. In case of
>> bitflips, the ECC engine tries to correct the data and then it
>> generates the uncorrectable error. Now, this data is not equal to
>> original raw data. For erased CW identification, the raw data
>> should be read again from NAND device and this
>> nand_check_erased_ecc_chunk function should be called for raw
>> data only.
>
> Absolutely.
>
>> Now following logic is being added to identify the erased
>> codeword bitflips.
>>
>> 1. In most of the case, not all the codewords will have bitflips
>>    and only single CW will have bitflips. So, there is no need to
>>    read the complete raw page data. The NAND raw read can be
>>    scheduled for any CW in page. The NAND controller works on CW
>>    basis and it will update the status register after each CW read.
>>    Maintain the bitmask for the CW which generated the uncorrectable
>>    error.
>> 2. Schedule the raw flash read from NAND flash device to
>>    NAND controller buffer for all these CWs between first and last
>>    uncorrectable errors CWs. Copy the content from NAND controller
>>    buffer to actual data buffer only for the uncorrectable errors
>>    CWs so that other CW data content won’t be affected, and
>>    unnecessary data copy can be avoided.
>
> In case of uncorrectable error, the penalty is huge anyway.

 Yes. We can't avoid that.
 But we reduce it by doing raw reads only for the few subpages in
 which we got an uncorrectable error.

>> 3. Both DATA and OOB need to be checked for number of 0. The
>>    top-level API can be called with only data buf or oob buf so use
>>    chip->databuf if data buf is null and chip->oob_poi if
>>    oob buf is null for copying the raw bytes temporarily.
>
> You can do that. But when you do, you should tell the core you used
> that buffer and that it cannot rely on what is inside. Please
> invalidate the page cache with:
>
> chip->pagebuf = -1;

 Thanks Miquel. I will check and update the patch.

>> 4. For each CW, check the number of 0 in cw_data and usable
>>    oob bytes, The bbm and spare bytes bit flip won’t affect the ECC
>>    so don’t check the number of bitflips in this area.
>
> OOB is an area in which you are supposed to find the BBM, the ECC bytes
> and the spare bytes. Spare bytes == usable OOB bytes. And the BBM
> should be protected too. I don't get this sentence but I don't see its
> application either in the code?

 QCOM NAND layout does not support the BBM ECC protection.

 In OOB,

 For all the possible layouts (4 bit RS/4 bit BCH/8 bit BCH)
 it has 16 usable OOB bytes which are protected with ECC.

 All the bytes in OOB other than BBM, ECC bytes and usable
 OOB bytes are unused.

 You can refer to qcom_nand_host_setup for layout detail.

 Thanks,
 Abhishek
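
As an illustration of the "count the zero bits in the raw data" step
discussed in this thread, here is a minimal userspace sketch. It is not the
qcom driver code and not the kernel's nand_check_erased_ecc_chunk(): the
function names, the return convention and the threshold parameter are
assumptions made for the example. The idea is that an erased NAND region
reads back as all 0xff, so every zero bit in the raw data is a bitflip, and
the codeword can be treated as erased when the flips stay within what the
ECC could have corrected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the zero bits in a raw buffer; a cleanly erased buffer has none. */
static int count_zero_bits(const uint8_t *buf, size_t len)
{
	int zeros = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < len; i++) {
		uint8_t v = (uint8_t)~buf[i];	/* set bits mark the flips */

		while (v) {
			zeros += v & 1;
			v >>= 1;
		}
	}
	return zeros;
}

/*
 * Hypothetical erased-codeword check over raw (non-ECC-corrected) data
 * and the ECC-protected part of the OOB. Returns the number of bitflips
 * and resets both buffers to 0xff if the codeword looks erased, or -1 if
 * there are more zero bits than 'strength' allows.
 */
static int check_erased_cw(uint8_t *data, size_t data_len,
			   uint8_t *oob, size_t oob_len, int strength)
{
	int flips = count_zero_bits(data, data_len) +
		    count_zero_bits(oob, oob_len);

	if (flips > strength)
		return -1;	/* real data with a genuine ECC failure */

	memset(data, 0xff, data_len);
	memset(oob, 0xff, oob_len);
	return flips;
}
```

Running the check on ECC-corrected data instead of raw data would defeat it,
because a failed correction attempt can itself change bits; that is why the
thread insists on re-reading the affected codewords in raw mode first.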

KMSAN: uninit-value in __netif_receive_skb_core

2018-04-12 Thread syzbot


Hello,

syzbot hit the following crash on  
https://github.com/google/kmsan.git/master commit

e2ab7e8abba47a2f2698216258e5d8727ae58717 (Fri Apr 6 16:24:31 2018 +)
kmsan: temporarily disable visitAsmInstruction() to help syzbot
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=b202b7208664142954fa


Unfortunately, I don't have any reproducer for this crash yet.
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=535651643762
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=6627248707860932248

compiler: clang version 7.0.0 (trunk 329391)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b202b720866414295...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for  
details.

If you forward the report, please keep this part and the footer.

==
BUG: KMSAN: uninit-value in __read_once_size include/linux/compiler.h:197  
[inline]
BUG: KMSAN: uninit-value in deliver_ptype_list_skb net/core/dev.c:1908  
[inline]
BUG: KMSAN: uninit-value in __netif_receive_skb_core+0x4630/0x4a80  
net/core/dev.c:4545

CPU: 0 PID: 5999 Comm: syz-executor3 Not tainted 4.16.0+ #82
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Call Trace:
 
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x185/0x1d0 lib/dump_stack.c:53
 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
 __read_once_size include/linux/compiler.h:197 [inline]
 deliver_ptype_list_skb net/core/dev.c:1908 [inline]
 __netif_receive_skb_core+0x4630/0x4a80 net/core/dev.c:4545
 __netif_receive_skb net/core/dev.c:4627 [inline]
 process_backlog+0x62d/0xe20 net/core/dev.c:5307
 napi_poll net/core/dev.c:5705 [inline]
 net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
 __do_softirq+0x56d/0x93d kernel/softirq.c:285
 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1040
 
 do_softirq kernel/softirq.c:329 [inline]
 __local_bh_enable_ip+0x114/0x140 kernel/softirq.c:182
 local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
 rcu_read_unlock_bh include/linux/rcupdate.h:726 [inline]
 __dev_queue_xmit+0x2a31/0x2b60 net/core/dev.c:3584
 dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590
 packet_snd net/packet/af_packet.c:2944 [inline]
 packet_sendmsg+0x7c57/0x8a10 net/packet/af_packet.c:2969
 sock_sendmsg_nosec net/socket.c:630 [inline]
 sock_sendmsg net/socket.c:640 [inline]
 sock_write_iter+0x3b9/0x470 net/socket.c:909
 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
 do_iter_write+0x30d/0xd40 fs/read_write.c:932
 vfs_writev fs/read_write.c:977 [inline]
 do_writev+0x3c9/0x830 fs/read_write.c:1012
 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
 SyS_writev+0x56/0x80 fs/read_write.c:1082
 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x455259
RSP: 002b:7fb53ede8c68 EFLAGS: 0246 ORIG_RAX: 0014
RAX: ffda RBX: 7fb53ede96d4 RCX: 00455259
RDX: 0001 RSI: 200010c0 RDI: 0013
RBP: 0072bea0 R08:  R09: 
R10:  R11: 0246 R12: 
R13: 06cd R14: 006fd3d8 R15: 

Uninit was stored to memory at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
 kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
 kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
 __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
 skb_vlan_untag+0x950/0xee0 include/linux/if_vlan.h:597
 __netif_receive_skb_core+0x70a/0x4a80 net/core/dev.c:4460
 __netif_receive_skb net/core/dev.c:4627 [inline]
 process_backlog+0x62d/0xe20 net/core/dev.c:5307
 napi_poll net/core/dev.c:5705 [inline]
 net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
 __do_softirq+0x56d/0x93d kernel/softirq.c:285
Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
 kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
 slab_post_alloc_hook mm/slab.h:445 [inline]
 slab_alloc_node mm/slub.c:2737 [inline]
 __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
 __kmalloc_reserve net/core/skbuff.c:138 [inline]
 __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
 alloc_skb include/linux/skbuff.h:984 [inline]
 alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
 sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
 packet_alloc_skb net/packet/af_packet.c:2803 [inline]
 packet_snd net/packet/af_packet.c:2894 [inline]
 packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969
 sock_sendmsg_nosec net/socket.c:630 [inline]
 sock_sendmsg net/socket.c:640 [inline]
 sock_write_iter+0x3b9/0x470 net/socket.c:909
 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
 do_iter_write+0x30d/

KMSAN: uninit-value in netif_skb_features

2018-04-12 Thread syzbot


Hello,

syzbot hit the following crash on  
https://github.com/google/kmsan.git/master commit

e2ab7e8abba47a2f2698216258e5d8727ae58717 (Fri Apr 6 16:24:31 2018 +)
kmsan: temporarily disable visitAsmInstruction() to help syzbot
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=0bbe42c764feafa82c5a


So far this crash happened 30 times on  
https://github.com/google/kmsan.git/master.

C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4850744041668608
syzkaller reproducer:  
https://syzkaller.appspot.com/x/repro.syz?id=6289386287136768
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=4577411249209344
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=6627248707860932248

compiler: clang version 7.0.0 (trunk 329391)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0bbe42c764feafa82...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for  
details.

If you forward the report, please keep this part and the footer.

==
BUG: KMSAN: uninit-value in eth_type_vlan include/linux/if_vlan.h:283  
[inline]
BUG: KMSAN: uninit-value in skb_vlan_tagged_multi  
include/linux/if_vlan.h:656 [inline]
BUG: KMSAN: uninit-value in vlan_features_check include/linux/if_vlan.h:672  
[inline]

BUG: KMSAN: uninit-value in dflt_features_check net/core/dev.c:2949 [inline]
BUG: KMSAN: uninit-value in netif_skb_features+0xd1b/0xdc0  
net/core/dev.c:3009

CPU: 1 PID: 3582 Comm: syzkaller435149 Not tainted 4.16.0+ #82
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x185/0x1d0 lib/dump_stack.c:53
 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
 eth_type_vlan include/linux/if_vlan.h:283 [inline]
 skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline]
 vlan_features_check include/linux/if_vlan.h:672 [inline]
 dflt_features_check net/core/dev.c:2949 [inline]
 netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009
 validate_xmit_skb+0x89/0x1320 net/core/dev.c:3084
 __dev_queue_xmit+0x1cb2/0x2b60 net/core/dev.c:3549
 dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590
 packet_snd net/packet/af_packet.c:2944 [inline]
 packet_sendmsg+0x7c57/0x8a10 net/packet/af_packet.c:2969
 sock_sendmsg_nosec net/socket.c:630 [inline]
 sock_sendmsg net/socket.c:640 [inline]
 sock_write_iter+0x3b9/0x470 net/socket.c:909
 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
 do_iter_write+0x30d/0xd40 fs/read_write.c:932
 vfs_writev fs/read_write.c:977 [inline]
 do_writev+0x3c9/0x830 fs/read_write.c:1012
 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
 SyS_writev+0x56/0x80 fs/read_write.c:1082
 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x43ffa9
RSP: 002b:7fff2cff3948 EFLAGS: 0217 ORIG_RAX: 0014
RAX: ffda RBX: 004002c8 RCX: 0043ffa9
RDX: 0001 RSI: 2080 RDI: 0003
RBP: 006cb018 R08:  R09: 
R10:  R11: 0217 R12: 004018d0
R13: 00401960 R14:  R15: 

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
 kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
 slab_post_alloc_hook mm/slab.h:445 [inline]
 slab_alloc_node mm/slub.c:2737 [inline]
 __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
 __kmalloc_reserve net/core/skbuff.c:138 [inline]
 __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
 alloc_skb include/linux/skbuff.h:984 [inline]
 alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
 sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
 packet_alloc_skb net/packet/af_packet.c:2803 [inline]
 packet_snd net/packet/af_packet.c:2894 [inline]
 packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969
 sock_sendmsg_nosec net/socket.c:630 [inline]
 sock_sendmsg net/socket.c:640 [inline]
 sock_write_iter+0x3b9/0x470 net/socket.c:909
 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
 do_iter_write+0x30d/0xd40 fs/read_write.c:932
 vfs_writev fs/read_write.c:977 [inline]
 do_writev+0x3c9/0x830 fs/read_write.c:1012
 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
 SyS_writev+0x56/0x80 fs/read_write.c:1082
 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
==


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkal...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for t

Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures

2018-04-12 Thread Takashi Iwai

On Thu, 12 Apr 2018 08:02:27 +0200,
Christoph Hellwig wrote:
> 
> On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote:
> > > But we should try a GFP_DMA32 allocation first, so this is a bit
> > > surprising.
> > 
> > Hm, do we really try that?
> > Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32
> > only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi,
> > it's 36bit, so GFP_DMA isn't set.
> 
> Oh, yes - it is using an odd dma mask, and amdgpu seems to use a
> just as odd 40-bit dma mask.
> 
> > We had a fallback allocation with GFP_DMA32 in the past, but this
> > seems to have gone a long time ago along with cleanups (commit c647c3bb2d16).
> > 
> > But I haven't followed this topic for a long time, so I might have
> > missed something obvious...
> 
> I think a fallback would be much better here rather than relying on the
> limited swiotlb buffer pool.  dma_direct_alloc (which in 4.17 is also
> used for x86) already has a GFP_DMA fallback, so extending this for
> GFP_DMA32 as well would seem reasonable.
> 
> Any volunteers?

Below is a quick attempt, totally untested.  Actually the retry with
GFP_DMA is superfluous for archs without it, so the first patch
corrects it.  The second patch adds the retry with GFP_DMA32.

I'll resubmit properly if these are OK (and better if anyone could
test them :)


thanks,

Takashi



0001-dma-direct-Don-t-repeat-allocation-for-no-op-GFP_DMA.patch
Description: Binary data


0002-dma-direct-Try-reallocation-with-GFP_DMA32-if-possib.patch
Description: Binary data
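
Since the attached patches are not shown inline, here is a rough userspace
model of the retry idea discussed above. It is a sketch under stated
assumptions, not the content of either patch: the zone enum, the allocator
callback type, fake_alloc() and alloc_coherent() are all hypothetical. Only
the DMA_BIT_MASK() definition mirrors the kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

enum zone { ZONE_ANY, ZONE_32BIT };	/* toy stand-ins for "no flag" / GFP_DMA32 */

/* Hypothetical allocator callback: returns a physical address, 0 on failure. */
typedef uint64_t (*alloc_fn)(enum zone zone, size_t size);

static int fits(uint64_t phys, size_t size, uint64_t mask)
{
	return phys != 0 && phys + size - 1 <= mask;
}

/*
 * Model of the allocation path: a mask of 32 bits or less asks for the
 * 32-bit zone right away; a wider mask (36 or 40 bits, say) first tries
 * an unrestricted allocation and retries in the 32-bit zone only if the
 * result does not fit the mask. A real allocator would free the rejected
 * buffer before retrying.
 */
static uint64_t alloc_coherent(alloc_fn alloc, size_t size, uint64_t mask)
{
	enum zone zone = mask <= DMA_BIT_MASK(32) ? ZONE_32BIT : ZONE_ANY;
	uint64_t phys;

	assert(alloc != NULL && size != 0);
	phys = alloc(zone, size);
	if (fits(phys, size, mask))
		return phys;

	if (zone == ZONE_ANY && mask < DMA_BIT_MASK(64)) {
		phys = alloc(ZONE_32BIT, size);
		if (fits(phys, size, mask))
			return phys;
	}
	return 0;
}

/* Toy allocator: unrestricted memory sits at 1 TiB, 32-bit memory at 256 MiB. */
static uint64_t fake_alloc(enum zone zone, size_t size)
{
	(void)size;
	return zone == ZONE_32BIT ? 0x10000000ULL : (1ULL << 40);
}
```

With the toy allocator, a 36-bit mask gets the 256 MiB address through the
retry instead of failing outright, which is the case the iwlwifi report
appears to hit.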

Re: KMSAN: uninit-value in netif_skb_features

2018-04-12 Thread Dmitry Vyukov

On Thu, Apr 12, 2018 at 10:01 AM, syzbot
 wrote:
> Hello,
>
> syzbot hit the following crash on https://github.com/google/kmsan.git/master
> commit
> e2ab7e8abba47a2f2698216258e5d8727ae58717 (Fri Apr 6 16:24:31 2018 +)
> kmsan: temporarily disable visitAsmInstruction() to help syzbot
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=0bbe42c764feafa82c5a
>
> So far this crash happened 30 times on
> https://github.com/google/kmsan.git/master.
> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=4850744041668608
> syzkaller reproducer:
> https://syzkaller.appspot.com/x/repro.syz?id=6289386287136768
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=4577411249209344
> Kernel config:
> https://syzkaller.appspot.com/x/.config?id=6627248707860932248
> compiler: clang version 7.0.0 (trunk 329391)

+Toshiaki as this seems to be related to the recent vlan tagging changes.
This also seems to be related to
https://groups.google.com/d/msg/syzkaller-bugs/FNEavkB4QaM/efXl2AeRBgAJ



> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+0bbe42c764feafa82...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> ==
> BUG: KMSAN: uninit-value in eth_type_vlan include/linux/if_vlan.h:283
> [inline]
> BUG: KMSAN: uninit-value in skb_vlan_tagged_multi
> include/linux/if_vlan.h:656 [inline]
> BUG: KMSAN: uninit-value in vlan_features_check include/linux/if_vlan.h:672
> [inline]
> BUG: KMSAN: uninit-value in dflt_features_check net/core/dev.c:2949 [inline]
> BUG: KMSAN: uninit-value in netif_skb_features+0xd1b/0xdc0
> net/core/dev.c:3009
> CPU: 1 PID: 3582 Comm: syzkaller435149 Not tainted 4.16.0+ #82
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x185/0x1d0 lib/dump_stack.c:53
>  kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
>  __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
>  eth_type_vlan include/linux/if_vlan.h:283 [inline]
>  skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline]
>  vlan_features_check include/linux/if_vlan.h:672 [inline]
>  dflt_features_check net/core/dev.c:2949 [inline]
>  netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009
>  validate_xmit_skb+0x89/0x1320 net/core/dev.c:3084
>  __dev_queue_xmit+0x1cb2/0x2b60 net/core/dev.c:3549
>  dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590
>  packet_snd net/packet/af_packet.c:2944 [inline]
>  packet_sendmsg+0x7c57/0x8a10 net/packet/af_packet.c:2969
>  sock_sendmsg_nosec net/socket.c:630 [inline]
>  sock_sendmsg net/socket.c:640 [inline]
>  sock_write_iter+0x3b9/0x470 net/socket.c:909
>  do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
>  do_iter_write+0x30d/0xd40 fs/read_write.c:932
>  vfs_writev fs/read_write.c:977 [inline]
>  do_writev+0x3c9/0x830 fs/read_write.c:1012
>  SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
>  SyS_writev+0x56/0x80 fs/read_write.c:1082
>  do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> RIP: 0033:0x43ffa9
> RSP: 002b:7fff2cff3948 EFLAGS: 0217 ORIG_RAX: 0014
> RAX: ffda RBX: 004002c8 RCX: 0043ffa9
> RDX: 0001 RSI: 2080 RDI: 0003
> RBP: 006cb018 R08:  R09: 
> R10:  R11: 0217 R12: 004018d0
> R13: 00401960 R14:  R15: 
>
> Uninit was created at:
>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
>  kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
>  kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
>  kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
>  slab_post_alloc_hook mm/slab.h:445 [inline]
>  slab_alloc_node mm/slub.c:2737 [inline]
>  __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
>  __kmalloc_reserve net/core/skbuff.c:138 [inline]
>  __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
>  alloc_skb include/linux/skbuff.h:984 [inline]
>  alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
>  sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
>  packet_alloc_skb net/packet/af_packet.c:2803 [inline]
>  packet_snd net/packet/af_packet.c:2894 [inline]
>  packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969
>  sock_sendmsg_nosec net/socket.c:630 [inline]
>  sock_sendmsg net/socket.c:640 [inline]
>  sock_write_iter+0x3b9/0x470 net/socket.c:909
>  do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
>  do_iter_write+0x30d/0xd40 fs/read_write.c:932
>  vfs_writev fs/read_write.c:977 [inline]
>  do_writev+0x3c9/0x830 fs/read_write.c:1012
>  SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
>  SyS_writev+0x56/0x80 fs/read_write.c:1082
>  do_syscall_64+0x309/0x430 arch/x86

Re: [PATCH] mmap.2: document new MAP_FIXED_NOREPLACE flag

2018-04-12 Thread Michael Kerrisk (man-pages)

Hello Michal,

On 04/11/2018 02:04 PM, mho...@kernel.org wrote:
> From: Michal Hocko 
> 
> 4.17+ kernels offer a new MAP_FIXED_NOREPLACE flag which allows the caller to
> atomically probe for a given address range.
> 
> [wording heavily updated by John Hubbard ]
> Signed-off-by: Michal Hocko 

Thanks! I've applied your patch, and done a little tweaking. The results
have already been pushed.

Cheers

Michael


> ---
> Hi,
> Andrew's sent the MAP_FIXED_NOREPLACE to Linus for the upcoming merge
> window. So here we go with the man page update.
> 
>  man2/mmap.2 | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/man2/mmap.2 b/man2/mmap.2
> index ea64eb8f0dcc..f702f3e4eba2 100644
> --- a/man2/mmap.2
> +++ b/man2/mmap.2
> @@ -261,6 +261,27 @@ Examples include
>  and the PAM libraries
>  .UR http://www.linux-pam.org
>  .UE .
> +Newer kernels
> +(Linux 4.17 and later) have a
> +.B MAP_FIXED_NOREPLACE
> +option that avoids the corruption problem; if available, MAP_FIXED_NOREPLACE
> +should be preferred over MAP_FIXED.
> +.TP
> +.BR MAP_FIXED_NOREPLACE " (since Linux 4.17)"
> +Similar to MAP_FIXED with respect to the
> +.I
> +addr
> +enforcement, but different in that MAP_FIXED_NOREPLACE never clobbers a pre-existing
> +mapped range. If the requested range would collide with an existing
> +mapping, then this call fails with
> +.B EEXIST.
> +This flag can therefore be used as a way to atomically (with respect to other
> +threads) attempt to map an address range: one thread will succeed; all others
> +will report failure. Please note that older kernels which do not recognize this
> +flag will typically (upon detecting a collision with a pre-existing mapping)
> +fall back to a "non-MAP_FIXED" type of behavior: they will return an address that
> +is different than the requested one. Therefore, backward-compatible software
> +should check the returned address against the requested address.
>  .TP
>  .B MAP_GROWSDOWN
>  This flag is used for stacks.
> @@ -487,6 +508,12 @@ is not a valid file descriptor (and
>  .B MAP_ANONYMOUS
>  was not set).
>  .TP
> +.B EEXIST
> +range covered by
> +.IR addr ,
> +.IR length
> +is clashing with an existing mapping.
> +.TP
>  .B EINVAL
>  We don't like
>  .IR addr ,
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Re: KMSAN: uninit-value in __netif_receive_skb_core

2018-04-12 Thread Dmitry Vyukov

On Thu, Apr 12, 2018 at 10:01 AM, syzbot
 wrote:
> Hello,
>
> syzbot hit the following crash on https://github.com/google/kmsan.git/master
> commit
> e2ab7e8abba47a2f2698216258e5d8727ae58717 (Fri Apr 6 16:24:31 2018 +)
> kmsan: temporarily disable visitAsmInstruction() to help syzbot
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=b202b7208664142954fa
>
> Unfortunately, I don't have any reproducer for this crash yet.
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=535651643762
> Kernel config:
> https://syzkaller.appspot.com/x/.config?id=6627248707860932248
> compiler: clang version 7.0.0 (trunk 329391)

+Toshiaki as this seems to be related to the recent vlan tagging changes.
This also seems to be related to
https://groups.google.com/d/msg/syzkaller-bugs/VRH9NnUi2k0/90GYsAeRBgAJ

> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+b202b720866414295...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> ==
> BUG: KMSAN: uninit-value in __read_once_size include/linux/compiler.h:197
> [inline]
> BUG: KMSAN: uninit-value in deliver_ptype_list_skb net/core/dev.c:1908
> [inline]
> BUG: KMSAN: uninit-value in __netif_receive_skb_core+0x4630/0x4a80
> net/core/dev.c:4545
> CPU: 0 PID: 5999 Comm: syz-executor3 Not tainted 4.16.0+ #82
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x185/0x1d0 lib/dump_stack.c:53
>  kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
>  __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
>  __read_once_size include/linux/compiler.h:197 [inline]
>  deliver_ptype_list_skb net/core/dev.c:1908 [inline]
>  __netif_receive_skb_core+0x4630/0x4a80 net/core/dev.c:4545
>  __netif_receive_skb net/core/dev.c:4627 [inline]
>  process_backlog+0x62d/0xe20 net/core/dev.c:5307
>  napi_poll net/core/dev.c:5705 [inline]
>  net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
>  __do_softirq+0x56d/0x93d kernel/softirq.c:285
>  do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1040
>  
>  do_softirq kernel/softirq.c:329 [inline]
>  __local_bh_enable_ip+0x114/0x140 kernel/softirq.c:182
>  local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
>  rcu_read_unlock_bh include/linux/rcupdate.h:726 [inline]
>  __dev_queue_xmit+0x2a31/0x2b60 net/core/dev.c:3584
>  dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590
>  packet_snd net/packet/af_packet.c:2944 [inline]
>  packet_sendmsg+0x7c57/0x8a10 net/packet/af_packet.c:2969
>  sock_sendmsg_nosec net/socket.c:630 [inline]
>  sock_sendmsg net/socket.c:640 [inline]
>  sock_write_iter+0x3b9/0x470 net/socket.c:909
>  do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
>  do_iter_write+0x30d/0xd40 fs/read_write.c:932
>  vfs_writev fs/read_write.c:977 [inline]
>  do_writev+0x3c9/0x830 fs/read_write.c:1012
>  SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
>  SyS_writev+0x56/0x80 fs/read_write.c:1082
>  do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> RIP: 0033:0x455259
> RSP: 002b:7fb53ede8c68 EFLAGS: 0246 ORIG_RAX: 0014
> RAX: ffda RBX: 7fb53ede96d4 RCX: 00455259
> RDX: 0001 RSI: 200010c0 RDI: 0013
> RBP: 0072bea0 R08:  R09: 
> R10:  R11: 0246 R12: 
> R13: 06cd R14: 006fd3d8 R15: 
>
> Uninit was stored to memory at:
>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
>  kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
>  kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
>  __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
>  skb_vlan_untag+0x950/0xee0 include/linux/if_vlan.h:597
>  __netif_receive_skb_core+0x70a/0x4a80 net/core/dev.c:4460
>  __netif_receive_skb net/core/dev.c:4627 [inline]
>  process_backlog+0x62d/0xe20 net/core/dev.c:5307
>  napi_poll net/core/dev.c:5705 [inline]
>  net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
>  __do_softirq+0x56d/0x93d kernel/softirq.c:285
> Uninit was created at:
>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
>  kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
>  kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
>  kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
>  slab_post_alloc_hook mm/slab.h:445 [inline]
>  slab_alloc_node mm/slub.c:2737 [inline]
>  __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
>  __kmalloc_reserve net/core/skbuff.c:138 [inline]
>  __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
>  alloc_skb include/linux/skbuff.h:984 [inline]
>  alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
>  sock_alloc_send_pskb+0xb56/0x1

Re: [Xen-devel] [PATCH] x86/xen: zero MSR_IA32_SPEC_CTRL before suspend

2018-04-12 Thread Ingo Molnar


* Jan Beulich  wrote:

> >>> On 12.04.18 at 09:32,  wrote:
> 
> > * Jan Beulich  wrote:
> > 
> >> >>> On 11.04.18 at 13:53,  wrote:
> >> > * Jan Beulich  wrote:
> >> > 
> >> >> Additionally, x86 maintainers: is there a particular reason this (or
> >> >> any functionally equivalent patch) isn't upstream yet? As indicated
> >> >> before, I had not been able to find any discussion, and hence I
> >> >> see no reason why this is a patch we effectively carry privately in
> >> >> our distro branches (and likely other distros do so too).
> >> > 
> >> > The patch was merged 6 weeks ago and is now upstream:
> >> > 
> >> >   71c208dd54ab: x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
> >> 
> >> I'm sorry, but no, this isn't the patch I was inquiring about.
> >> Instead I'm wondering of the disposition of the patch disabling
> >> IBRS around a CPU going idle.
> > 
> > Got any specific link or subject line for that submission?
> 
> Sure, as written in the original response to Jürgen's patch:
> https://patchwork.kernel.org/patch/10153843/

Argh, indeed you did!

In any case, this submission from Tim Chen:

   [PATCH v3 0/5] IBRS patch series

Contained a glaring bug in patch #2 which Thomas pointed out, and AFAICS the 
series was never resubmitted to lkml so it got lost.

In any case thanks for the reminder!

Thanks,

Ingo

Re: [PATCH v3 0/5] V3M-Eagle HDMI output enablement

2018-04-12 Thread Simon Horman

On Wed, Apr 11, 2018 at 02:43:05PM +0200, Jacopo Mondi wrote:
> Hello,
>I have rebased the Eagle display enablement on top of (part of) Sergei's
> series:
>  [PATCH v2 0/5] Add R8A77970/V3MSK LVDS/HDMI support

Hi Jacopo,

the emails with the patches of this series do not
seem to have hit my inbox or patchwork.

Re: [PATCH] mmap.2: document new MAP_FIXED_NOREPLACE flag

2018-04-12 Thread Michael Kerrisk (man-pages)

On 04/11/2018 06:40 PM, Jann Horn wrote:
> On Wed, Apr 11, 2018 at 6:36 PM, Michal Hocko  wrote:
>> On Wed 11-04-18 17:37:46, Jann Horn wrote:
>>> On Wed, Apr 11, 2018 at 2:04 PM,   wrote:
>>>> From: Michal Hocko
>>>>
>>>> 4.17+ kernels offer a new MAP_FIXED_NOREPLACE flag which allows the caller to
>>>> atomically probe for a given address range.
>>>>
>>>> [wording heavily updated by John Hubbard ]
>>>> Signed-off-by: Michal Hocko
>>>> ---
>>>> Hi,
>>>> Andrew's sent the MAP_FIXED_NOREPLACE to Linus for the upcoming merge
>>>> window. So here we go with the man page update.
>>>>
>>>>  man2/mmap.2 | 27 +++
>>>>  1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/man2/mmap.2 b/man2/mmap.2
>>>> index ea64eb8f0dcc..f702f3e4eba2 100644
>>>> --- a/man2/mmap.2
>>>> +++ b/man2/mmap.2
>>>> @@ -261,6 +261,27 @@ Examples include
>>>>  and the PAM libraries
>>>>  .UR http://www.linux-pam.org
>>>>  .UE .
>>>> +Newer kernels
>>>> +(Linux 4.17 and later) have a
>>>> +.B MAP_FIXED_NOREPLACE
>>>> +option that avoids the corruption problem; if available, MAP_FIXED_NOREPLACE
>>>> +should be preferred over MAP_FIXED.
>>>
>>> This still looks wrong to me. There are legitimate uses for MAP_FIXED,
>>> and for most users of MAP_FIXED that I'm aware of, MAP_FIXED_NOREPLACE
>>> wouldn't work while MAP_FIXED works perfectly well.
>>>
>>> MAP_FIXED is for when you have already reserved the targeted memory
>>> area using another VMA; MAP_FIXED_NOREPLACE is for when you haven't.
>>> Please don't make it sound as if MAP_FIXED is always wrong.
>>
>> Well, this was suggested by John. I think, nobody is objecting that
>> MAP_FIXED has legitimate usecases. The above text just follows up on
>> the previous section which emphasises the potential memory corruption
>> problems and it suggests that a new flag is safe with that regards.
>>
>> If you have specific wording that would be better I am open for changes.
> 
> I guess I'd probably also want to change the previous text; so I
> should probably send a followup patch once this one has landed.
> 
>>>> +.TP
>>>> +.BR MAP_FIXED_NOREPLACE " (since Linux 4.17)"
>>>> +Similar to MAP_FIXED with respect to the
>>>> +.I
>>>> +addr
>>>> +enforcement, but different in that MAP_FIXED_NOREPLACE never clobbers a pre-existing
>>>> +mapped range. If the requested range would collide with an existing
>>>> +mapping, then this call fails with
>>>> +.B EEXIST.
>>>> +This flag can therefore be used as a way to atomically (with respect to other
>>>> +threads) attempt to map an address range: one thread will succeed; all others
>>>> +will report failure. Please note that older kernels which do not recognize this
>>>> +flag will typically (upon detecting a collision with a pre-existing mapping)
>>>> +fall back to a "non-MAP_FIXED" type of behavior: they will return an address that
>>>> +is different than the requested one. Therefore, backward-compatible software
>>>> +should check the returned address against the requested address.
>>>>  .TP
>>>>  .B MAP_GROWSDOWN
>>>>  This flag is used for stacks.
>>>> @@ -487,6 +508,12 @@ is not a valid file descriptor (and
>>>>  .B MAP_ANONYMOUS
>>>>  was not set).
>>>>  .TP
>>>> +.B EEXIST
>>>> +range covered by
>>>> +.IR addr ,
>>>> +.IR length
>>>> +is clashing with an existing mapping.
>>>
>>> Maybe add something like ", and MAP_FIXED_NOREPLACE was specified"? I
>>> think most manpages explicitly document which error conditions can be
>>> triggered by which flags.
>>
>> sure, no objection from me.

I've added the suggested piece from Jann to the EEXIST error description.

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Re: [PATCH] mmap.2: document new MAP_FIXED_NOREPLACE flag

2018-04-12 Thread Michael Kerrisk (man-pages)

Hi Jann,

On 04/11/2018 06:40 PM, Jann Horn wrote:
> On Wed, Apr 11, 2018 at 6:36 PM, Michal Hocko  wrote:
>> On Wed 11-04-18 17:37:46, Jann Horn wrote:
>>> On Wed, Apr 11, 2018 at 2:04 PM,   wrote:
>>>> From: Michal Hocko
>>>>
>>>> 4.17+ kernels offer a new MAP_FIXED_NOREPLACE flag which allows the caller to
>>>> atomically probe for a given address range.
>>>>
>>>> [wording heavily updated by John Hubbard ]
>>>> Signed-off-by: Michal Hocko
>>>> ---
>>>> Hi,
>>>> Andrew's sent the MAP_FIXED_NOREPLACE to Linus for the upcoming merge
>>>> window. So here we go with the man page update.
>>>>
>>>>  man2/mmap.2 | 27 +++
>>>>  1 file changed, 27 insertions(+)
>>>>
>>>> diff --git a/man2/mmap.2 b/man2/mmap.2
>>>> index ea64eb8f0dcc..f702f3e4eba2 100644
>>>> --- a/man2/mmap.2
>>>> +++ b/man2/mmap.2
>>>> @@ -261,6 +261,27 @@ Examples include
>>>>  and the PAM libraries
>>>>  .UR http://www.linux-pam.org
>>>>  .UE .
>>>> +Newer kernels
>>>> +(Linux 4.17 and later) have a
>>>> +.B MAP_FIXED_NOREPLACE
>>>> +option that avoids the corruption problem; if available, MAP_FIXED_NOREPLACE
>>>> +should be preferred over MAP_FIXED.
>>>
>>> This still looks wrong to me. There are legitimate uses for MAP_FIXED,
>>> and for most users of MAP_FIXED that I'm aware of, MAP_FIXED_NOREPLACE
>>> wouldn't work while MAP_FIXED works perfectly well.
>>>
>>> MAP_FIXED is for when you have already reserved the targeted memory
>>> area using another VMA; MAP_FIXED_NOREPLACE is for when you haven't.
>>> Please don't make it sound as if MAP_FIXED is always wrong.
>>
>> Well, this was suggested by John. I think, nobody is objecting that
>> MAP_FIXED has legitimate usecases. The above text just follows up on
>> the previous section which emphasises the potential memory corruption
>> problems and it suggests that a new flag is safe with that regards.
>>
>> If you have specific wording that would be better I am open for changes.
> 
> I guess I'd probably also want to change the previous text; so I
> should probably send a followup patch once this one has landed.

Okay -- I'm ready to take that piece now. Please send me a patch!

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Re: [PATCH v3 0/5] V3M-Eagle HDMI output enablement

2018-04-12 Thread jacopo mondi

Hi Simon,

On Thu, Apr 12, 2018 at 10:08:27AM +0200, Simon Horman wrote:
> On Wed, Apr 11, 2018 at 02:43:05PM +0200, Jacopo Mondi wrote:
> > Hello,
> >I have rebased the Eagle display enablement on top of (part of) Sergei's
> > series:
> >  [PATCH v2 0/5] Add R8A77970/V3MSK LVDS/HDMI support
>
> Hi Jacopo,
>
> the emails with the patches of this series do not
> seem to have hit my inbox or patchwork.
>

Damn, I've noticed! I have no idea what happened. I have 3 out of 5
patches in my inbox, but it seems most of them didn't go out.

ISP problems again? :(

I'll resend right away


Re: [PATCH] checkpatch: Add a --strict test for structs with bool member definitions

2018-04-12 Thread Peter Zijlstra

On Thu, Apr 12, 2018 at 09:47:19AM +0200, Ingo Molnar wrote:
> lkml.org is nice in emails that have a short expected life time and relevance 
> - 

I like lkml.org's archive (although it's not without its problems), but
the site suffers from serious availability issues -- it is down a lot,
which is _really_ tedious.

[PATCH v3 0/5] V3M-Eagle HDMI output enablement

2018-04-12 Thread Jacopo Mondi

Hello,
   I have rebased the Eagle display enablement on top of (part of) Sergei's
series:
 [PATCH v2 0/5] Add R8A77970/V3MSK LVDS/HDMI support

Simon: you can skip "[1/5]  arm64: dts: renesas: r8a77970: add FCPVD support"
as you already collected that

Sergei: I re-sent your series because there was an additional comment from
Laurent on [3/5]. I felt it was wrong to send a follow up patch on a series
still not collected by Simon, so I've resent it. Hope this time is ok with you.
Also, please note that [5/5] of your original series shall be re-sent using
the newly introduced (still in-review) LVDS decoder. Please see [5/5] of this
series as an example.

Niklas: [5/5] of this series is a fixup of your patches and mine. I added
your signed-off-by, hope it is ok.

The series depends on THC63LVD1024 driver, currently submitted for inclusion
"[PATCH v8 0/2]  drm: Add Thine THC63LVD1024 LVDS decoder bridge"
currently available at:
git://jmondi.org/linux lvds-bridge/linus-master/v8

Thanks
   j

v2 -> v3:
- Use Sergei's series for patches [1-4] with a minor comment from Laurent
- Remove the lvds-decoder node label and add Laurent's Reviewed-by in [5/5]

v1 -> v2:
- Add Laurent's reviewed by tags
- Fixup patch 5, 6 and 7 of v1
- Remove DU digital output pin muxing
- Update thc63lvd1024 to use the new bindings with mandatory power supply
- Minor fixes (changes are described individually in each patch)

Jacopo Mondi (1):
  arm64: dts: renesas: eagle: Enable HDMI output

Sergei Shtylyov (4):
  arm64: dts: renesas: r8a77970: add FCPVD support
  arm64: dts: renesas: r8a77970: add VSPD support
  arm64: dts: renesas: r8a77970: add DU support
  arm64: dts: renesas: r8a77970: add LVDS support

 arch/arm64/boot/dts/renesas/r8a77970-eagle.dts | 93 ++
 arch/arm64/boot/dts/renesas/r8a77970.dtsi  | 75 +
 2 files changed, 168 insertions(+)

--
2.7.4

Re: [PATCH] checkpatch: Add a --strict test for structs with bool member definitions

2018-04-12 Thread Peter Zijlstra

On Wed, Apr 11, 2018 at 11:42:20PM -0700, Joe Perches wrote:
> I personally do not find a significant issue with
> uncontrolled sizes of bool in kernel structs as
> all of the kernel structs are transitory and not
> written out to storage.

People that care about cache locality, false sharing and other such
things really care about structure layout.

Growing a structure into another cacheline can be a significant
performance hit -- cache misses hurt.
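
As a rough illustration of the layout concern, the following C sketch
compares two hypothetical structs that hold the same members in a different
order, and counts how many cache lines an object of a given size spans. The
struct names, the helper cache_lines_spanned() and the idea of passing the
line size as a parameter are all assumptions made for this example; actual
sizes depend on the ABI.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: each bool is followed by padding up to the
 * alignment of the next long. */
struct interleaved {
    bool a;
    long x;
    bool b;
    long y;
};

/* Same members, with the bools grouped at the end so they share padding. */
struct grouped {
    long x;
    long y;
    bool a;
    bool b;
};

/*
 * Number of cache lines an object of `size` bytes touches when it starts
 * on a cache-line boundary. `line_size` is supplied by the caller.
 */
static size_t cache_lines_spanned(size_t size, size_t line_size)
{
    return (size + line_size - 1) / line_size;
}
```

On a typical x86-64 ABI sizeof(struct interleaved) is 32 and
sizeof(struct grouped) is 24, and cache_lines_spanned(65, 64) is 2 while
cache_lines_spanned(64, 64) is 1: a single extra byte is enough to pull in
another cache line.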

[PATCH v3 3/5] arm64: dts: renesas: r8a77970: add DU support

2018-04-12 Thread Jacopo Mondi

From: Sergei Shtylyov 

Define the generic R8A77970 part of the DU device node.

Based on the original (and large) patch by Daisuke Matsushita
.

Signed-off-by: Vladimir Barinov 
Signed-off-by: Sergei Shtylyov 
Reviewed-by: Laurent Pinchart 
---
 arch/arm64/boot/dts/renesas/r8a77970.dtsi | 29 +
 1 file changed, 29 insertions(+)

diff --git a/arch/arm64/boot/dts/renesas/r8a77970.dtsi b/arch/arm64/boot/dts/renesas/r8a77970.dtsi
index a3ef3bd..5860b0fb 100644
--- a/arch/arm64/boot/dts/renesas/r8a77970.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a77970.dtsi
@@ -635,6 +635,35 @@
resets = <&cpg 623>;
renesas,fcp = <&fcpvd0>;
};
+
+   du: display@feb0 {
+   compatible = "renesas,du-r8a77970";
+   reg = <0 0xfeb0 0 0x8>;
+   interrupts = ;
+   clocks = <&cpg CPG_MOD 724>;
+   clock-names = "du.0";
+   power-domains = <&sysc R8A77970_PD_ALWAYS_ON>;
+   resets = <&cpg 724>;
+   vsps = <&vspd0>;
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+   du_out_rgb: endpoint {
+   };
+   };
+
+   port@1 {
+   reg = <1>;
+   du_out_lvds0: endpoint {
+   };
+   };
+   };
+   };
};
 
timer {
-- 
2.7.4

[PATCH v3 5/5] arm64: dts: renesas: eagle: Enable HDMI output

2018-04-12 Thread Jacopo Mondi

Enable HDMI output on Renesas R-Car V3M Eagle board.

The HDMI output is enabled by connecting the DU LVDS output to the
transparent LVDS converter THC63LVD1024, and then routing its
RGB output to the ADV7511W HDMI encoder.

Signed-off-by: Niklas Söderlund 
Signed-off-by: Jacopo Mondi 
Reviewed-by: Laurent Pinchart 
[for THC63LVD1024: ]
Reviewed-by: Andrzej Hajda 

---
v1 -> v2:
- Squash patches [5/7], [6/7] and [7/7] of v1 in a single patch as
  suggested by Laurent
- Remove DU pinmuxing as it is used for DU parallel RGB output only used
  by Eagle's display expander board not enabled by this series.
---
 arch/arm64/boot/dts/renesas/r8a77970-eagle.dts | 93 ++
 1 file changed, 93 insertions(+)

diff --git a/arch/arm64/boot/dts/renesas/r8a77970-eagle.dts b/arch/arm64/boot/dts/renesas/r8a77970-eagle.dts
index 3c5f598..ebfbb51 100644
--- a/arch/arm64/boot/dts/renesas/r8a77970-eagle.dts
+++ b/arch/arm64/boot/dts/renesas/r8a77970-eagle.dts
@@ -31,6 +31,51 @@
/* first 128MB is reserved for secure area. */
reg = <0x0 0x4800 0x0 0x3800>;
};
+
+   hdmi-out {
+   compatible = "hdmi-connector";
+   type = "a";
+
+   port {
+   hdmi_con_out: endpoint {
+   remote-endpoint = <&adv7511_out>;
+   };
+   };
+   };
+
+   d3p3: regulator-fixed {
+   compatible = "regulator-fixed";
+   regulator-name = "fixed-3.3V";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   regulator-boot-on;
+   regulator-always-on;
+   };
+
+   lvds-decoder {
+   compatible = "thine,thc63lvd1024";
+
+   vcc-supply = <&d3p3>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+   thc63lvd1024_in: endpoint {
+   remote-endpoint = <&lvds0_out>;
+   };
+   };
+
+   port@2 {
+   reg = <2>;
+   thc63lvd1024_out: endpoint {
+   remote-endpoint = <&adv7511_in>;
+   };
+   };
+   };
+   };
 };
 
 &avb {
@@ -68,6 +113,38 @@
gpio-controller;
#gpio-cells = <2>;
};
+
+   hdmi@39 {
+   compatible = "adi,adv7511w";
+   reg = <0x39>;
+   interrupt-parent = <&gpio1>;
+   interrupts = <20 IRQ_TYPE_LEVEL_LOW>;
+
+   adi,input-depth = <8>;
+   adi,input-colorspace = "rgb";
+   adi,input-clock = "1x";
+   adi,input-style = <1>;
+   adi,input-justification = "evenly";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+   adv7511_in: endpoint {
+   remote-endpoint = <&thc63lvd1024_out>;
+   };
+   };
+
+   port@1 {
+   reg = <1>;
+   adv7511_out: endpoint {
+   remote-endpoint = <&hdmi_con_out>;
+   };
+   };
+   };
+   };
 };
 
 &pfc {
@@ -93,3 +170,19 @@
 
status = "okay";
 };
+
+&du {
+   status = "okay";
+};
+
+&lvds0 {
+   status = "okay";
+
+   ports {
+   port@1 {
+   lvds0_out: endpoint {
+   remote-endpoint = <&thc63lvd1024_in>;
+   };
+   };
+   };
+};
-- 
2.7.4

[PATCH v3 4/5] arm64: dts: renesas: r8a77970: add LVDS support

2018-04-12 Thread Jacopo Mondi

From: Sergei Shtylyov 

Define the generic R8A77970 part of the LVDS device node.

Signed-off-by: Sergei Shtylyov 
---
 arch/arm64/boot/dts/renesas/r8a77970.dtsi | 28 
 1 file changed, 28 insertions(+)

diff --git a/arch/arm64/boot/dts/renesas/r8a77970.dtsi b/arch/arm64/boot/dts/renesas/r8a77970.dtsi
index 5860b0fb..614b571 100644
--- a/arch/arm64/boot/dts/renesas/r8a77970.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a77970.dtsi
@@ -660,6 +660,34 @@
port@1 {
reg = <1>;
du_out_lvds0: endpoint {
+   remote-endpoint = <&lvds0_in>;
+   };
+   };
+   };
+   };
+
+   lvds0: lvds-encoder@feb9 {
+   compatible = "renesas,r8a77970-lvds";
+   reg = <0 0xfeb9 0 0x14>;
+   clocks = <&cpg CPG_MOD 727>;
+   power-domains = <&sysc R8A77970_PD_ALWAYS_ON>;
+   resets = <&cpg 727>;
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+   lvds0_in: endpoint {
+   remote-endpoint =
+   <&du_out_lvds0>;
+   };
+   };
+   port@1 {
+   reg = <1>;
+   lvds0_out: endpoint {
};
};
};
-- 
2.7.4

[PATCH v3 1/5] arm64: dts: renesas: r8a77970: add FCPVD support

2018-04-12 Thread Jacopo Mondi

From: Sergei Shtylyov 

Describe FCPVD0 in the R8A77970 device tree; it will be used by VSPD0 in
the next patch...

Based on the original (and large) patch by Daisuke Matsushita
.

Signed-off-by: Vladimir Barinov 
Signed-off-by: Sergei Shtylyov 
Signed-off-by: Niklas Söderlund 
---
 arch/arm64/boot/dts/renesas/r8a77970.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/renesas/r8a77970.dtsi b/arch/arm64/boot/dts/renesas/r8a77970.dtsi
index c6db8ea..97c27ef 100644
--- a/arch/arm64/boot/dts/renesas/r8a77970.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a77970.dtsi
@@ -617,6 +617,14 @@
#address-cells = <1>;
#size-cells = <0>;
};
+
+   fcpvd0: fcp@fea27000 {
+   compatible = "renesas,fcpv";
+   reg = <0 0xfea27000 0 0x200>;
+   clocks = <&cpg CPG_MOD 603>;
+   power-domains = <&sysc R8A77970_PD_ALWAYS_ON>;
+   resets = <&cpg 603>;
+   };
};
 
timer {
-- 
2.7.4

[PATCH v3 2/5] arm64: dts: renesas: r8a77970: add VSPD support

2018-04-12 Thread Jacopo Mondi

From: Sergei Shtylyov 

Describe VSPD0 in the R8A77970 device tree; it will be used by DU in
the next patch...

Based on the original (and large) patch by Daisuke Matsushita
.

Signed-off-by: Vladimir Barinov 
Signed-off-by: Sergei Shtylyov 
Signed-off-by: Niklas Söderlund 
Signed-off-by: Jacopo Mondi 
Reviewed-by: Laurent Pinchart 

---
v1 -> v2 (Jacopo) :
- Extend the memory region to include V6_CLUTn_TBL* registers.
---
 arch/arm64/boot/dts/renesas/r8a77970.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/renesas/r8a77970.dtsi b/arch/arm64/boot/dts/renesas/r8a77970.dtsi
index 97c27ef..a3ef3bd 100644
--- a/arch/arm64/boot/dts/renesas/r8a77970.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a77970.dtsi
@@ -625,6 +625,16 @@
power-domains = <&sysc R8A77970_PD_ALWAYS_ON>;
resets = <&cpg 603>;
};
+
+   vspd0: vsp@fea2 {
+   compatible = "renesas,vsp2";
+   reg = <0 0xfea2 0 0x8000>;
+   interrupts = ;
+   clocks = <&cpg CPG_MOD 623>;
+   power-domains = <&sysc R8A77970_PD_ALWAYS_ON>;
+   resets = <&cpg 623>;
+   renesas,fcp = <&fcpvd0>;
+   };
};
 
timer {
-- 
2.7.4

Re: [PATCH v2] staging: wilc1000: Remove unnecessary braces {} around single statement block

2018-04-12 Thread Claudiu Beznea



On 12.04.2018 10:59, Eyal Ilsar wrote:
> Remove unnecessary braces {} around an 'if' statement block with a single 
> statement. Issue found by checkpatch.
> 
> Signed-off-by: Eyal Ilsar 

Reviewed-by: Claudiu Beznea 

> ---
> Added an empty line before the 'Signed-off-by' line and a space between the
> name and e-mail address within that line.
> 
>  drivers/staging/wilc1000/wilc_wfi_cfgoperations.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
> index 205304c..325afe1 100644
> --- a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
> +++ b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
> @@ -284,9 +284,8 @@ static void remove_network_from_shadow(struct timer_list *unused)
>   }
>   }
>  
> - if (last_scanned_cnt != 0) {
> + if (last_scanned_cnt != 0)
>   mod_timer(&hAgingTimer, jiffies + msecs_to_jiffies(AGING_TIME));
> - }
>  }
>  
>  static void clear_duringIP(struct timer_list *unused)
>

Re: [PATCH] gpio: xra1403: Switch to a fixed upper bound for registers

2018-04-12 Thread Linus Walleij

On Mon, Apr 9, 2018 at 11:07 PM, Laura Abbott  wrote:

> Geert Uytterhoeven pointed out that the number of registers has a
> fixed upper bound so there's no need to use a dynamically allocated
> array in place of a VLA. Use the defined upper bound.
>
> Suggested-by: Geert Uytterhoeven 
> Signed-off-by: Laura Abbott 

Patch applied for v4.18!

Yours,
Linus Walleij

Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures

2018-04-12 Thread Takashi Iwai

On Thu, 12 Apr 2018 10:03:56 +0200,
Takashi Iwai wrote:
> 
> On Thu, 12 Apr 2018 08:02:27 +0200,
> Christoph Hellwig wrote:
> > 
> > On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote:
> > > > But we should try a GFP_DMA32 allocation first, so this is a bit
> > > > surprising.
> > > 
> > > Hm, do we really try that?
> > > Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32
> > > only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi,
> > > it's 36bit, so GFP_DMA isn't set.
> > 
> > Oh, yes - it is using an odd dma mask, and amdgpu seems to use a
> > just as odd 40-bit dma mask.
> > 
> > > We had a fallback allocation with GFP_DMA32 in the past, but this
> > > seems to have gone a long time ago along with cleanups (commit c647c3bb2d16).
> > > 
> > > But I haven't followed this topic for a long time, so I might have
> > > missed something obvious...
> > 
> > I think a fallback would be much better here rather than relying on the
> > limited swiotlb buffer pool.  dma_direct_alloc (which in 4.17 is also
> > used for x86) already has a GFP_DMA fallback, so extending this for
> > GFP_DMA32 as well would seem reasonable.
> > 
> > Any volunteers?
> 
> Below is a quick attempt, totally untested.  Actually the retry with
> GFP_DMA is superfluous for archs without it, so the first patch
> corrects it.

Gah, scratch this, it doesn't work.  A different check is needed...


Takashi
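
To make the mask comparison discussed above concrete, here is a small
userspace C model. It is a sketch, not kernel code: dma_bit_mask() mirrors
the arithmetic of the kernel's DMA_BIT_MASK(n) macro, while the enum and
zone_hint_for_mask() are hypothetical names that only imitate the "pick a
zone from the coherent mask" decision described in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for DMA_BIT_MASK(n): a value with the low n bits set. */
static uint64_t dma_bit_mask(unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* Hypothetical labels for the zone restriction an allocation would get. */
enum zone_hint {
    ZONE_HINT_NONE,     /* no zone restriction requested */
    ZONE_HINT_DMA32,    /* restrict to memory below 4 GiB */
    ZONE_HINT_DMA       /* restrict to the low 16 MiB */
};

/*
 * Model of the decision: a zone restriction is requested only when the
 * device's coherent mask is at most 24 or 32 bits wide.
 */
static enum zone_hint zone_hint_for_mask(uint64_t coherent_mask)
{
    if (coherent_mask <= dma_bit_mask(24))
        return ZONE_HINT_DMA;
    if (coherent_mask <= dma_bit_mask(32))
        return ZONE_HINT_DMA32;
    return ZONE_HINT_NONE;
}
```

With this model a 36-bit mask (the iwlwifi case mentioned above) and a
40-bit mask (the amdgpu case) both yield ZONE_HINT_NONE, so the first
allocation attempt can return memory above what the device can address,
which is why a GFP_DMA32 retry is being proposed.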

Re: [PATCH] block/amiflop: Don't log an error message for an invalid ioctl

2018-04-12 Thread Geert Uytterhoeven

On Thu, Apr 12, 2018 at 3:23 AM, Finn Thain  wrote:
> Do as the swim3 driver does and just return -ENOTTY.
>
> Cc: Geert Uytterhoeven 
> Cc: linux-m...@lists.linux-m68k.org
> Signed-off-by: Finn Thain 

Reviewed-by:  Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [RFC PATCH v2 1/6] sched/fair: Create util_fits_capacity()

2018-04-12 Thread Dietmar Eggemann


On 04/12/2018 09:02 AM, Viresh Kumar wrote:

> On 06-04-18, 16:36, Dietmar Eggemann wrote:
>> The functionality that a given utilization fits into a given capacity
>> is factored out into a separate function.
>>
>> Currently it is only used in wake_cap() but will be re-used to figure
>> out if a cpu or a scheduler group is over-utilized.
>>
>> Cc: Ingo Molnar
>> Cc: Peter Zijlstra
>> Signed-off-by: Dietmar Eggemann
>> ---
>>   kernel/sched/fair.c | 7 ++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c58d2f..0a76ad2ef022 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6574,6 +6574,11 @@ static unsigned long cpu_util_wake(int cpu, struct task_struct *p)
>>          return min_t(unsigned long, util, capacity_orig_of(cpu));
>>   }
>>
>> +static inline int util_fits_capacity(unsigned long util, unsigned long capacity)
>> +{
>> +   return capacity * 1024 > util * capacity_margin;
>
> This changes the behavior slightly compared to existing code. If that
> wasn't intentional, perhaps you should use >= here.

You're right here ... Already on our v3 list. Thanks!

The 'misfit' patch-set comes with a similar function 
task_fits_capacity() so we have to align on this one with this patch-set 
as well.
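
To make the boundary case behind the ">" versus ">=" remark concrete, here
is a minimal userspace sketch, not kernel code. It assumes a capacity
margin of 1280 (about 20% headroom) and assumes the pre-patch check has
the "does not fit" form shown in old_does_not_fit(); both are
illustrative assumptions, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed margin: 1280/1024 = 1.25, i.e. roughly 20% headroom. */
#define CAPACITY_MARGIN 1280UL

/* Assumed shape of the pre-patch check: true when util does NOT fit. */
static inline bool old_does_not_fit(unsigned long util, unsigned long capacity)
{
	return capacity * 1024 < util * CAPACITY_MARGIN;
}

/* The helper as posted: strict comparison. */
static inline bool fits_strict(unsigned long util, unsigned long capacity)
{
	return capacity * 1024 > util * CAPACITY_MARGIN;
}

/* The suggested variant: >= is the exact negation of the old check. */
static inline bool fits_inclusive(unsigned long util, unsigned long capacity)
{
	return capacity * 1024 >= util * CAPACITY_MARGIN;
}

/* Boundary case: util=4, capacity=5 gives 5*1024 == 4*1280 == 5120. */
static inline void check_boundary(void)
{
	assert(!old_does_not_fit(4, 5));	/* old code: fits */
	assert(!fits_strict(4, 5));		/* posted helper: does not fit */
	assert(fits_inclusive(4, 5));		/* with >=: fits again */
}
```

The two variants only disagree when capacity * 1024 equals
util * margin exactly, which is why the change in behavior is slight.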


[...]

Re: [PATCH] mmap.2: document new MAP_FIXED_NOREPLACE flag

2018-04-12 Thread Michal Hocko

On Thu 12-04-18 10:04:06, Michael Kerrisk wrote:
> Hello Michal,
> 
> On 04/11/2018 02:04 PM, mho...@kernel.org wrote:
> > From: Michal Hocko 
> > 
> > 4.17+ kernels offer a new MAP_FIXED_NOREPLACE flag which allows the caller to
> > atomically probe for a given address range.
> > 
> > [wording heavily updated by John Hubbard ]
> > Signed-off-by: Michal Hocko 
> 
> Thanks! I've applied your patch, and done a little tweaking. The results
> have already been pushed.

Thanks!
-- 
Michal Hocko
SUSE Labs

[PATCH] perf test: Adapt test case record+probe_libc_inet_pton.sh for s390

2018-04-12 Thread Thomas Richter

perf test case 58 (record+probe_libc_inet_pton.sh)
executed on s390x using kernel 4.16.0rc3
displays this result:
 # ./perf trace --no-syscalls
   -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
 probe_libc:inet_pton: (3ffa0240448)
  __GI___inet_pton (/usr/lib64/libc-2.26.so)
  gaih_inet (inlined)
  __GI_getaddrinfo (inlined)
  main (/usr/bin/ping)
  __libc_start_main (/usr/lib64/libc-2.26.so)
 _start (/usr/bin/ping)

After I installed kernel 4.16.0 the same test uses the
commands
 # perf record -e probe_libc:inet_pton/call-graph=dwarf/
  -o /tmp/perf.data.abc ping -6 -c 1 ::1
 # perf script -i /tmp/perf.data.abc
and displays:
 ping 39048 [006] 84230.381198: probe_libc:inet_pton: (3ffa0240448)
   140448 __GI___inet_pton (/usr/lib64/libc-2.26.so)
   fbde1 gaih_inet (inlined)
   fe2b9 __GI_getaddrinfo (inlined)
398d main (/usr/bin/ping)

Nothing else changed including glibc elfutils and other libraries
picked up by the build.
The entries for __libc_start_main and _start are missing.

I bisected missing __libc_start_main and _start to commit
3d20c6246690219881786de10d2dda93f616d0ac
("perf unwind: Unwind with libdw doesn't take symfs into account")

When I undo this commit I get this call stack on s390:
 [root@s35lp76 perf]# ./perf script  -i /tmp/perf.data.abc
 ping 39048 [006] 84230.381198: probe_libc:inet_pton: (3ffa0240448)
140448 __GI___inet_pton (/usr/lib64/libc-2.26.so)
 fbde1 gaih_inet (inlined)
 fe2b9 __GI_getaddrinfo (inlined)
  398d main (/usr/bin/ping)
 22fbd __libc_start_main (/usr/lib64/libc-2.26.so)
  457b _start (/usr/bin/ping)

It looks like the DWARF unwinding functions dwfl_xxx produce a different
backtrace when using file
/usr/lib/debug/usr/bin/ping-20161105-7.fc27.s390x.debug instead of
file /usr/bin/ping.

Fix this test case on s390 and do not expect any backtrace
entry after the main() function. Also be more robust and accept an
optional leading __GI_ prefix in front of getaddrinfo.

On x86 this test case shows the same call stack using
both kernel versions 4.16.0rc3 and 4.16.0 and also
stops at main:

[root@f27 perf]# ./perf script -i /tmp/perf.data.tmr
ping  4446 [000]   172.027088: probe_libc:inet_pton: (7fdfa08c93c0)
   1393c0 __GI___inet_pton (/usr/lib64/libc-2.26.so)
fe60d getaddrinfo (/usr/lib64/libc-2.26.so)
 2f40 main (/usr/bin/ping)

[root@f27 perf]#

Signed-off-by: Thomas Richter 
Reviewed-by: Hendrik Brueckner 
---
 tools/perf/tests/shell/record+probe_libc_inet_pton.sh | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
index 1ecc1f0ff84a..016882dbbc16 100755
--- a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
+++ b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh
@@ -19,12 +19,10 @@ trace_libc_inet_pton_backtrace() {
expected[1]=".*inet_pton[[:space:]]\($libc\)$"
case "$(uname -m)" in
s390x)
-   eventattr='call-graph=dwarf'
+   eventattr='call-graph=dwarf,max-stack=4'
expected[2]="gaih_inet.*[[:space:]]\($libc|inlined\)$"
-   expected[3]="__GI_getaddrinfo[[:space:]]\($libc|inlined\)$"
+   expected[3]="(__GI_)?getaddrinfo[[:space:]]\($libc|inlined\)$"
expected[4]="main[[:space:]]\(.*/bin/ping.*\)$"
-   expected[5]="__libc_start_main[[:space:]]\($libc\)$"
-   expected[6]="_start[[:space:]]\(.*/bin/ping.*\)$"
;;
*)
eventattr='max-stack=3'
-- 
2.14.3
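
The effect of making the __GI_ prefix optional can be checked outside the
test script. The following is a small sketch using POSIX extended regular
expressions from C. The libc path is hard-coded for illustration (the
script derives it at run time), the alternation is grouped explicitly
here, and the names line_matches, OLD_RE and NEW_RE are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative patterns; the libc path is an assumption for this sketch. */
#define LIBC_ALT "\\((/usr/lib64/libc-2\\.26\\.so|inlined)\\)$"
#define OLD_RE "__GI_getaddrinfo[[:space:]]" LIBC_ALT
#define NEW_RE "(__GI_)?getaddrinfo[[:space:]]" LIBC_ALT

/* Return true if the line matches the POSIX extended regex pattern. */
static inline bool line_matches(const char *pattern, const char *line)
{
	regex_t re;
	bool ok;
	int rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);

	assert(rc == 0);	/* the patterns above are expected to compile */
	ok = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}
```

With these patterns, NEW_RE accepts both the s390 line
"fe2b9 __GI_getaddrinfo (inlined)" and the x86 line
"fe60d getaddrinfo (/usr/lib64/libc-2.26.so)", while OLD_RE only accepts
the former.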

Re: [PATCH 24/24] debugfs: Restrict debugfs when the kernel is locked down

2018-04-12 Thread Greg KH

On Wed, Apr 11, 2018 at 07:54:12PM -0700, Andy Lutomirski wrote:
> On Wed, Apr 11, 2018 at 1:33 PM, Greg KH  wrote:
> > On Wed, Apr 11, 2018 at 09:09:16PM +0100, David Howells wrote:
> >> Greg KH  wrote:
> >>
> >> > Why not just disable debugfs entirely?  This half-hearted way to sorta
> >> > lock it down is odd, it is meant to not be there at all, nothing in your
> >> > normal system should ever depend on it.
> >> >
> >> > So again just don't allow it to be mounted at all, much simpler and more
> >> > obvious as to what is going on.
> >>
> >> Yeah, I agree - and then I got complaints because it seems that it's been
> >> abused to allow drivers and userspace components to communicate.
> >
> > With in-kernel code?  Please let me know and I'll go fix it up to not
> > allow that, as that is not ok.
> >
> > I do know of some bad examples of out-of-tree code abusing debugfs to do
> > crazy things (battery level monitoring?), but that's their own fault...
> >
> > debugfs is for DEBUGGING!  For anything you all feel should be "secure",
> > then just disable it entirely.
> >
> 
> Debugfs is very, very useful for, ahem, debugging.  I really think
> this is an example of why we should split lockdown into the read and
> write varieties and allow mounting and reading debugfs when only write
> is locked down.

Ok, but be sure that there are no "secrets" in those debugging files if
you really buy into the whole "lock down" mess...

Really, it's easier to just disable the whole thing.

greg k-h

Re: [PATCH 4.9 022/310] pidns: disable pid allocation if pid_ns_prepare_proc() is failed in alloc_pid()

2018-04-12 Thread Greg Kroah-Hartman

On Wed, Apr 11, 2018 at 02:27:19PM -0500, Eric W. Biederman wrote:
> Greg Kroah-Hartman  writes:
> 
> > 4.9-stable review patch.  If anyone has any objections, please let me
> > know.
> 
> No objections but if you are grabbing that one please check if you
> have its follow-on fix.
> 
> c0ee554906c3 ("pid: Handle failure to allocate the first pid in a pid namespace")
> 
> There were a few cases not handled by the fix below, that made a more
> comprehensive fix desirable.

Ok, but it looks like that commit needs to also go into 4.14.y as well,
as the original patch here, 8896c23d2ef8 ("pidns: disable pid allocation
if pid_ns_prepare_proc() is failed in alloc_pid()") showed up in 4.12.

Is that ok?

thanks,

greg k-h

Re: [PATCH] swiotlb: Fix unexpected swiotlb_alloc_coherent() failures

2018-04-12 Thread Takashi Iwai

On Thu, 12 Apr 2018 10:19:05 +0200,
Takashi Iwai wrote:
> 
> On Thu, 12 Apr 2018 10:03:56 +0200,
> Takashi Iwai wrote:
> > 
> > On Thu, 12 Apr 2018 08:02:27 +0200,
> > Christoph Hellwig wrote:
> > > 
> > > On Wed, Apr 11, 2018 at 09:28:54AM +0200, Takashi Iwai wrote:
> > > > > But we should try a GFP_DMA32 allocation first, so this is a bit
> > > > > surprising.
> > > > 
> > > > Hm, do we really try that?
> > > > Through a quick glance, dma_alloc_coherent_gfp_flags() gives GFP_DMA32
> > > > only when coherent mask <= DMA_BIT_MASK(32); in the case of iwlwifi,
> > > > it's 36bit, so GFP_DMA isn't set.
> > > 
> > > Oh, yes - it is using an odd dma mask, and amdgpu seems to use a
> > > just as odd 40-bit dma mask.
> > > 
> > > > We had a fallback allocation with GFP_DMA32 in the past, but this
> > > > seems gone long time ago along with cleanups (commit c647c3bb2d16).
> > > > 
> > > > But I haven't followed about this topic for long time, so I might have
> > > > missed obviously...
> > > 
> > > I think a fallback would be much better here rather than relying on the
> > > limited swiotlb buffer pool.  dma_direct_alloc (which in 4.17 is also
> > > used for x86) already has a GFP_DMA fallback, so extending this for
> > > GFP_DMA32 as well would seem reasonable.
> > > 
> > > Any volunteers?
> > 
> > Below is a quick attempt, totally untested.  Actually the retry with
> > GFP_DMA is superfluous for archs without it, so the first patch
> > corrects it.
> 
> Gah, scratch this, it doesn't work.  A different check is needed...

The v2 patches are below, replaced with IS_ENABLED(CONFIG_ZONE_DMA).


Takashi



0001-dma-direct-Don-t-repeat-allocation-for-no-op-GFP_DMA.patch
Description: Binary data


0002-dma-direct-Try-reallocation-with-GFP_DMA32-if-possib.patch
Description: Binary data

Re: net_dim() may use uninitialized data

2018-04-12 Thread Tal Gilboa


On 4/5/2018 4:13 PM, Geert Uytterhoeven wrote:
> Hi Tal,
>
> With gcc-4.1.2:
>
>     drivers/net/ethernet/broadcom/bcmsysport.c: In function ‘bcm_sysport_poll’:
>     include/linux/net_dim.h:354: warning: ‘curr_stats.ppms’ may be used uninitialized in this function
>     include/linux/net_dim.h:354: warning: ‘curr_stats.bpms’ may be used uninitialized in this function
>     include/linux/net_dim.h:354: warning: ‘curr_stats.epms’ may be used uninitialized in this function
>
> Indeed, ...
>
> | static inline void net_dim_calc_stats(struct net_dim_sample *start,
> | 				      struct net_dim_sample *end,
> | 				      struct net_dim_stats *curr_stats)
> | {
> | 	/* u32 holds up to 71 minutes, should be enough */
> | 	u32 delta_us = ktime_us_delta(end->time, start->time);
> | 	u32 npkts = BIT_GAP(BITS_PER_TYPE(u32), end->pkt_ctr, start->pkt_ctr);
> | 	u32 nbytes = BIT_GAP(BITS_PER_TYPE(u32), end->byte_ctr,
> | 			     start->byte_ctr);
> |
> | 	if (!delta_us)
> | 		return;
>
> ... if delta_us is zero, none of the below will be initialized ...
>
> | 	curr_stats->ppms = DIV_ROUND_UP(npkts * USEC_PER_MSEC, delta_us);
> | 	curr_stats->bpms = DIV_ROUND_UP(nbytes * USEC_PER_MSEC, delta_us);
> | 	curr_stats->epms = DIV_ROUND_UP(NET_DIM_NEVENTS * USEC_PER_MSEC,
> | 					delta_us);
> | }
> |
> | static inline void net_dim(struct net_dim *dim,
> | 			   struct net_dim_sample end_sample)
> | {
> | 	struct net_dim_stats curr_stats;
> | 	u16 nevents;
> |
> | 	switch (dim->state) {
> | 	case NET_DIM_MEASURE_IN_PROGRESS:
> | 		nevents = BIT_GAP(BITS_PER_TYPE(u16),
> | 				  end_sample.event_ctr,
> | 				  dim->start_sample.event_ctr);
> | 		if (nevents < NET_DIM_NEVENTS)
> | 			break;
> | 		net_dim_calc_stats(&dim->start_sample, &end_sample,
> | 				   &curr_stats);
>
> ... in the output parameter curr_stats ...
>
> | 		if (net_dim_decision(&curr_stats, dim)) {
>
> ... and net_dim_decision will make some funky decisions based on
> uninitialized data.
>
> What are the proper values to initialize curr_stats with?
> Alternatively, perhaps the call to net_dim_decision() should be made
> dependent on delta_us being non-zero?


First, thanks a lot for pointing this out. There are no valid values for 
initializing curr_stats. If we consider the most straightforward (all 
0s) this may result in a (big) negative delta between current and 
previous stats and a wrong decision. Any other value would make very 
little sense.
The case of !delta_us is an error flow (0 time passed or more probably 
issues when setting start and/or end times). I suggest adding a return 
value to net_dim_calc_stats() and abort the net_dim cycle if an error 
occurs.
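
As a rough illustration of that suggestion, here is a simplified userspace
sketch, not the actual net_dim code. The struct layouts and the names
calc_stats and measure_step are hypothetical stand-ins, and plain unsigned
subtraction is assumed in place of BIT_GAP().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USEC_PER_MSEC 1000U
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical, simplified stand-ins for the net_dim structures. */
struct sample { uint64_t time_us; uint32_t pkt_ctr; uint32_t byte_ctr; };
struct stats { uint32_t ppms; uint32_t bpms; };

/* Return false (and leave *out untouched) when no time has elapsed. */
static inline bool calc_stats(const struct sample *start,
			      const struct sample *end, struct stats *out)
{
	uint32_t delta_us = (uint32_t)(end->time_us - start->time_us);
	uint32_t npkts = end->pkt_ctr - start->pkt_ctr;	/* wraps like a counter */
	uint32_t nbytes = end->byte_ctr - start->byte_ctr;

	if (!delta_us)
		return false;	/* error flow: caller must not use *out */

	out->ppms = DIV_ROUND_UP(npkts * USEC_PER_MSEC, delta_us);
	out->bpms = DIV_ROUND_UP(nbytes * USEC_PER_MSEC, delta_us);
	return true;
}

/* Caller side: abort this measurement cycle instead of deciding on garbage. */
static inline bool measure_step(const struct sample *start,
				const struct sample *end, struct stats *decided)
{
	struct stats cur;

	assert(start && end && decided);
	if (!calc_stats(start, end, &cur))
		return false;	/* skip the decision entirely */
	*decided = cur;
	return true;
}
```

The point of the design is that the caller never reads the output
structure unless the helper reports success, so no initial value for
curr_stats has to be invented.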




> | 			dim->state = NET_DIM_APPLY_NEW_PROFILE;
> | 			schedule_work(&dim->work);
> | 			break;
> | 		}
> | 		/* fall through */
> | 	case NET_DIM_START_MEASURE:
> | 		dim->state = NET_DIM_MEASURE_IN_PROGRESS;
> | 		break;
> | 	case NET_DIM_APPLY_NEW_PROFILE:
> | 		break;
> | 	}
> | }
>
> Gr{oetje,eeting}s,
>
>  Geert

Re: [PATCHv4] gpio: Remove VLA from gpiolib

2018-04-12 Thread Linus Walleij

On Wed, Apr 11, 2018 at 3:03 AM, Laura Abbott  wrote:

> The new challenge is to remove VLAs from the kernel
> (see https://lkml.org/lkml/2018/3/7/621) to eventually
> turn on -Wvla.
>
> Using a kmalloc array is the easy way to fix this but kmalloc is still
> more expensive than stack allocation. Introduce a fast path with a
> fixed size stack array to cover most chip with gpios below some fixed
> amount. The slow path dynamically allocates an array to cover those
> chips with a large number of gpios.
>
> Reviewed-and-tested-by: Lukas Wunner 
> Signed-off-by: Lukas Wunner 
> Signed-off-by: Laura Abbott 
> ---
> v4: Changed some local variables to avoid coccinelle warnings. Added a
> warning if the number of GPIOs exceeds the current fast path define.
>
> Lukas, I kept your Tested-by because the changes were pretty minimal.
> Let me know if you want to run the tests again.
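
The fast-path/slow-path idea described in the commit message can be
sketched in userspace as follows. This is only an illustration of the
pattern, not the gpiolib code: malloc() stands in for kmalloc(), and the
function count_unique_lines and its parameters are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define FASTPATH_NGPIO 256
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/*
 * Count distinct in-range offsets using a bitmap of ngpio bits.
 * Chips with at most FASTPATH_NGPIO lines use a fixed-size stack array
 * (no VLA); larger chips fall back to a heap allocation.
 * Returns the count, or -1 if the slow-path allocation fails.
 */
static inline int count_unique_lines(const unsigned int *offsets, size_t n,
				     unsigned int ngpio, int *used_heap)
{
	unsigned long fastpath[BITS_TO_LONGS(FASTPATH_NGPIO)];
	unsigned long *mask;
	size_t bytes = BITS_TO_LONGS(ngpio) * sizeof(*mask);
	int count = 0;

	assert(offsets && used_heap);

	if (ngpio <= FASTPATH_NGPIO) {
		mask = fastpath;		/* fast path: stack storage */
		*used_heap = 0;
	} else {
		mask = malloc(bytes);		/* slow path: dynamic storage */
		if (!mask)
			return -1;
		*used_heap = 1;
	}
	memset(mask, 0, bytes);

	for (size_t i = 0; i < n; i++) {
		unsigned int off = offsets[i];
		unsigned long bit;

		if (off >= ngpio)
			continue;		/* ignore out-of-range offsets */
		bit = 1UL << (off % BITS_PER_LONG);
		if (!(mask[off / BITS_PER_LONG] & bit)) {
			mask[off / BITS_PER_LONG] |= bit;
			count++;
		}
	}

	if (mask != fastpath)
		free(mask);
	return count;
}
```

Whatever constant is chosen for the fast path only decides which chips
pay for the allocation; correctness does not depend on it.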

This patch is starting to look really good.

> +/*
> + * Number of GPIOs to use for the fast path in set array
> + */
> +#define FASTPATH_NGPIO 256

There is still some comment about this.

And now that I am also trying to think, I wonder about it: we
have a global ARCH_NR_GPIOS that is typically 512.
Some archs set it up.

This define is something of an abomination, in the ARM
case it comes from arch/arm/include/asm/gpio.h
where #define ARCH_NR_GPIOS CONFIG_ARCH_NR_GPIO
where the latter is a Kconfig option that is mostly 512 for
most ARM systems.

Well, ARM looks like this:

config ARCH_NR_GPIO
int
default 2048 if ARCH_SOCFPGA
default 1024 if ARCH_BRCMSTB || ARCH_SHMOBILE || ARCH_TEGRA || \
ARCH_ZYNQ
default 512 if ARCH_EXYNOS || ARCH_KEYSTONE || SOC_OMAP5 || \
SOC_DRA7XX || ARCH_S3C24XX || ARCH_S3C64XX || ARCH_S5PV210
default 416 if ARCH_SUNXI
default 392 if ARCH_U8500
default 352 if ARCH_VT8500
default 288 if ARCH_ROCKCHIP
default 264 if MACH_H4700
default 0
help
  Maximum number of GPIOs in the system.

  If unsure, leave the default value.

So if FASTPATH_NGPIO should be anything else than
ARCH_NR_GPIO this has to be established somewhere
as a floor or half or something, but I would just set it as
the same as ARCH_NR_GPIOS...

The main reason this define exists is for this function
from :

/* Convert between the old gpio_ and new gpiod_ interfaces */
struct gpio_desc *gpio_to_desc(unsigned gpio);

Nowadays that fact is a bit obscured since the variable is
only used when assigning the base (in the global GPIO
number space, which is what we want to get rid of but
sigh) in gpiochip_find_base() where it attempts to place
a newly allocated gpiochip in the higher region of this
numberspace since the embedded SoC GPIO base tends
to be 0, on old platforms.

So I don't know about this.

Can't we just use ARCH_NR_GPIOS?

Very few systems have more than 512 assigned global
GPIO numbers and those are FPGA experimental machines.

In the long run obviously I want to get rid of these defines
altogether and only allocate GPIO descriptors dynamically
so as you see I am reluctant to add new numberspace weirdness
around here.

Yours,
Linus Walleij

Re: [PATCH v13 5/6] PCI: Unify wait for link active into generic PCI

2018-04-12 Thread poza


On 2018-04-10 04:55, Keith Busch wrote:
> On Mon, Apr 09, 2018 at 10:41:53AM -0400, Oza Pawandeep wrote:
>> +/**
>> + * pcie_wait_for_link - Wait for link till it's active/inactive
>> + * @pdev: Bridge device
>> + * @active: waiting for active or inactive ?
>> + *
>> + * Use this to wait till link becomes active or inactive.
>> + */
>> +bool pcie_wait_for_link(struct pci_dev *pdev, bool active)
>> +{
>> +	int timeout = 1000;
>> +	bool ret;
>> +	u16 lnk_status;
>> +
>> +	for (;;) {
>> +		pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
>> +		ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
>> +		if (ret == active)
>> +			return true;
>> +		if (timeout <= 0)
>> +			break;
>> +		timeout -= 10;
>> +	}
>
> This is missing an msleep(10) at each iteration.


will take care.
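
For reference, the loop with the delay in place might look like the
following userspace sketch. It is not the PCI code: the register read and
the sleep are passed in as callbacks so the loop can be exercised without
hardware, and fake_link, fake_read_active and fake_sleep_ms are
hypothetical test stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

typedef bool (*link_active_fn)(void *ctx);
typedef void (*sleep_ms_fn)(void *ctx, unsigned int ms);

/* Poll until the link state equals 'active', sleeping 10 ms per iteration. */
static inline bool wait_for_link(link_active_fn read_active,
				 sleep_ms_fn sleep_ms, void *ctx, bool active)
{
	int timeout = 1000;	/* total budget in ms */

	assert(read_active && sleep_ms);
	for (;;) {
		if (read_active(ctx) == active)
			return true;
		if (timeout <= 0)
			break;
		sleep_ms(ctx, 10);	/* the delay that was missing */
		timeout -= 10;
	}
	return false;
}

/* Hypothetical stand-in for hardware: link comes up after some polls. */
struct fake_link { int polls_left; unsigned int slept_ms; };

static inline bool fake_read_active(void *ctx)
{
	struct fake_link *l = ctx;

	return l->polls_left-- <= 0;
}

static inline void fake_sleep_ms(void *ctx, unsigned int ms)
{
	struct fake_link *l = ctx;

	l->slept_ms += ms;	/* record the time instead of really sleeping */
}
```

Without the sleep, the 1000 ms budget would be consumed by 100
back-to-back register reads, so the timeout would expire almost
immediately.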




>> +
>> +	pci_info(pdev, "Data Link Layer Link Active not %s in 1000 msec\n",
>> +		 active ? "set" : "cleared");
>> +
>> +	return false;
>> +}

Re: [PATCH] ARM: omap2: Fix build when using split object directories

2018-04-12 Thread Anders Roxell

On 2018-04-11 16:15, Dave Gerlach wrote:
> The sleep33xx and sleep43xx files should not depend on a header file
> generated in drivers/memory. Remove this dependency and instead allow
> both drivers/memory and arch/arm/mach-omap2 to generate all macros
> needed in headers local to their own paths.
> 
> This fixes an issue where the build will fail when using O= to set a
> split object directory and arch/arm/mach-omap2 is built before
> drivers/memory with the following error:
> 
> .../drivers/memory/emif-asm-offsets.c:1:0: fatal error: can't open drivers/memory/emif-asm-offsets.s for writing: No such file or directory
> compilation terminated.
> 
> Fixes: 41d9d44d7258 ("ARM: OMAP2+: pm33xx-core: Add platform code needed for PM")
> Acked-by: Tony Lindgren 
> Reviewed-by: Masahiro Yamada 
> Signed-off-by: Dave Gerlach 

Tested-by: Anders Roxell 

Maybe we can remove drivers/memory/Makefile.asm-offsets and move those
changes into drivers/memory/Makefile ?

Cheers,
Anders

> ---
>  arch/arm/mach-omap2/Makefile |  6 +--
>  arch/arm/mach-omap2/pm-asm-offsets.c |  3 ++
>  arch/arm/mach-omap2/sleep33xx.S  |  1 -
>  arch/arm/mach-omap2/sleep43xx.S  |  1 -
>  drivers/memory/emif-asm-offsets.c| 72 +-
>  include/linux/ti-emif-sram.h | 75 
> 
>  6 files changed, 80 insertions(+), 78 deletions(-)
> 
> diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
> index 4603c30fef73..0d9ce58bc464 100644
> --- a/arch/arm/mach-omap2/Makefile
> +++ b/arch/arm/mach-omap2/Makefile
> @@ -243,8 +243,4 @@ arch/arm/mach-omap2/pm-asm-offsets.s: arch/arm/mach-omap2/pm-asm-offsets.c
>  include/generated/ti-pm-asm-offsets.h: arch/arm/mach-omap2/pm-asm-offsets.s FORCE
>   $(call filechk,offsets,__TI_PM_ASM_OFFSETS_H__)
>  
> -# For rule to generate ti-emif-asm-offsets.h dependency
> -include drivers/memory/Makefile.asm-offsets
> -
> -arch/arm/mach-omap2/sleep33xx.o: include/generated/ti-pm-asm-offsets.h include/generated/ti-emif-asm-offsets.h
> -arch/arm/mach-omap2/sleep43xx.o: include/generated/ti-pm-asm-offsets.h include/generated/ti-emif-asm-offsets.h
> +$(obj)/sleep33xx.o $(obj)/sleep43xx.o: include/generated/ti-pm-asm-offsets.h
> diff --git a/arch/arm/mach-omap2/pm-asm-offsets.c b/arch/arm/mach-omap2/pm-asm-offsets.c
> index 6d4392da7c11..b9846b19e5e2 100644
> --- a/arch/arm/mach-omap2/pm-asm-offsets.c
> +++ b/arch/arm/mach-omap2/pm-asm-offsets.c
> @@ -7,9 +7,12 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  int main(void)
>  {
> + ti_emif_asm_offsets();
> +
>   DEFINE(AMX3_PM_WFI_FLAGS_OFFSET,
>  offsetof(struct am33xx_pm_sram_data, wfi_flags));
>   DEFINE(AMX3_PM_L2_AUX_CTRL_VAL_OFFSET,
> diff --git a/arch/arm/mach-omap2/sleep33xx.S b/arch/arm/mach-omap2/sleep33xx.S
> index 218d79930b04..322b3bb868b4 100644
> --- a/arch/arm/mach-omap2/sleep33xx.S
> +++ b/arch/arm/mach-omap2/sleep33xx.S
> @@ -6,7 +6,6 @@
>   *   Dave Gerlach, Vaibhav Bedia
>   */
>  
> -#include 
>  #include 
>  #include 
>  #include 
> diff --git a/arch/arm/mach-omap2/sleep43xx.S b/arch/arm/mach-omap2/sleep43xx.S
> index b24be624e8b9..8903814a6677 100644
> --- a/arch/arm/mach-omap2/sleep43xx.S
> +++ b/arch/arm/mach-omap2/sleep43xx.S
> @@ -6,7 +6,6 @@
>   *   Dave Gerlach, Vaibhav Bedia
>   */
>  
> -#include 
>  #include 
>  #include 
>  #include 
> diff --git a/drivers/memory/emif-asm-offsets.c b/drivers/memory/emif-asm-offsets.c
> index 71a89d5d3efd..db8043019ec6 100644
> --- a/drivers/memory/emif-asm-offsets.c
> +++ b/drivers/memory/emif-asm-offsets.c
> @@ -16,77 +16,7 @@
>  
>  int main(void)
>  {
> - DEFINE(EMIF_SDCFG_VAL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_sdcfg_val));
> - DEFINE(EMIF_TIMING1_VAL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_timing1_val));
> - DEFINE(EMIF_TIMING2_VAL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_timing2_val));
> - DEFINE(EMIF_TIMING3_VAL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_timing3_val));
> - DEFINE(EMIF_REF_CTRL_VAL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_ref_ctrl_val));
> - DEFINE(EMIF_ZQCFG_VAL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_zqcfg_val));
> - DEFINE(EMIF_PMCR_VAL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_pmcr_val));
> - DEFINE(EMIF_PMCR_SHDW_VAL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_pmcr_shdw_val));
> - DEFINE(EMIF_RD_WR_LEVEL_RAMP_CTRL_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_rd_wr_level_ramp_ctrl));
> - DEFINE(EMIF_RD_WR_EXEC_THRESH_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_rd_wr_exec_thresh));
> - DEFINE(EMIF_COS_CONFIG_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_cos_config));
> - DEFINE(EMIF_PRIORITY_TO_COS_MAPPING_OFFSET,
> -offsetof(struct emif_regs_amx3, emif_priority_to_cos_map

Re: [PATCH v6 4/9] dt-bindings: gpio: Add gpio nodes for Actions S900 SoC

2018-04-12 Thread Linus Walleij

On Wed, Mar 28, 2018 at 7:46 PM, Manivannan Sadhasivam
 wrote:

> Add gpio nodes for Actions Semi S900 SoC.
>
> Signed-off-by: Manivannan Sadhasivam 

This should probably have Subject "add bindings" rather than "add gpio nodes"
but it's fine, I can fix it up when applying if I just get Rob's ACK
on these bindings (that look entirely uncontroversial).

Yours,
Linus Walleij

Re: [PATCH] sched/rt.c: pick and check task if double_lock_balance() unlock the rq

2018-04-12 Thread Libin (Huawei)



On 2018/4/11 18:26, Peter Zijlstra wrote:
> On Tue, Apr 10, 2018 at 06:05:46PM -0400, Steven Rostedt wrote:
>>
>> Peter,
>>
>> Going through my inbox, I stumbled across this one. And it doesn't
>> appear to be addressed.
>>
>> I think this patch is a reasonable solution.
>
> Urgh, yeah, also seem to have forgotten about it. The proposed solution
> is in fact simpler than the existing code. Also, I think deadline.c has
> the exact same problem.
>
> Zhou, could you respin and fix both?


Thanks for your reply, and I will fix the deadline.c and resend the two 
patches together.


Thanks,
Li Bin

Re: [PATCH] block: ratelimite pr_err on IO path

2018-04-12 Thread Jinpu Wang

On Wed, Apr 11, 2018 at 7:07 PM, Elliott, Robert (Persistent Memory)
 wrote:
>> -Original Message-
>> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
>> ow...@vger.kernel.org] On Behalf Of Jack Wang
>> Sent: Wednesday, April 11, 2018 8:21 AM
>> Subject: [PATCH] block: ratelimite pr_err on IO path
>>
>> From: Jack Wang 
> ...
>> - pr_err("%s: ref tag error at location %llu " \
>> -"(rcvd %u)\n", iter->disk_name,
>> -(unsigned long long)
>> -iter->seed, be32_to_cpu(pi->ref_tag));
>> + pr_err_ratelimited("%s: ref tag error at "
>> +"location %llu (rcvd %u)\n",
>
> Per process/coding-style.rst, you should keep a string like that on
> one line even if that exceeds 80 columns:
>
>   Statements longer than 80 columns will be broken into sensible chunks, unless
>   exceeding 80 columns significantly increases readability and does not hide
>   information. ... However, never break user-visible strings such as
>   printk messages, because that breaks the ability to grep for them.
>
>
Thanks Robert. The original code kept to 80 columns, so I just
followed it; I will fix it in v2.


-- 
Jack Wang
Linux Kernel Developer

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin

[PATCH] spi: imx: Update MODULE_DESCRIPTION to "SPI Controller driver"

2018-04-12 Thread wangbo

Now i.MX SPI controller can work in Slave mode.
Update MODULE_DESCRIPTION to "SPI Controller driver".

Signed-off-by: wangbo 
---
 drivers/spi/spi-imx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c
index 6f57592..a056ee8 100644
--- a/drivers/spi/spi-imx.c
+++ b/drivers/spi/spi-imx.c
@@ -1701,7 +1701,7 @@ static struct platform_driver spi_imx_driver = {
 };
 module_platform_driver(spi_imx_driver);
 
-MODULE_DESCRIPTION("SPI Master Controller driver");
+MODULE_DESCRIPTION("SPI Controller driver");
 MODULE_AUTHOR("Sascha Hauer, Pengutronix");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("platform:" DRIVER_NAME);
-- 
2.7.4

Re: [PATCH] scsi: qla2xxx: reduce the time granularity of qla2x00_eh_wait_on_command

2018-04-12 Thread jianchao.wang

Could anyone please review this?

Thanks in advance
Jianchao

On 04/10/2018 04:48 PM, Jianchao Wang wrote:
> If the cmd has not been returned after being aborted by qla2x00_eh_abort,
> we have to wait for it. However, the wait is currently at least 1000ms.
> If a lot of cmds need to be aborted, the delay could be long
> enough to lead to a panic due to e.g. hung task or ocfs2 heartbeat
> timeouts, just before scsi recovery works.
> Change the granularity to 1ms. More context switches
> would be introduced, but that should be ok as this is not a hot path.
> 
> Signed-off-by: Jianchao Wang 
> ---
>  drivers/scsi/qla2xxx/qla_os.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index 5c5dcca4..9f52ad9 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -1072,7 +1072,7 @@ qla2xxx_mqueuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd,
>  static int
>  qla2x00_eh_wait_on_command(struct scsi_cmnd *cmd)
>  {
> -#define ABORT_POLLING_PERIOD 1000
> +#define ABORT_POLLING_PERIOD 1
>  #define ABORT_WAIT_ITER  ((2 * 1000) / (ABORT_POLLING_PERIOD))
>   unsigned long wait_iter = ABORT_WAIT_ITER;
>   scsi_qla_host_t *vha = shost_priv(cmd->device->host);
>
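
The arithmetic behind the two macros can be illustrated with a small
userspace model. It assumes the wait loop sleeps one polling period and
then re-checks whether the command has been returned, which is an
assumption about the surrounding code rather than something shown in the
hunk above; abort_wait_iter and waited_ms are hypothetical names.

```c
#include <assert.h>

/* Mirrors ABORT_WAIT_ITER: number of polls within a 2 second budget. */
static inline unsigned long abort_wait_iter(unsigned long period_ms)
{
	assert(period_ms > 0);
	return (2 * 1000) / period_ms;
}

/*
 * Modelled time spent waiting when the command is returned done_ms after
 * the abort. Assumed loop shape: sleep one period, then check again.
 */
static inline unsigned long waited_ms(unsigned long period_ms,
				      unsigned long done_ms)
{
	unsigned long iter = abort_wait_iter(period_ms);
	unsigned long t = 0;

	while (t < done_ms && iter--)
		t += period_ms;
	return t;
}
```

Under this model a command that comes back after 5 ms costs a full
1000 ms with the old period but only 5 ms with the new one, while the
worst-case budget stays at 2000 ms in both cases.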

Re: [PATCH v7 4/9] dt-bindings: gpio: Add gpio nodes for Actions S900 SoC

2018-04-12 Thread Linus Walleij

On Wed, Apr 4, 2018 at 7:22 PM, Manivannan Sadhasivam
 wrote:

> Add gpio nodes for Actions Semi S900 SoC.
>
> Signed-off-by: Manivannan Sadhasivam 

Renamed from "gpio nodes" to "gpio bindings" and applied
for v4.18 with Rob's ACK.

Yours,
Linus Walleij

Re: [PATCH v7 3/9] pinctrl: actions: Add Actions S900 pinctrl driver

2018-04-12 Thread Linus Walleij

On Wed, Apr 4, 2018 at 7:22 PM, Manivannan Sadhasivam
 wrote:

> Add pinctrl driver for Actions Semi S900 SoC. The driver supports
> pinctrl, pinmux and pinconf functionalities through a range of registers
> common to both gpio driver and pinctrl driver.
>
> Pinmux functionality is available only for the pin groups while the
> pinconf functionality is available for both pin groups and individual
> pins.
>
> Signed-off-by: Manivannan Sadhasivam 

Patch applied for v4.18

GOOD WORK!

We really need to get this in so that Andreas can work on S500
patches with this as a base.

If any review comments still remain they can surely be addressed
with incremental improvement patches.

Yours,
Linus Walleij

Re: [PATCH v7 7/9] gpio: Add gpio driver for Actions OWL S900 SoC

2018-04-12 Thread Linus Walleij

On Wed, Apr 4, 2018 at 7:22 PM, Manivannan Sadhasivam
 wrote:

> Add gpio driver for Actions Semi OWL family S900 SoC. Set of registers
> controlling the gpio shares the same register range with pinctrl block.
>
> GPIO registers are organized as 6 banks and each bank controls the
> maximum of 32 gpios.
>
> Signed-off-by: Manivannan Sadhasivam 
> Reviewed-by: Andy Shevchenko 

Patch applied for v4.18.

Again: excellent work.

Yours,
Linus Walleij

Re: [PATCH v7 9/9] MAINTAINERS: Add Actions Semi S900 pinctrl and gpio entries

2018-04-12 Thread Linus Walleij

On Wed, Apr 4, 2018 at 7:22 PM, Manivannan Sadhasivam
 wrote:

> Add S900 pinctrl and gpio entries under ARCH_ACTIONS
>
> Signed-off-by: Manivannan Sadhasivam 

This should probably also go in through the ARM SoC tree.
Reviewed-by: Linus Walleij 

Again, tell me if things get stuck and I will apply it.

(It doesn't matter that the files referenced come in from different trees.
I think.)

Yours,
Linus Walleij

Re: [PATCH v7 8/9] MAINTAINERS: Add reviewer for ACTIONS platforms

2018-04-12 Thread Linus Walleij

On Wed, Apr 4, 2018 at 7:22 PM, Manivannan Sadhasivam
 wrote:

> Since I'll be working on improving support for ACTIONS platforms, adding
> myself as the reviewer.
>
> Signed-off-by: Manivannan Sadhasivam 

Andreas: are you funneling patches to ARM SoC for the
Actions stuff? Or who is?

This should go in through the ARM SoC tree anyways,
Reviewed-by: Linus Walleij 

Tell me if things get stuck and I'll apply it to the GPIO tree...

Yours,
Linus Walleij

Re: [PATCH] ARM: omap2: Fix build when using split object directories

2018-04-12 Thread Masahiro Yamada

2018-04-12 17:21 GMT+09:00 Anders Roxell :
> On 2018-04-11 16:15, Dave Gerlach wrote:
>> The sleep33xx and sleep43xx files should not depend on a header file
>> generated in drivers/memory. Remove this dependency and instead allow
>> both drivers/memory and arch/arm/mach-omap2 to generate all macros
>> needed in headers local to their own paths.
>>
>> This fixes an issue where the build will fail when using O= to set a
>> split object directory and arch/arm/mach-omap2 is built before
>> drivers/memory with the following error:
>>
>> .../drivers/memory/emif-asm-offsets.c:1:0: fatal error: can't open drivers/memory/emif-asm-offsets.s for writing: No such file or directory
>> compilation terminated.
>>
>> Fixes: 41d9d44d7258 ("ARM: OMAP2+: pm33xx-core: Add platform code needed for PM")
>> Acked-by: Tony Lindgren 
>> Reviewed-by: Masahiro Yamada 
>> Signed-off-by: Dave Gerlach 
>
> Tested-by: Anders Roxell 
>
> Maybe we can remove drivers/memory/Makefile.asm-offsets and move those
> changes into drivers/memory/Makefile ?

Agree!




-- 
Best Regards
Masahiro Yamada

Re: [PATCH v2] readv.2, io_submit.2: Document RWF_APPEND added in Linux 4.16

2018-04-12 Thread Michael Kerrisk (man-pages)

On 04/06/2018 03:51 PM, Jürg Billeter wrote:
> Signed-off-by: Jürg Billeter 

Thanks, Jürg. Patch applied.

Cheers,

Michael


> ---
> Changes since version 1:
> - Explain offset handling
> 
>  man2/io_submit.2 | 13 +
>  man2/readv.2 | 17 +
>  2 files changed, 30 insertions(+)
> 
> diff --git a/man2/io_submit.2 b/man2/io_submit.2
> index 397fd0b75..25961138a 100644
> --- a/man2/io_submit.2
> +++ b/man2/io_submit.2
> @@ -111,6 +111,19 @@ field of the
>  .I io_event
>  structure (see
>  .BR io_getevents (2)).
> +.TP
> +.BR RWF_APPEND " (since Linux 4.16)"
> +.\" commit e1fc742e14e01d84d9693c4aca4ab23da65811fb
> +Append data to the end of the file.
> +See the description of the flag of the same name in
> +.BR pwritev2 (2)
> +as well as the description of
> +.B O_APPEND
> +in
> +.BR open (2).
> +The
> +.I aio_offset
> +field is ignored. The file offset is not changed.
>  .RE
>  .TP
>  .I aio_lio_opcode
> diff --git a/man2/readv.2 b/man2/readv.2
> index b3b7b9658..9ef250e11 100644
> --- a/man2/readv.2
> +++ b/man2/readv.2
> @@ -248,6 +248,23 @@ to
>  .BR EAGAIN .
>  Currently, this flag is meaningful only for
>  .BR preadv2 ().
> +.TP
> +.BR RWF_APPEND " (since Linux 4.16)"
> +.\" commit e1fc742e14e01d84d9693c4aca4ab23da65811fb
> +Provide a per-write equivalent of the
> +.B O_APPEND
> +.BR open (2)
> +flag.
> +This flag is meaningful only for
> +.BR pwritev2 (),
> +and its effect applies only to the data range written by the system call.
> +The
> +.I offset
> +argument does not affect the write operation, the data is always appended
> +to the end of the file. However, if the
> +.I offset
> +argument is \-1, the current file offset is updated.
> +This matches the behavior when the file is opened in append mode.
>  .SH RETURN VALUE
>  On success,
>  .BR readv (),
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Re: [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks

2018-04-12 Thread Peter Zijlstra

On Thu, Apr 12, 2018 at 10:12:33AM +0800, Boqun Feng wrote:
> A trivial fix/hack would be adding local_irq_disable() and
> local_irq_enable() around srcu_lock_sync() like:
> 
>   static inline void srcu_lock_sync(struct lockdep_map *map)
>   {
>   local_irq_disable();
>   lock_map_acquire(map);
>   lock_map_release(map);
>   local_irq_enable();
>   }
> 
> However, it might be better if lockdep could provide some annotation
> API for such an empty critical section to say the grab-and-drop is
> atomic. Something like:
> 
>   /*
>* Annotate a wait point for all previous critical section to
>* go out.
>* 
>* This won't make @map a irq unsafe lock, no matter it's called
>* w/ or w/o irq disabled.
>*/
>   lock_wait_unlock(struct lockdep_map *map, ..)
> 
> And in this primitive, we do something similar to
> lock_acquire()+lock_release(). This primitive could be used elsewhere,
> as I believe we have several empty grab-and-drop critical sections for
> lockdep annotations, e.g. in start_flush_work().
> 
> Thoughts?
> 
> This certainly requires a bit more work, in the meanwhile, I will add
> another self testcase which has a srcu_read_lock() called in irq.

Yeah, I've never really bothered to clean those things up, but I don't
see any reason to stop you from doing it ;-)

As to the initial pattern with disabling IRQs, I think I've seen code
like that before, and in general performance isn't a top priority
(within reason) when you're running lockdep kernels, so I've usually let
it be.

Re: [PATCH] drm/vc4: Fix leak of the file_priv that stored the perfmon.

2018-04-12 Thread Boris Brezillon

On Mon,  9 Apr 2018 13:58:13 -0700
Eric Anholt  wrote:

> Signed-off-by: Eric Anholt 
> Fixes: 65101d8c9108 ("drm/vc4: Expose performance counters to userspace")

Reviewed-by: Boris Brezillon 

> ---
>  drivers/gpu/drm/vc4/vc4_drv.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
> index 94b99c90425a..7c95ed5c5cac 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.c
> +++ b/drivers/gpu/drm/vc4/vc4_drv.c
> @@ -130,6 +130,7 @@ static void vc4_close(struct drm_device *dev, struct drm_file *file)
>   struct vc4_file *vc4file = file->driver_priv;
>  
>   vc4_perfmon_close_file(vc4file);
> + kfree(vc4file);
>  }
>  
>  static const struct vm_operations_struct vc4_vm_ops = {

Re: [PATCH v7] Revert "PCI: hv: Use device serial number as PCI domain"

2018-04-12 Thread Lorenzo Pieralisi

On Thu, Apr 12, 2018 at 02:44:42AM +, Sridhar Pitchai wrote:
> When Linux runs as a guest VM in Hyper-V and Hyper-V adds the virtual PCI
> bus to the guest, Hyper-V always provides unique PCI domain.
> 
> commit 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")
> overrode unique domain with the serial number of the first device added to
> the virtual PCI bus.
> 
> The reason for that patch was to have a consistent and short name for the
> device, but Hyper-V doesn't provide unique serial numbers. Using non-unique
> serial numbers as domain IDs leads to duplicate device addresses, which
> causes PCI bus registration to fail.
> 
> Revert commit 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI
> domain") so we can reliably support multiple devices being assigned to
> a guest.
> 
> Fixes: 4a9b0933bdfc ("PCI: hv: Use device serial number as PCI domain")
> Signed-off-by: Sridhar Pitchai 
> Cc: sta...@vger.kernel.org

I am still not happy with this patch.

-  You do not explain at all the dependency on commit 0c195567a8f6 and
   you should because that's fundamental, if that patch is not present
   this revert breaks the kernel as per previous discussions[1].
-  You are sending this patch to all stable kernels that contain the
   commit you are fixing, including some that may not contain the commit
   above (which was merged in v4.14). You are breaking those kernels; if
   not, please explain why.

You must mark the stable kernels you want this revert to be applied to
eg:

Cc:  # v4.14+

and for kernels that do not contain the 0c195567a8f6 commit you have to
add the dependency. Please read the documentation Greg provided you in
relation to stable kernel rules.

Use:

git tag --contains

to detect in what kernel version the given commits are present.

[1] https://marc.info/?l=linux-pci&m=152158684221212&w=2

> Reviewed-by: Bjorn Helgaas 
> 
> ---
> Changes in v7:
> * fix the commit comment. [Bjorn Helgaas]
> ---
>  drivers/pci/host/pci-hyperv.c | 11 ---
>  1 file changed, 11 deletions(-)
> 
> diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> index 2faf38eab785..ac67e56e451a 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -1518,17 +1518,6 @@ static struct hv_pci_dev *new_pcichild_device(struct hv_pcibus_device *hbus,
>   get_pcichild(hpdev, hv_pcidev_ref_childlist);
>   spin_lock_irqsave(&hbus->device_list_lock, flags);
>  
> - /*
> -  * When a device is being added to the bus, we set the PCI domain
> -  * number to be the device serial number, which is non-zero and
> -  * unique on the same VM.  The serial numbers start with 1, and
> -  * increase by 1 for each device.  So device names including this
> -  * can have shorter names than based on the bus instance UUID.
> -  * Only the first device serial number is used for domain, so the
> -  * domain number will not change after the first device is added.
> -  */
> - if (list_empty(&hbus->children))
> - hbus->sysdata.domain = desc->ser;
>   list_add_tail(&hpdev->list_entry, &hbus->children);
>   spin_unlock_irqrestore(&hbus->device_list_lock, flags);
>   return hpdev;
> -- 
> 2.14.1
>

Potential problem with 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")

2018-04-12 Thread Niklas Söderlund

Hi Vincent,

I have observed issues running on linus/master from a few days back [1].
I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
by X forwarding the v4l2 test application qv4l2 over ssh and moving the
cursor around in the GUI (best test case description award...). I'm
sorry about the really bad way I trigger this but I can't do it in any
other way; I'm happy to try other methods if you have some ideas. The
symptom of the issue is a complete hang of the system for more than 30
seconds and then this information is printed in the console:

[  142.849390] INFO: rcu_sched detected stalls on CPUs/tasks:
[  142.854972]  1-...!: (1 GPs behind) idle=7a4/0/0 softirq=3214/3217 fqs=0
[  142.861976]  (detected by 0, t=8232 jiffies, g=930, c=929, q=11)
[  142.868042] Sending NMI from CPU 0 to CPUs 1:
[  142.872443] NMI backtrace for cpu 1
[  142.872452] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 
4.16.0-05506-g28aba11c1393691a #14
[  142.872455] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  142.872473] PC is at arch_cpu_idle+0x28/0x44
[  142.872484] LR is at trace_hardirqs_on_caller+0x1a4/0x1d4
[  142.872488] pc : []lr : []psr: 20070013
[  142.872491] sp : eb0b9f90  ip : eb0b9f60  fp : eb0b9f9c
[  142.872495] r10:   r9 : 413fc0f2  r8 : 4000406a
[  142.872498] r7 : c0c08478  r6 : c0c0842c  r5 : e000  r4 : 0002
[  142.872502] r3 : eb0b6ec0  r2 :   r1 : 0004  r0 : 0001
[  142.872507] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  142.872511] Control: 10c5387d  Table: 6a61406a  DAC: 0051
[  142.872516] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 
4.16.0-05506-g28aba11c1393691a #14
[  142.872519] Hardware name: Generic R8A7791 (Flattened Device Tree)
[  142.872522] Backtrace:
[  142.872534] [] (dump_backtrace) from [] 
(show_stack+0x18/0x1c)
[  142.872540]  r7:c0c81388 r6: r5:60070193 r4:c0c81388
[  142.872550] [] (show_stack) from [] 
(dump_stack+0xa4/0xd8)
[  142.872557] [] (dump_stack) from [] (show_regs+0x14/0x18)
[  142.872563]  r9:0001 r8: r7:c0c4f678 r6:eb0b9f40 r5:0001 
r4:c13e1130
[  142.872571] [] (show_regs) from [] 
(nmi_cpu_backtrace+0xfc/0x118)
[  142.872578] [] (nmi_cpu_backtrace) from [] 
(handle_IPI+0x2a8/0x320)
[  142.872583]  r7:c0c4f678 r6:eb0b9f40 r5:0007 r4:c0b75b68
[  142.872594] [] (handle_IPI) from [] 
(gic_handle_irq+0x8c/0x98)
[  142.872599]  r10: r9:eb0b8000 r8:f0803000 r7:c0c4f678 r6:eb0b9f40 
r5:c0c08a90
[  142.872602]  r4:f0802000
[  142.872608] [] (gic_handle_irq) from [] 
(__irq_svc+0x70/0x98)
[  142.872612] Exception stack(0xeb0b9f40 to 0xeb0b9f88)
[  142.872618] 9f40: 0001 0004  eb0b6ec0 0002 e000 
c0c0842c c0c08478
[  142.872624] 9f60: 4000406a 413fc0f2  eb0b9f9c eb0b9f60 eb0b9f90 
c01747a8 c01088a4
[  142.872627] 9f80: 20070013 
[  142.872632]  r9:eb0b8000 r8:4000406a r7:eb0b9f74 r6: r5:20070013 
r4:c01088a4
[  142.872642] [] (arch_cpu_idle) from [] 
(default_idle_call+0x34/0x38)
[  142.872650] [] (default_idle_call) from [] 
(do_idle+0xe0/0x134)
[  142.872656] [] (do_idle) from [] 
(cpu_startup_entry+0x20/0x24)
[  142.872660]  r7:c0c8e9d0 r6:10c0387d r5:0051 r4:0085
[  142.872667] [] (cpu_startup_entry) from [] 
(secondary_start_kernel+0x114/0x134)
[  142.872673] [] (secondary_start_kernel) from [<401026ec>] 
(0x401026ec)
[  142.872676]  r5:0051 r4:6b0a406a
[  142.873456] rcu_sched kthread starved for 8235 jiffies! g930 c929 f0x0 
RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
[  143.135040] RCU grace-period kthread stack dump:
[  143.139695] rcu_sched   I0 9  2 0x
[  143.145234] Backtrace:
[  143.147719] [] (__schedule) from [] (schedule+0x94/0xb8)
[  143.154823]  r10:c0b714c0 r9:c0c85f8a r8: r7:eb0abec4 r6:a274 
r5:
[  143.162712]  r4:e000
[  143.165273] [] (schedule) from [] 
(schedule_timeout+0x440/0x4b0)
[  143.173076]  r5: r4:eb79c4c0
[  143.176692] [] (schedule_timeout) from [] 
(rcu_gp_kthread+0x958/0x150c)
[  143.185108]  r10:c0c87274 r9: r8:c0c165b8 r7:0001 r6: 
r5:c0c16590
[  143.192997]  r4:c0c16300
[  143.195560] [] (rcu_gp_kthread) from [] 
(kthread+0x148/0x160)
[  143.203099]  r7:c0c16300
[  143.205660] [] (kthread) from [] 
(ret_from_fork+0x14/0x20)
[  143.212938] Exception stack(0xeb0abfb0 to 0xeb0abff8)
[  143.218030] bfa0:   
 
[  143.226271] bfc0:       
 
[  143.234511] bfe0:     0013 
[  143.241177]  r10: r9: r8: r7: r6: 
r5:c0145d70
[  143.249065]  r4:eb037b00

After the freeze the system becomes responsive again and I can sometimes 
trigger the hang multiple times. I tried to bisect the problem and I 
found that by reverting [2] I can no longer reproduce the issue. I can 
also no

[tip:perf/urgent] perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()

2018-04-12 Thread tip-bot for Song Liu

Commit-ID:  32e6e967fb36bf77ed99221ae3ce1909f045d8f9
Gitweb: https://git.kernel.org/tip/32e6e967fb36bf77ed99221ae3ce1909f045d8f9
Author: Song Liu 
AuthorDate: Wed, 11 Apr 2018 18:02:37 +
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:55:50 +0200

perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()

Non-root users cannot create kprobes or uprobes through the text-based
interface (kprobe_events, uprobe_events), so they should not be able
to create probes via perf_event_open() either.

Reported-by: Vince Weaver 
Signed-off-by: Song Liu 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Fixes: 33ea4b24277b ("perf/core: Implement the 'perf_uprobe' PMU")
Fixes: e12f03d7031a ("perf/core: Implement the 'perf_kprobe' PMU")
Link: http://lkml.kernel.org/r/c0b2efb5-c403-4bdb-9046-c14b3ee66...@fb.com
Signed-off-by: Ingo Molnar 
---
 kernel/events/core.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d7af82827373..2d5fe26551f8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8400,6 +8400,10 @@ static int perf_kprobe_event_init(struct perf_event *event)
 
if (event->attr.type != perf_kprobe.type)
return -ENOENT;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+
/*
 * no branch sampling for probe events
 */
@@ -8437,6 +8441,10 @@ static int perf_uprobe_event_init(struct perf_event *event)
 
if (event->attr.type != perf_uprobe.type)
return -ENOENT;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+
/*
 * no branch sampling for probe events
 */

[PATCH v2] block: ratelimite pr_err on IO path

2018-04-12 Thread Jack Wang

From: Jack Wang 

This avoids the soft lockup below:
[ 2328.328429] Call Trace:
[ 2328.328433]  vprintk_emit+0x229/0x2e0
[ 2328.328436]  ? t10_pi_type3_verify_ip+0x20/0x20
[ 2328.328437]  printk+0x52/0x6e
[ 2328.328439]  t10_pi_verify+0x9e/0xf0
[ 2328.328441]  bio_integrity_process+0x12e/0x220
[ 2328.328442]  ? t10_pi_type1_verify_crc+0x20/0x20
[ 2328.328443]  bio_integrity_verify_fn+0xde/0x140
[ 2328.328447]  process_one_work+0x13f/0x370
[ 2328.328449]  worker_thread+0x62/0x3d0
[ 2328.328450]  ? rescuer_thread+0x2f0/0x2f0
[ 2328.328452]  kthread+0x116/0x150
[ 2328.328454]  ? __kthread_parkme+0x70/0x70
[ 2328.328457]  ret_from_fork+0x35/0x40

Signed-off-by: Jack Wang 
---
v2: keep the message in same line as Robert and coding style suggested

 block/t10-pi.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/block/t10-pi.c b/block/t10-pi.c
index a98db38..6faf8c1 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -84,10 +84,11 @@ static blk_status_t t10_pi_verify(struct blk_integrity_iter *iter,
 
if (be32_to_cpu(pi->ref_tag) !=
lower_32_bits(iter->seed)) {
-   pr_err("%s: ref tag error at location %llu " \
-  "(rcvd %u)\n", iter->disk_name,
-  (unsigned long long)
-  iter->seed, be32_to_cpu(pi->ref_tag));
+   pr_err_ratelimited("%s: ref tag error at location %llu (rcvd %u)\n",
+  iter->disk_name,
+  (unsigned long long)
+  iter->seed,
+  be32_to_cpu(pi->ref_tag));
return BLK_STS_PROTECTION;
}
break;
@@ -101,10 +102,11 @@ static blk_status_t t10_pi_verify(struct blk_integrity_iter *iter,
csum = fn(iter->data_buf, iter->interval);
 
if (pi->guard_tag != csum) {
-   pr_err("%s: guard tag error at sector %llu " \
-  "(rcvd %04x, want %04x)\n", iter->disk_name,
-  (unsigned long long)iter->seed,
-  be16_to_cpu(pi->guard_tag), be16_to_cpu(csum));
+   pr_err_ratelimited("%s: guard tag error at sector %llu (rcvd %04x, want %04x)\n",
+  iter->disk_name,
+  (unsigned long long)iter->seed,
+  be16_to_cpu(pi->guard_tag),
+  be16_to_cpu(csum));
return BLK_STS_PROTECTION;
}
 
-- 
2.7.4
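
For readers unfamiliar with why rate limiting helps here, the following is a minimal userspace sketch of the interval/burst idea behind pr_err_ratelimited(). It is not the kernel implementation: struct rl_state and rl_allow() are hypothetical names, time is passed in as an abstract tick count, and the real code also handles locking and reports suppressed messages. The point is only that a flood of integrity errors produces a bounded number of printk calls per window instead of one per failed bio.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a rate-limit state: at most `burst` messages
 * are allowed in each window of `interval` ticks. */
struct rl_state {
	unsigned long interval; /* window length, in caller-defined ticks */
	unsigned int burst;     /* messages allowed per window */
	unsigned int printed;   /* messages emitted in the current window */
	unsigned int missed;    /* messages suppressed so far */
	unsigned long begin;    /* tick at which the current window opened */
	bool started;           /* false until the first call */
};

/* Return true if a message may be emitted at time `now`. */
static inline bool rl_allow(struct rl_state *rs, unsigned long now)
{
	assert(rs);

	/* An interval of 0 disables rate limiting in this model. */
	if (rs->interval == 0)
		return true;

	/* Open a new window on first use or once the old one has expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	/* Over budget: drop the message and remember that we did. */
	rs->missed++;
	return false;
}
```

With interval = 5 and burst = 10, a thousand back-to-back errors within one window result in 10 emitted messages and 990 suppressed ones, which is what keeps the verify worker from spending its time inside vprintk_emit().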

Re: [PATCH] memory-model: fix cheat sheet typo

2018-04-12 Thread Andrea Parri

On Wed, Apr 11, 2018 at 01:15:58PM +0200, Paolo Bonzini wrote:
> On 10/04/2018 23:34, Paul E. McKenney wrote:
> > Glad it helps, and I have queued it for the next merge window.  Of course,
> > if a further improvement comes to mind, please do not keep it a secret.  ;-)
> 
> Yes, there are several changes that could be included:

Thank you for looking into this and for the suggestions.


> 
> - SV could be added to the prior operation case as well?  It should be
> symmetric

Seems reasonable to me.


> 
> - The *_relaxed() case also applies to void RMW

Indeed. *_relaxed() and void RMW do present some semantic differences
(c.f., e.g., 'Noreturn' in the definition of 'rmb' from the .cat file),
but the cheat sheet seems to be already 'cheating' here. ;-)


> 
> - smp_store_mb() is missing

Good point. In fact, we could add this to the model as well: following
Peter's remark/the generic implementation,

diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
index 397e4e67e8c84..acf86f6f360a7 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -14,6 +14,7 @@ smp_store_release(X,V) { __store{release}(*X,V); }
 smp_load_acquire(X) __load{acquire}(*X)
 rcu_assign_pointer(X,V) { __store{release}(X,V); }
 rcu_dereference(X) __load{once}(X)
+smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; }
 
 // Fences
 smp_mb() { __fence{mb} ; }

... unless I'm missing something here: I'll send a patch with this.
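
As a rough userspace analogue of the definition above (store-once followed by a full fence), here is a sketch using C11 atomics. This is an assumption-laden illustration: store_mb() and publish_then_read() are hypothetical names, and C11 seq_cst fences are not identical to the LKMM notion of smp_mb(), so it should not be read as the kernel macro.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Analogue of:  smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; }
 */
static inline void store_mb(atomic_int *x, int v)
{
	/* __store{once}: a relaxed atomic store, in the spirit of WRITE_ONCE() */
	atomic_store_explicit(x, v, memory_order_relaxed);
	/* __fence{mb}: a full fence, in the spirit of smp_mb() */
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * One side of the store-buffering pattern: publish our own flag with
 * store_mb(), then look at the other side's flag.
 */
static inline int publish_then_read(atomic_int *mine, atomic_int *theirs)
{
	assert(mine != theirs);
	store_mb(mine, 1);
	return atomic_load_explicit(theirs, memory_order_relaxed);
}
```

If two threads each call publish_then_read() with the arguments swapped, the fence between the store and the load is what is expected to rule out both of them returning 0; without it, that outcome is allowed.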


> 
> - smp_rmb() orders prior reads fully against subsequent RMW because SV
> applies between the two parts of the RMW; likewise smp_wmb() orders prior
> RMW fully against subsequent writes

It could be argued that this ordering is the result of the combination
of two 'mechanisms' (barrier+SV/atomicity), and that it makes sense to
distinguish them... But either way would be fine for me.


> 
> 
> I am going to submit these changes separately, but before doing that I can
> also show my rewrite of the cheat sheet.
> 
> The advantage is that, at least to me, it's clearer (and gets rid of
> "Self" :)).
> 
> The disadvantage is that it's much longer---almost twice the lines, even if
> you discount the splitting out of cumulative/propagating to a separate table
> (which in turn is because to me it's a different level of black magic).

Yeah, those 'Ordering is cumulative', 'Ordering propagates' could mean
different things to different readers... (and I'm not going to attempt
some snappy descriptions now). IMO, we may even omit such information;
this doc. does not certainly aim for completeness, after all. OTOH, we
ought to refrain from making this doc. an excuse to transform (what it
is really) high-school maths into some black magic. ;-) So once again,
thank you for your feedback!

  Andrea


> 
> -
> Memory operations are listed in this document as follows:
> 
>   R:  Read portion of RMW
>   W:  Write portion of RMW
>   DR: Dependent read (address dependency)
>   DW: Dependent write (address, data, or control dependency)
>   RMW:Atomic read-modify-write operation
>   SV  Other accesses to the same variable
> 
> 
> Memory access operations order other memory operations against themselves as
> follows:
> 
>Prior Operation   Subsequent Operation
>---   -
>R  W  RMW  SV R  W  DR  DW  RMW  SV
>-  -  ---  -- -  -  --  --  ---  --
> Store, e.g., WRITE_ONCE()  Y Y
> Load, e.g., READ_ONCE()YY   YY
> Unsuccessful RMW operation YY   YY
> *_relaxed() or void RMW operation  YY   YY
> rcu_dereference()  YY   YY
> Successful *_acquire() Y  r  r  r   rr   Y
> Successful *_release() w  ww   Y Y
> smp_store_mb() Y  YY   Y  Y  Y   Y   Y   Y   Y
> Successful full non-void RMW   Y  YY   Y  Y  Y   Y   Y   Y   Y
> 
> Key:  Y:  Memory operation provides ordering
>   r:  Cannot move past the read portion of the *_acquire()
>   w:  Cannot move past the write portion of the *_release()
> 
> 
> Memory barriers order prior memory operations against subsequent memory
> operations.  Two operations are ordered if both have non-empty cells in
> the following table:
> 
>Prior Operation   Subsequent Operation
>---   
>R  W  RMW R  W  DR  DW  RMW
>-  -  --- -  -  --  --  ---
> smp_rmb()  Y

[GIT PULL] pwm: Changes for v4.17-rc1

2018-04-12 Thread Thierry Reding

Hi Linus,

The following changes since commit 7928b2cbe55b2a410a0f5c1f154610059c57b1b2:

  Linux 4.16-rc1 (2018-02-11 15:04:29 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm.git tags/pwm/for-4.17-rc1

for you to fetch changes up to 6873842235d678a245a378669f35e145df2441b9:

  pwm: rcar: Add suspend/resume support (2018-03-28 01:27:10 +0200)

Thanks,
Thierry


pwm: Changes for v4.17-rc1

This set of changes adds support for more generations of the RCar
controller as well as runtime PM support. The JZ4740 driver gains
support for device tree and can now be used on all Ingenic SoCs.

Rounding things off is a random assortment of fixes and cleanups
all across the board.


Alexandre Belloni (1):
  pwm: sun4i: Properly check current state

Andre Przywara (3):
  pwm: sun4i: Drop unused .has_rdy member
  pwm: sun4i: Simplify controller mapping
  dt-bindings: pwm: sunxi: Add new compatible strings

Arvind Yadav (1):
  pwm: sysfs: Use put_device() instead of kfree()

Benjamin Gaignard (1):
  pwm: stm32: Adopt SPDX identifier

Corentin Labbe (1):
  pwm: Remove depends on AVR32

Fabio Estevam (1):
  pwm: imx: Let PWM be active during suspend

Fabrice Gasnier (2):
  pwm: stm32: Remove unused struct device
  pwm: stm32: Protect common prescaler for all channels

Fabrizio Castro (2):
  dt-bindings: pwm: renesas-tpu: Document r8a774[35] support
  dt-bindings: pwm: rcar: Document r8a774[35] PWM bindings

Geert Uytterhoeven (2):
  dt-bindings: pwm: renesas-tpu: Correct example TPU register block size
  dt-bindings: pwm: renesas-tpu: Correct SoC part numbers and family names

Gerald Baeza (2):
  dt-bindings: pwm-stm32-lp: Add #pwm-cells
  pwm: stm32: LPTimer: Use 3 cells ->of_xlate()

Hien Dang (1):
  pwm: rcar: Use PM Runtime to control module clock

Maarten ter Huurne (1):
  pwm: jz4740: Make disable operation compatible with TCU2 mode

Markus Elfring (2):
  pwm: puv3: Delete an error message for a failed memory allocation
  pwm: atmel-tcb: Delete an error message for a failed memory allocation

Paul Cercueil (3):
  pwm: jz4740: Implement ->set_polarity()
  pwm: jz4740: Add support for devicetree
  pwm: jz4740: Enable for all Ingenic SoCs

Ryo Kodama (1):
  pwm: rcar: Fix a condition to prevent mismatch value setting to duty

Sean Wang (3):
  pwm: mediatek: Fix up PWM4 and PWM5 malfunction on MT7623
  pwm: mediatek: Remove redundant MODULE_ALIAS entries
  pwm: mediatek: Improve precision in rate calculation

Yoshihiro Shimoda (2):
  dt-bindings: pwm: rcar: Add bindings for R-Car M3N support
  pwm: rcar: Add suspend/resume support

 .../devicetree/bindings/pwm/ingenic,jz47xx-pwm.txt | 25 ++
 .../devicetree/bindings/pwm/pwm-stm32-lp.txt   |  3 ++
 .../devicetree/bindings/pwm/pwm-sun4i.txt  |  2 +
 .../devicetree/bindings/pwm/renesas,pwm-rcar.txt   | 11 ++--
 .../devicetree/bindings/pwm/renesas,tpu-pwm.txt| 10 ++--
 drivers/pwm/Kconfig|  8 +--
 drivers/pwm/pwm-atmel-tcb.c|  1 -
 drivers/pwm/pwm-imx.c  |  3 +-
 drivers/pwm/pwm-jz4740.c   | 41 ++-
 drivers/pwm/pwm-mediatek.c | 36 +++---
 drivers/pwm/pwm-puv3.c |  4 +-
 drivers/pwm/pwm-rcar.c | 58 +++---
 drivers/pwm/pwm-stm32-lp.c |  5 +-
 drivers/pwm/pwm-stm32.c| 22 ++--
 drivers/pwm/pwm-sun4i.c| 38 +-
 drivers/pwm/sysfs.c|  3 +-
 16 files changed, 206 insertions(+), 64 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/pwm/ingenic,jz47xx-pwm.txt

Re: [PATCH 4/5] dmaengine: sprd: Add Spreadtrum DMA configuration

2018-04-12 Thread Vinod Koul

On Wed, Apr 11, 2018 at 08:13:28PM +0800, Baolin Wang wrote:
> Hi Vinod,
> 
> On 11 April 2018 at 17:36, Vinod Koul  wrote:
> > On Tue, Apr 10, 2018 at 03:46:06PM +0800, Baolin Wang wrote:
> >
> >> +/*
> >> + * struct sprd_dma_config - DMA configuration structure
> >> + * @config: dma slave channel config
> >> + * @fragment_len: specify one fragment transfer length
> >> + * @block_len: specify one block transfer length
> >> + * @transcation_len: specify one transcation transfer length
> >> + * @wrap_ptr: wrap pointer address, once the transfer address reaches the
> >> + * 'wrap_ptr', the next transfer address will jump to the 'wrap_to' address.
> >> + * @wrap_to: wrap jump to address
> >> + * @req_mode: specify the DMA request mode
> >> + * @int_mode: specify the DMA interrupt type
> >> + */
> >> +struct sprd_dma_config {
> >> + struct dma_slave_config config;
> >> + u32 fragment_len;
> >
> > why not use _maxburst?
> 
> Yes, I can use maxburst.
> 
> >
> >> + u32 block_len;
> >> + u32 transcation_len;
> >
> > what does block and transaction len refer to here
> 
> Our DMA has 3 transfer modes: transaction transfer, block transfer and
> fragment transfer. One transaction transfer can contain several block
> transfers, and each block can be given a proper block step. One block can
> contain several fragment transfers with a proper fragment step. It can
> generate interrupts when one transaction transfer or block transfer or
> fragment transfer is completed if the user sets the interrupt type. So here
> we should set the length for transaction transfer, block transfer and
> fragment transfer.

What are the max sizes these types support?

> 
> >
> >> + phys_addr_t wrap_ptr;
> >> + phys_addr_t wrap_to;
> >
> > this sounds like sg_list to me, why are we not using that here
> 
> It is similar to sg list, but it is not one software action, we have
> hardware registers to help to jump one specified address.
> 
> >
> >> + enum sprd_dma_req_mode req_mode;
> >
> > Looking at definition of request mode we have frag, block, transaction list
> > etc.. That should depend upon dma request. If you have been asked to
> > transfer a list, you shall configure list mode. if it is a single
> > transaction then it should be transaction mode!
> 
> If I understand your points correctly, you mean we can specify the
> request mode when requesting one slave channel by
> 'dma_request_slave_channel()'. But we need change the request mode
> dynamically following different transfer task for this channel, so I
> am afraid we can not specify the request mode of this channel at
> requesting time.

Nope a channel has nothing to do with request type. You request and grab a
channel. Then you prepare a descriptor for a dma transaction. Based on
transaction requested you should intelligently break it down and create a
descriptor which uses transaction/block/fragment so that DMA throughput is
efficient. If prepare has sg list then you should use link list mode.
Further if you support max length, say 16KB and request is for 20KB you may
break it down for link list with two segments.
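
To make the 16KB/20KB example concrete, here is a small sketch of splitting one request into link-list segments that each respect a maximum length. SEG_MAX_LEN, struct seg and split_transfer() are hypothetical names used only for illustration; a real driver would build hardware descriptors in its prep callback rather than fill an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node hardware limit, taken from the 16KB example. */
#define SEG_MAX_LEN (16u * 1024u)

/* One link-list node: where it starts in the request and how long it is. */
struct seg {
	size_t offset;
	size_t len;
};

/*
 * Split a request of total_len bytes into segments of at most max_len
 * bytes. Up to `cap` segments are written to `out`; the return value is
 * the number of segments the request needs, which may exceed `cap`.
 */
static inline size_t split_transfer(size_t total_len, size_t max_len,
				    struct seg *out, size_t cap)
{
	size_t n = 0;
	size_t off = 0;

	assert(max_len > 0);

	while (off < total_len) {
		size_t len = total_len - off;

		if (len > max_len)
			len = max_len;
		if (n < cap) {
			out[n].offset = off;
			out[n].len = len;
		}
		n++;
		off += len;
	}
	return n;
}
```

A 20KB request with a 16KB limit comes out as two segments, 16KB at offset 0 and 4KB at offset 16KB, which matches the two-segment link list described above.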

Each prep call has flags associated, that can help you configure interrupt
behaviour.

Btw other dma controllers have similar capabilities and driver manages these
:)

-- 
~Vinod

Re: [PATCH] checkpatch: Add a --strict test for structs with bool member definitions

2018-04-12 Thread Andrea Parri

Hi,

On Thu, Apr 12, 2018 at 09:47:19AM +0200, Ingo Molnar wrote:
> 
> * Peter Zijlstra  wrote:
> 
> > I still have room in my /dev/null mailbox for pure checkpatch patches.
> > 
> > > (ooh, https://lkml.org/lkml/2017/11/21/384 is working this morning)
> > 
> > Yes, we really should not use lkml.org for references. Sadly google
> > displays it very prominently when you search for something.
> 
> lkml.org is nice in emails that have a short expected life time and relevance -
> but it probably shouldn't be used for permanent references such as kernel
> messages, code comments and Git log entries.

Is there a better or recommended way to reference posts on LKML in commit
messages? (I do like the idea of linking to previous discussions, results,
...)

  Andrea


> 
> Thanks,
> 
>   Ingo

[PATCH v2 0/6] MIPS: perf: MT fixes and improvements

2018-04-12 Thread Matt Redfearn

This series addresses a few issues with how the MIPS performance
counters code supports the hardware multithreading MT ASE.

Firstly, implementations of the MT ASE may implement performance counters
per core or per thread (TC). MIPS Technologies implementations signal this
via a bit in the implementation specific CONFIG7 register. Since this
register is implementation specific, checking it should be guarded by a
PRID check. This also replaces a bit defined by a magic number.

Secondly, the code currently uses vpe_id(), defined as
smp_processor_id(), divided by 2, to share core performance counters
between VPEs. This relies on a couple of assumptions of the hardware
implementation to function correctly (always 2 VPEs per core, and the
hardware reading only the least significant bit).

Finally, the method of sharing core performance counters between VPEs is
suboptimal since it does not allow one process running on a VPE to use
all of the performance counters available to it, because the kernel will
reserve half of the counters for the other VPE even if it may never use
them. This reservation at counter probe is replaced with an allocation
on use strategy.
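
The difference between the two strategies can be sketched in a few lines of plain C. This is a simplified model under stated assumptions, not the code in the series: CORE_COUNTERS, struct core_pmu, counter_alloc(), counter_free() and static_share() are hypothetical names, and the model ignores the locking or atomic bit operations that sibling VPEs would need.

```c
#include <assert.h>

/* Hypothetical number of counters implemented per core. */
#define CORE_COUNTERS 4u

/* Per-core state shared by all sibling VPEs: one bit per counter in use. */
struct core_pmu {
	unsigned int used_mask;
};

/* Static split: what each VPE gets when counters are reserved at probe. */
static inline unsigned int static_share(unsigned int vpes_per_core)
{
	assert(vpes_per_core > 0);
	return CORE_COUNTERS / vpes_per_core;
}

/* Allocate on use: take the first free counter, or return -1 if none. */
static inline int counter_alloc(struct core_pmu *core)
{
	assert(core);
	for (unsigned int i = 0; i < CORE_COUNTERS; i++) {
		if (!(core->used_mask & (1u << i))) {
			core->used_mask |= 1u << i;
			return (int)i;
		}
	}
	return -1;
}

/* Return a counter to the shared pool. */
static inline void counter_free(struct core_pmu *core, int idx)
{
	assert(core && idx >= 0 && (unsigned int)idx < CORE_COUNTERS);
	core->used_mask &= ~(1u << idx);
}
```

With two VPEs per core the static split caps each VPE at 2 counters, whereas with allocation on use a single VPE can take all 4 while its sibling is idle, and a counter it frees becomes available to either VPE.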

Tested on a MIPS Creator CI40 (2C2T MIPS InterAptiv with per-TC
counters, though for the purposes of testing the per-TC availability was
hardcoded to allow testing both paths).

Series applies to v4.16-rc7


Changes in v2:
Fix mipsxx_pmu_enable_event for !#ifdef CONFIG_MIPS_MT_SMP
- Fix !#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS build
- re-use cpuc variable in mipsxx_pmu_alloc_counter,
  mipsxx_pmu_free_counter rather than having sibling_ version.
Since BMIPS5000 does not implement per TC counters, we can remove the
check on cpu_has_mipsmt_pertccounters.
New patch to fix BMIPS5000 system mode perf.

Matt Redfearn (6):
  MIPS: perf: More robustly probe for the presence of per-tc counters
  MIPS: perf: Use correct VPE ID when setting up VPE tracing
  MIPS: perf: Fix perf with MT counting other threads
  MIPS: perf: Allocate per-core counters on demand
  MIPS: perf: Fold vpe_id() macro into it's one last usage
  MIPS: perf: Fix BMIPS5000 system mode counting

 arch/mips/include/asm/mipsregs.h |   6 +
 arch/mips/kernel/perf_event_mipsxx.c | 257 +--
 2 files changed, 158 insertions(+), 105 deletions(-)

-- 
2.7.4

[PATCH v2 1/6] MIPS: perf: More robustly probe for the presence of per-tc counters

2018-04-12 Thread Matt Redfearn

Processors implementing the MIPS MT ASE may have performance counters
implemented per core or per TC. Processors implemented by MIPS
Technologies signify presence per TC through a bit in the implementation
specific Config7 register. Currently the code which probes for their
presence blindly reads a magic number corresponding to this bit, despite
it potentially having a different meaning in the CPU implementation.

The test of Config7.PTC was previously enabled when CONFIG_BMIPS5000 was
enabled. However, according to [florian], the BMIPS5000 manual does not
define this bit, so we can assume it is 0 and the feature is not
supported.

Introduce probe_mipsmt_pertccounters() to probe for the presence of per
TC counters. This detects the ASEs implemented in the CPU, and reads any
implementation specific bit flagging their presence. In the case of MIPS
implementations, this bit is Config7.PTC. A definition of this bit is
added in mipsregs.h for MIPS Technologies. No other implementations
support this feature.

Signed-off-by: Matt Redfearn 
---

Changes in v2: None

 arch/mips/include/asm/mipsregs.h |  5 +
 arch/mips/kernel/perf_event_mipsxx.c | 29 -
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index 858752dac337..a4b02bc8 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -684,6 +684,11 @@
 #define MIPS_CONF7_IAR (_ULCAST_(1) << 10)
 #define MIPS_CONF7_AR  (_ULCAST_(1) << 16)
 
+/* Config7 Bits specific to MIPS Technologies. */
+
+/* Performance counters implemented Per TC */
+#define MTI_CONF7_PTC  (_ULCAST_(1) << 19)
+
 /* WatchLo* register definitions */
 #define MIPS_WATCHLO_IRW   (_ULCAST_(0x7) << 0)
 
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index 6668f67a61c3..f3ec4a36921d 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -1708,6 +1708,33 @@ static const struct mips_perf_event *xlp_pmu_map_raw_event(u64 config)
return &raw_event;
 }
 
+#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
+/*
+ * The MIPS MT ASE specifies that performance counters may be implemented
+ * per core or per TC. If implemented per TC then all Linux CPUs have their
+ * own unique counters. If implemented per core, then VPEs in the core must
+ * treat the counters as a shared resource.
+ * Probe for the presence of per-TC counters
+ */
+static int probe_mipsmt_pertccounters(void)
+{
+   struct cpuinfo_mips *c = &current_cpu_data;
+
+   /* Non-MT cores by definition cannot implement per-TC counters */
+   if (!cpu_has_mipsmt)
+   return 0;
+
+   switch (c->processor_id & PRID_COMP_MASK) {
+   case PRID_COMP_MIPS:
+   /* MTI implementations use CONFIG7.PTC to signify presence */
+   return read_c0_config7() & MTI_CONF7_PTC;
+   default:
+   break;
+   }
+   return 0;
+}
+#endif /* CONFIG_MIPS_PERF_SHARED_TC_COUNTERS */
+
 static int __init
 init_hw_perf_events(void)
 {
@@ -1723,7 +1750,7 @@ init_hw_perf_events(void)
}
 
 #ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
-   cpu_has_mipsmt_pertccounters = read_c0_config7() & (1<<19);
+   cpu_has_mipsmt_pertccounters = probe_mipsmt_pertccounters();
if (!cpu_has_mipsmt_pertccounters)
counters = counters_total_to_per_cpu(counters);
 #endif
-- 
2.7.4

[PATCH v2 2/6] MIPS: perf: Use correct VPE ID when setting up VPE tracing

2018-04-12 Thread Matt Redfearn

There are a couple of FIXME's in the perf code which state that
cpu_data[event->cpu].vpe_id reports 0 for both CPUs. This is no longer
the case, since the vpe_id is used extensively by SMP CPS.

VPE local counting gets around this by using smp_processor_id() instead.
As it happens this does work correctly to count events on the right VPE,
but relies on 2 assumptions:
a) Always having 2 VPEs / core.
b) The hardware only paying attention to the least significant bit of
the PERFCTL.VPEID field.
If either of these assumptions changes then events from the wrong VPE
will be counted.
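
The two assumptions can be illustrated with a small model. Everything here is hypothetical and only for illustration: it assumes linear CPU numbering (cpu = core * vpes_per_core + vpe), and hw_mask stands for whichever bits of the PERFCTL.VPEID field a given implementation actually decodes.

```c
#include <assert.h>

/* VPE index within its core, assuming linear CPU numbering. */
static inline unsigned int true_vpe(unsigned int cpu,
				    unsigned int vpes_per_core)
{
	assert(vpes_per_core > 0);
	return cpu % vpes_per_core;
}

/* Model of the hardware decoding the value written to the VPEID field. */
static inline unsigned int hw_decode_vpeid(unsigned int field,
					   unsigned int hw_mask)
{
	return field & hw_mask;
}

/* Old behaviour: the Linux CPU number is written to the field. */
static inline unsigned int counted_vpe_old(unsigned int cpu,
					   unsigned int hw_mask)
{
	return hw_decode_vpeid(cpu, hw_mask);
}

/* New behaviour: the VPE ID within the core is written to the field. */
static inline unsigned int counted_vpe_new(unsigned int cpu,
					   unsigned int vpes_per_core,
					   unsigned int hw_mask)
{
	return hw_decode_vpeid(true_vpe(cpu, vpes_per_core), hw_mask);
}
```

With 2 VPEs per core and hardware that only looks at bit 0, CPU 3 happens to select VPE 1 either way. Once the hardware decodes more bits (hw_mask = 0xf in this model), writing the CPU number selects "VPE 3", while writing the real VPE ID still selects VPE 1.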

Fix this by replacing smp_processor_id() with
cpu_vpe_id(&current_cpu_data), in the vpe_id() macro, and pass vpe_id()
to M_PERFCTL_VPEID() when setting up PERFCTL.VPEID. The FIXME's can also
be removed since they no longer apply.

Signed-off-by: Matt Redfearn 
---

Changes in v2: None

 arch/mips/kernel/perf_event_mipsxx.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index f3ec4a36921d..239c4ca89fb0 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -137,12 +137,8 @@ static DEFINE_RWLOCK(pmuint_rwlock);
 #define vpe_id()   (cpu_has_mipsmt_pertccounters ? \
 0 : (smp_processor_id() & MIPS_CPUID_TO_COUNTER_MASK))
 #else
-/*
- * FIXME: For VSMP, vpe_id() is redefined for Perf-events, because
- * cpu_data[cpuid].vpe_id reports 0 for _both_ CPUs.
- */
 #define vpe_id()   (cpu_has_mipsmt_pertccounters ? \
-0 : smp_processor_id())
+0 : cpu_vpe_id(&current_cpu_data))
 #endif
 
 /* Copied from op_model_mipsxx.c */
@@ -1279,11 +1275,7 @@ static void check_and_calc_range(struct perf_event *event,
 */
hwc->config_base |= M_TC_EN_ALL;
} else {
-   /*
-* FIXME: cpu_data[event->cpu].vpe_id reports 0
-* for both CPUs.
-*/
-   hwc->config_base |= M_PERFCTL_VPEID(event->cpu);
+   hwc->config_base |= M_PERFCTL_VPEID(vpe_id());
hwc->config_base |= M_TC_EN_VPE;
}
} else
-- 
2.7.4
