date:20180716

Re: [PATCH 1/3] ARM: dts: imx6sll-evk: enable PWM1 for backlight driver

2018-07-16 Thread Shawn Guo

On Fri, Jul 13, 2018 at 03:58:25PM +0800, Anson Huang wrote:
> Enable pwm1 module on i.MX6SLL EVK board to make
> backlight driver really work with LCD panel connected.
> 
> Signed-off-by: Anson Huang 

Applied all, thanks.

Re: [PATCH 1/3] ARM: dts: imx6sll-evk: enable PWM1 for backlight driver

2018-07-16 Thread Shawn Guo

On Fri, Jul 13, 2018 at 03:58:25PM +0800, Anson Huang wrote:
> Enable pwm1 module on i.MX6SLL EVK board to make
> backlight driver really work with LCD panel connected.
> 
> Signed-off-by: Anson Huang 

Applied all, thanks.

Re: [PATCH v5 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

2018-07-16 Thread Taniya Das


Hello Matthias,

Thanks for your review comments.

On 7/13/2018 5:49 AM, Matthias Kaehlcke wrote:

Hi,

On Thu, Jul 12, 2018 at 11:35:45PM +0530, Taniya Das wrote:

The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
for changing the frequency of CPUs. The driver implements the cpufreq
driver interface for this hardware engine.

Signed-off-by: Saravana Kannan 
Signed-off-by: Taniya Das 
---
  drivers/cpufreq/Kconfig.arm   |  10 ++
  drivers/cpufreq/Makefile  |   1 +
  drivers/cpufreq/qcom-cpufreq-hw.c | 344 ++
  3 files changed, 355 insertions(+)
  create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index 52f5f1a..141ec3e 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -312,3 +312,13 @@ config ARM_PXA2xx_CPUFREQ
  This add the CPUFreq driver support for Intel PXA2xx SOCs.

  If in doubt, say N.
+
+config ARM_QCOM_CPUFREQ_HW
+   bool "QCOM CPUFreq HW driver"
+   help
+Support for the CPUFreq HW driver.
+Some QCOM chipsets have a HW engine to offload the steps
+necessary for changing the frequency of the CPUs. Firmware loaded
+in this engine exposes a programming interafce to the High-level OS.
+The driver implements the cpufreq driver interface for this HW engine.
+Say Y if you want to support CPUFreq HW.
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index fb4a2ec..1226a3e 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -86,6 +86,7 @@ obj-$(CONFIG_ARM_TEGRA124_CPUFREQ)+= tegra124-cpufreq.o
  obj-$(CONFIG_ARM_TEGRA186_CPUFREQ)+= tegra186-cpufreq.o
  obj-$(CONFIG_ARM_TI_CPUFREQ)  += ti-cpufreq.o
  obj-$(CONFIG_ARM_VEXPRESS_SPC_CPUFREQ)+= vexpress-spc-cpufreq.o
+obj-$(CONFIG_ARM_QCOM_CPUFREQ_HW)  += qcom-cpufreq-hw.o


  
##
diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c 
b/drivers/cpufreq/qcom-cpufreq-hw.c
new file mode 100644
index 000..fa25a95
--- /dev/null
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define INIT_RATE  3UL
+#define XO_RATE1920UL
+#define LUT_MAX_ENTRIES40U
+#define CORE_COUNT_VAL(val)(((val) & (GENMASK(18, 16))) >> 16)
+#define LUT_ROW_SIZE   32
+
+enum {
+   REG_ENABLE,
+   REG_LUT_TABLE,
+   REG_PERF_STATE,
+
+   REG_ARRAY_SIZE,
+};
+
+struct cpufreq_qcom {
+   struct cpufreq_frequency_table *table;
+   struct device *dev;
+   const u16 *reg_offset;
+   void __iomem *base;
+   cpumask_t related_cpus;
+   unsigned int max_cores;


Same comment as on v4:

Why *max*_cores? This seems to be the number of CPUs in a cluster and
qcom_read_lut() expects the core count read from the LUT to match
exactly. Maybe it's the name from the datasheet? Should it still be
'num_cores' or similer?



Your understanding is correct. I would prefer to leave the naming as 
'max_cores'.



+static struct cpufreq_qcom *qcom_freq_domain_map[NR_CPUS];


It would be an option to limit this to the number of CPU clusters and
allocate it dynamically when the driver is initialized (key = first
core in the cluster). Probably not worth the hassle with the limited
number of cores though.


+static int qcom_read_lut(struct platform_device *pdev,
+struct cpufreq_qcom *c)
+{
+   struct device *dev = >dev;
+   unsigned int offset;
+   u32 data, src, lval, i, core_count, prev_cc, prev_freq, cur_freq;
+
+   c->table = devm_kcalloc(dev, LUT_MAX_ENTRIES + 1,
+   sizeof(*c->table), GFP_KERNEL);
+   if (!c->table)
+   return -ENOMEM;
+
+   offset = c->reg_offset[REG_LUT_TABLE];
+
+   for (i = 0; i < LUT_MAX_ENTRIES; i++) {
+   data = readl_relaxed(c->base + offset + i * LUT_ROW_SIZE);
+   src = ((data & GENMASK(31, 30)) >> 30);
+   lval = (data & GENMASK(7, 0));
+   core_count = CORE_COUNT_VAL(data);
+
+   if (src == 0)
+   c->table[i].frequency = INIT_RATE / 1000;
+   else
+   c->table[i].frequency = XO_RATE * lval / 1000;


You changed the condition from '!src' to 'src == 0'. My suggestion on
v4 was in part about a negative condition, but also about the
order. If it doesn't obstruct the code otherwise I think for an if-else
branch it is good practice to handle the more common case first and
then the 'exception'. I would expect most entries to have an actual
rate. Just a nit in any case, feel

Re: [PATCH v5 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

2018-07-16 Thread Taniya Das


Hello Matthias,

Thanks for your review comments.

On 7/13/2018 5:49 AM, Matthias Kaehlcke wrote:

Hi,

On Thu, Jul 12, 2018 at 11:35:45PM +0530, Taniya Das wrote:

The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
for changing the frequency of CPUs. The driver implements the cpufreq
driver interface for this hardware engine.

Signed-off-by: Saravana Kannan 
Signed-off-by: Taniya Das 
---
  drivers/cpufreq/Kconfig.arm   |  10 ++
  drivers/cpufreq/Makefile  |   1 +
  drivers/cpufreq/qcom-cpufreq-hw.c | 344 ++
  3 files changed, 355 insertions(+)
  create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index 52f5f1a..141ec3e 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -312,3 +312,13 @@ config ARM_PXA2xx_CPUFREQ
  This add the CPUFreq driver support for Intel PXA2xx SOCs.

  If in doubt, say N.
+
+config ARM_QCOM_CPUFREQ_HW
+   bool "QCOM CPUFreq HW driver"
+   help
+Support for the CPUFreq HW driver.
+Some QCOM chipsets have a HW engine to offload the steps
+necessary for changing the frequency of the CPUs. Firmware loaded
+in this engine exposes a programming interafce to the High-level OS.
+The driver implements the cpufreq driver interface for this HW engine.
+Say Y if you want to support CPUFreq HW.
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index fb4a2ec..1226a3e 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -86,6 +86,7 @@ obj-$(CONFIG_ARM_TEGRA124_CPUFREQ)+= tegra124-cpufreq.o
  obj-$(CONFIG_ARM_TEGRA186_CPUFREQ)+= tegra186-cpufreq.o
  obj-$(CONFIG_ARM_TI_CPUFREQ)  += ti-cpufreq.o
  obj-$(CONFIG_ARM_VEXPRESS_SPC_CPUFREQ)+= vexpress-spc-cpufreq.o
+obj-$(CONFIG_ARM_QCOM_CPUFREQ_HW)  += qcom-cpufreq-hw.o


  
##
diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c 
b/drivers/cpufreq/qcom-cpufreq-hw.c
new file mode 100644
index 000..fa25a95
--- /dev/null
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define INIT_RATE  3UL
+#define XO_RATE1920UL
+#define LUT_MAX_ENTRIES40U
+#define CORE_COUNT_VAL(val)(((val) & (GENMASK(18, 16))) >> 16)
+#define LUT_ROW_SIZE   32
+
+enum {
+   REG_ENABLE,
+   REG_LUT_TABLE,
+   REG_PERF_STATE,
+
+   REG_ARRAY_SIZE,
+};
+
+struct cpufreq_qcom {
+   struct cpufreq_frequency_table *table;
+   struct device *dev;
+   const u16 *reg_offset;
+   void __iomem *base;
+   cpumask_t related_cpus;
+   unsigned int max_cores;


Same comment as on v4:

Why *max*_cores? This seems to be the number of CPUs in a cluster and
qcom_read_lut() expects the core count read from the LUT to match
exactly. Maybe it's the name from the datasheet? Should it still be
'num_cores' or similer?



Your understanding is correct. I would prefer to leave the naming as 
'max_cores'.



+static struct cpufreq_qcom *qcom_freq_domain_map[NR_CPUS];


It would be an option to limit this to the number of CPU clusters and
allocate it dynamically when the driver is initialized (key = first
core in the cluster). Probably not worth the hassle with the limited
number of cores though.


+static int qcom_read_lut(struct platform_device *pdev,
+struct cpufreq_qcom *c)
+{
+   struct device *dev = >dev;
+   unsigned int offset;
+   u32 data, src, lval, i, core_count, prev_cc, prev_freq, cur_freq;
+
+   c->table = devm_kcalloc(dev, LUT_MAX_ENTRIES + 1,
+   sizeof(*c->table), GFP_KERNEL);
+   if (!c->table)
+   return -ENOMEM;
+
+   offset = c->reg_offset[REG_LUT_TABLE];
+
+   for (i = 0; i < LUT_MAX_ENTRIES; i++) {
+   data = readl_relaxed(c->base + offset + i * LUT_ROW_SIZE);
+   src = ((data & GENMASK(31, 30)) >> 30);
+   lval = (data & GENMASK(7, 0));
+   core_count = CORE_COUNT_VAL(data);
+
+   if (src == 0)
+   c->table[i].frequency = INIT_RATE / 1000;
+   else
+   c->table[i].frequency = XO_RATE * lval / 1000;


You changed the condition from '!src' to 'src == 0'. My suggestion on
v4 was in part about a negative condition, but also about the
order. If it doesn't obstruct the code otherwise I think for an if-else
branch it is good practice to handle the more common case first and
then the 'exception'. I would expect most entries to have an actual
rate. Just a nit in any case, feel

Re: [RFC PATCH 1/4] ARM: dts: am33xx: Add all fck timer clocks

2018-07-16 Thread Tony Lindgren

Hi,

* Neil Armstrong  [180716 20:42]:
> Add the missing fck clock to all timer nodes of the AM33XX dtsi.
> 
> Signed-off-by: Neil Armstrong 
> ---
>  arch/arm/boot/dts/am33xx.dtsi | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
> index 9cd62bc2..e8e665d 100644
> --- a/arch/arm/boot/dts/am33xx.dtsi
> +++ b/arch/arm/boot/dts/am33xx.dtsi
> @@ -541,6 +541,8 @@
>   reg = <0x48042000 0x400>;
>   interrupts = <69>;
>   ti,hwmods = "timer3";
> + clocks = <_fck>;
> + clock-names = "fck";
>   };

These should all use the clkctrl clock nodes, see:

$ git grep CLKCTRL include/dt-bindings/clock/ | grep TIMER

The module clock is bit 0, and on some SoCs bit 24 is the
timer fck. But I don't think these SoS use separate bit 24
clock.

Regards,

Tony

Re: [RFC PATCH 1/4] ARM: dts: am33xx: Add all fck timer clocks

2018-07-16 Thread Tony Lindgren

Hi,

* Neil Armstrong  [180716 20:42]:
> Add the missing fck clock to all timer nodes of the AM33XX dtsi.
> 
> Signed-off-by: Neil Armstrong 
> ---
>  arch/arm/boot/dts/am33xx.dtsi | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
> index 9cd62bc2..e8e665d 100644
> --- a/arch/arm/boot/dts/am33xx.dtsi
> +++ b/arch/arm/boot/dts/am33xx.dtsi
> @@ -541,6 +541,8 @@
>   reg = <0x48042000 0x400>;
>   interrupts = <69>;
>   ti,hwmods = "timer3";
> + clocks = <_fck>;
> + clock-names = "fck";
>   };

These should all use the clkctrl clock nodes, see:

$ git grep CLKCTRL include/dt-bindings/clock/ | grep TIMER

The module clock is bit 0, and on some SoCs bit 24 is the
timer fck. But I don't think these SoS use separate bit 24
clock.

Regards,

Tony

[PATCH v2 1/2] mm: fix race on soft-offlining free huge pages

2018-07-16 Thread Naoya Horiguchi

There's a race condition between soft offline and hugetlb_fault which
causes unexpected process killing and/or hugetlb allocation failure.

The process killing is caused by the following flow:

  CPU 0   CPU 1  CPU 2

  soft offline
get_any_page
// find the hugetlb is free
  mmap a hugetlb file
  page fault
...
  hugetlb_fault
hugetlb_no_page
  alloc_huge_page
  // succeed
  soft_offline_free_page
  // set hwpoison flag
 mmap the hugetlb file
 page fault
   ...
 hugetlb_fault
   hugetlb_no_page
 find_lock_page
   return VM_FAULT_HWPOISON
   mm_fault_error
 do_sigbus
 // kill the process


The hugetlb allocation failure comes from the following flow:

  CPU 0  CPU 1

 mmap a hugetlb file
 // reserve all free page but don't fault-in
  soft offline
get_any_page
// find the hugetlb is free
  soft_offline_free_page
  // set hwpoison flag
dissolve_free_huge_page
// fail because all free hugepages are reserved
 page fault
   ...
 hugetlb_fault
   hugetlb_no_page
 alloc_huge_page
   ...
 dequeue_huge_page_node_exact
 // ignore hwpoisoned hugepage
 // and finally fail due to no-mem

The root cause of this is that current soft-offline code is written
based on an assumption that PageHWPoison flag should beset at first to
avoid accessing the corrupted data.  This makes sense for memory_failure()
or hard offline, but does not for soft offline because soft offline is
about corrected (not uncorrected) error and is safe from data lost.
This patch changes soft offline semantics where it sets PageHWPoison flag
only after containment of the error page completes successfully.

Reported-by: Xishi Qiu 
Suggested-by: Xishi Qiu 
Signed-off-by: Naoya Horiguchi 
---
changelog v1->v2:
- don't use set_hwpoison_free_buddy_page() (not defined yet)
- updated comment in soft_offline_huge_page()
---
 mm/hugetlb.c| 11 +--
 mm/memory-failure.c | 24 ++--
 mm/migrate.c|  2 --
 3 files changed, 23 insertions(+), 14 deletions(-)

diff --git v4.18-rc4-mmotm-2018-07-10-16-50/mm/hugetlb.c 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/hugetlb.c
index 430be42..937c142 100644
--- v4.18-rc4-mmotm-2018-07-10-16-50/mm/hugetlb.c
+++ v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/hugetlb.c
@@ -1479,22 +1479,20 @@ static int free_pool_huge_page(struct hstate *h, 
nodemask_t *nodes_allowed,
 /*
  * Dissolve a given free hugepage into free buddy pages. This function does
  * nothing for in-use (including surplus) hugepages. Returns -EBUSY if the
- * number of free hugepages would be reduced below the number of reserved
- * hugepages.
+ * dissolution fails because a give page is not a free hugepage, or because
+ * free hugepages are fully reserved.
  */
 int dissolve_free_huge_page(struct page *page)
 {
-   int rc = 0;
+   int rc = -EBUSY;
 
spin_lock(_lock);
if (PageHuge(page) && !page_count(page)) {
struct page *head = compound_head(page);
struct hstate *h = page_hstate(head);
int nid = page_to_nid(head);
-   if (h->free_huge_pages - h->resv_huge_pages == 0) {
-   rc = -EBUSY;
+   if (h->free_huge_pages - h->resv_huge_pages == 0)
goto out;
-   }
/*
 * Move PageHWPoison flag from head page to the raw error page,
 * which makes any subpages rather than the error page reusable.
@@ -1508,6 +1506,7 @@ int dissolve_free_huge_page(struct page *page)
h->free_huge_pages_node[nid]--;
h->max_huge_pages--;
update_and_free_page(h, head);
+   rc = 0;
}
 out:
spin_unlock(_lock);
diff --git v4.18-rc4-mmotm-2018-07-10-16-50/mm/memory-failure.c 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/memory-failure.c
index 9d142b9..9b77f85 100644
---

[PATCH v2 0/2] mm: soft-offline: fix race against page allocation

2018-07-16 Thread Naoya Horiguchi

I've updated the patchset based on feedbacks:

- updated comments (from Andrew),
- moved calling set_hwpoison_free_buddy_page() from mm/migrate.c to 
mm/memory-failure.c,
  which is necessary to check the return code of set_hwpoison_free_buddy_page(),
- lkp bot reported a build error when only 1/2 is applied.

  >mm/memory-failure.c: In function 'soft_offline_huge_page':
  > >> mm/memory-failure.c:1610:8: error: implicit declaration of function
  > 'set_hwpoison_free_buddy_page'; did you mean 'is_free_buddy_page'?
  > [-Werror=implicit-function-declaration]
  >if (set_hwpoison_free_buddy_page(page))
  >^~~~
  >is_free_buddy_page
  >cc1: some warnings being treated as errors

  set_hwpoison_free_buddy_page() is defined in 2/2, so we can't use it
  in 1/2. Simply doing s/set_hwpoison_free_buddy_page/!TestSetPageHWPoison/
  will fix this.

v1: https://lkml.org/lkml/2018/7/12/968

Thanks,
Naoya Horiguchi
---
Summary:

Naoya Horiguchi (2):
  mm: fix race on soft-offlining free huge pages
  mm: soft-offline: close the race against page allocation

 include/linux/page-flags.h |  5 +
 include/linux/swapops.h| 10 -
 mm/hugetlb.c   | 11 +-
 mm/memory-failure.c| 53 ++
 mm/migrate.c   | 11 --
 mm/page_alloc.c| 30 ++
 6 files changed, 84 insertions(+), 36 deletions(-)

[PATCH v2 2/2] mm: soft-offline: close the race against page allocation

2018-07-16 Thread Naoya Horiguchi

A process can be killed with SIGBUS(BUS_MCEERR_AR) when it tries to
allocate a page that was just freed on the way of soft-offline.
This is undesirable because soft-offline (which is about corrected error)
is less aggressive than hard-offline (which is about uncorrected error),
and we can make soft-offline fail and keep using the page for good reason
like "system is busy."

Two main changes of this patch are:

- setting migrate type of the target page to MIGRATE_ISOLATE. As done
  in free_unref_page_commit(), this makes kernel bypass pcplist when
  freeing the page. So we can assume that the page is in freelist just
  after put_page() returns,

- setting PG_hwpoison on free page under zone->lock which protects
  freelists, so this allows us to avoid setting PG_hwpoison on a page
  that is decided to be allocated soon.

Reported-by: Xishi Qiu 
Signed-off-by: Naoya Horiguchi 
---
changelog v1->v2:
- updated comment on set_hwpoison_free_buddy_page(),
- moved calling set_hwpoison_free_buddy_page() from mm/migrate.c to
  mm/memory-failure.c, which is necessary to check the return code of
  set_hwpoison_free_buddy_page().
---
 include/linux/page-flags.h |  5 +
 include/linux/swapops.h| 10 --
 mm/memory-failure.c| 35 +--
 mm/migrate.c   |  9 -
 mm/page_alloc.c| 30 ++
 5 files changed, 64 insertions(+), 25 deletions(-)

diff --git v4.18-rc4-mmotm-2018-07-10-16-50/include/linux/page-flags.h 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/include/linux/page-flags.h
index 901943e..74bee8c 100644
--- v4.18-rc4-mmotm-2018-07-10-16-50/include/linux/page-flags.h
+++ v4.18-rc4-mmotm-2018-07-10-16-50_patched/include/linux/page-flags.h
@@ -369,8 +369,13 @@ PAGEFLAG_FALSE(Uncached)
 PAGEFLAG(HWPoison, hwpoison, PF_ANY)
 TESTSCFLAG(HWPoison, hwpoison, PF_ANY)
 #define __PG_HWPOISON (1UL << PG_hwpoison)
+extern bool set_hwpoison_free_buddy_page(struct page *page);
 #else
 PAGEFLAG_FALSE(HWPoison)
+static inline bool set_hwpoison_free_buddy_page(struct page *page)
+{
+   return 0;
+}
 #define __PG_HWPOISON 0
 #endif
 
diff --git v4.18-rc4-mmotm-2018-07-10-16-50/include/linux/swapops.h 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/include/linux/swapops.h
index 9c0eb4d..fe8e08b 100644
--- v4.18-rc4-mmotm-2018-07-10-16-50/include/linux/swapops.h
+++ v4.18-rc4-mmotm-2018-07-10-16-50_patched/include/linux/swapops.h
@@ -335,11 +335,6 @@ static inline int is_hwpoison_entry(swp_entry_t entry)
return swp_type(entry) == SWP_HWPOISON;
 }
 
-static inline bool test_set_page_hwpoison(struct page *page)
-{
-   return TestSetPageHWPoison(page);
-}
-
 static inline void num_poisoned_pages_inc(void)
 {
atomic_long_inc(_poisoned_pages);
@@ -362,11 +357,6 @@ static inline int is_hwpoison_entry(swp_entry_t swp)
return 0;
 }
 
-static inline bool test_set_page_hwpoison(struct page *page)
-{
-   return false;
-}
-
 static inline void num_poisoned_pages_inc(void)
 {
 }
diff --git v4.18-rc4-mmotm-2018-07-10-16-50/mm/memory-failure.c 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/memory-failure.c
index 9b77f85..936d0e7 100644
--- v4.18-rc4-mmotm-2018-07-10-16-50/mm/memory-failure.c
+++ v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/memory-failure.c
@@ -57,6 +57,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 #include "ras/ras_event.h"
 
@@ -1609,8 +1610,10 @@ static int soft_offline_huge_page(struct page *page, int 
flags)
 */
ret = dissolve_free_huge_page(page);
if (!ret) {
-   if (!TestSetPageHWPoison(page))
+   if (set_hwpoison_free_buddy_page(page))
num_poisoned_pages_inc();
+   else
+   ret = -EBUSY;
}
}
return ret;
@@ -1688,6 +1691,11 @@ static int __soft_offline_page(struct page *page, int 
flags)
pfn, ret, page->flags, >flags);
if (ret > 0)
ret = -EIO;
+   } else {
+   if (set_hwpoison_free_buddy_page(page))
+   num_poisoned_pages_inc();
+   else
+   ret = -EBUSY;
}
} else {
pr_info("soft offline: %#lx: isolation failed: %d, page count 
%d, type %lx (%pGp)\n",
@@ -1699,6 +1707,7 @@ static int __soft_offline_page(struct page *page, int 
flags)
 static int soft_offline_in_use_page(struct page *page, int flags)
 {
int ret;
+   int mt;
struct page *hpage = compound_head(page);
 
if (!PageHuge(page) && PageTransHuge(hpage)) {
@@ -1717,23 +1726,37 @@ static int soft_offline_in_use_page(struct page *page, 
int flags)
put_hwpoison_page(hpage);
}
 
+   /*
+* Setting MIGRATE_ISOLATE here ensures that the

[PATCH v2 2/2] mm: soft-offline: close the race against page allocation

2018-07-16 Thread Naoya Horiguchi

A process can be killed with SIGBUS(BUS_MCEERR_AR) when it tries to
allocate a page that was just freed on the way of soft-offline.
This is undesirable because soft-offline (which is about corrected error)
is less aggressive than hard-offline (which is about uncorrected error),
and we can make soft-offline fail and keep using the page for good reason
like "system is busy."

Two main changes of this patch are:

- setting migrate type of the target page to MIGRATE_ISOLATE. As done
  in free_unref_page_commit(), this makes kernel bypass pcplist when
  freeing the page. So we can assume that the page is in freelist just
  after put_page() returns,

- setting PG_hwpoison on free page under zone->lock which protects
  freelists, so this allows us to avoid setting PG_hwpoison on a page
  that is decided to be allocated soon.

Reported-by: Xishi Qiu 
Signed-off-by: Naoya Horiguchi 
---
changelog v1->v2:
- updated comment on set_hwpoison_free_buddy_page(),
- moved calling set_hwpoison_free_buddy_page() from mm/migrate.c to
  mm/memory-failure.c, which is necessary to check the return code of
  set_hwpoison_free_buddy_page().
---
 include/linux/page-flags.h |  5 +
 include/linux/swapops.h| 10 --
 mm/memory-failure.c| 35 +--
 mm/migrate.c   |  9 -
 mm/page_alloc.c| 30 ++
 5 files changed, 64 insertions(+), 25 deletions(-)

diff --git v4.18-rc4-mmotm-2018-07-10-16-50/include/linux/page-flags.h 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/include/linux/page-flags.h
index 901943e..74bee8c 100644
--- v4.18-rc4-mmotm-2018-07-10-16-50/include/linux/page-flags.h
+++ v4.18-rc4-mmotm-2018-07-10-16-50_patched/include/linux/page-flags.h
@@ -369,8 +369,13 @@ PAGEFLAG_FALSE(Uncached)
 PAGEFLAG(HWPoison, hwpoison, PF_ANY)
 TESTSCFLAG(HWPoison, hwpoison, PF_ANY)
 #define __PG_HWPOISON (1UL << PG_hwpoison)
+extern bool set_hwpoison_free_buddy_page(struct page *page);
 #else
 PAGEFLAG_FALSE(HWPoison)
+static inline bool set_hwpoison_free_buddy_page(struct page *page)
+{
+   return 0;
+}
 #define __PG_HWPOISON 0
 #endif
 
diff --git v4.18-rc4-mmotm-2018-07-10-16-50/include/linux/swapops.h 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/include/linux/swapops.h
index 9c0eb4d..fe8e08b 100644
--- v4.18-rc4-mmotm-2018-07-10-16-50/include/linux/swapops.h
+++ v4.18-rc4-mmotm-2018-07-10-16-50_patched/include/linux/swapops.h
@@ -335,11 +335,6 @@ static inline int is_hwpoison_entry(swp_entry_t entry)
return swp_type(entry) == SWP_HWPOISON;
 }
 
-static inline bool test_set_page_hwpoison(struct page *page)
-{
-   return TestSetPageHWPoison(page);
-}
-
 static inline void num_poisoned_pages_inc(void)
 {
atomic_long_inc(_poisoned_pages);
@@ -362,11 +357,6 @@ static inline int is_hwpoison_entry(swp_entry_t swp)
return 0;
 }
 
-static inline bool test_set_page_hwpoison(struct page *page)
-{
-   return false;
-}
-
 static inline void num_poisoned_pages_inc(void)
 {
 }
diff --git v4.18-rc4-mmotm-2018-07-10-16-50/mm/memory-failure.c 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/memory-failure.c
index 9b77f85..936d0e7 100644
--- v4.18-rc4-mmotm-2018-07-10-16-50/mm/memory-failure.c
+++ v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/memory-failure.c
@@ -57,6 +57,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 #include "ras/ras_event.h"
 
@@ -1609,8 +1610,10 @@ static int soft_offline_huge_page(struct page *page, int 
flags)
 */
ret = dissolve_free_huge_page(page);
if (!ret) {
-   if (!TestSetPageHWPoison(page))
+   if (set_hwpoison_free_buddy_page(page))
num_poisoned_pages_inc();
+   else
+   ret = -EBUSY;
}
}
return ret;
@@ -1688,6 +1691,11 @@ static int __soft_offline_page(struct page *page, int 
flags)
pfn, ret, page->flags, >flags);
if (ret > 0)
ret = -EIO;
+   } else {
+   if (set_hwpoison_free_buddy_page(page))
+   num_poisoned_pages_inc();
+   else
+   ret = -EBUSY;
}
} else {
pr_info("soft offline: %#lx: isolation failed: %d, page count 
%d, type %lx (%pGp)\n",
@@ -1699,6 +1707,7 @@ static int __soft_offline_page(struct page *page, int 
flags)
 static int soft_offline_in_use_page(struct page *page, int flags)
 {
int ret;
+   int mt;
struct page *hpage = compound_head(page);
 
if (!PageHuge(page) && PageTransHuge(hpage)) {
@@ -1717,23 +1726,37 @@ static int soft_offline_in_use_page(struct page *page, 
int flags)
put_hwpoison_page(hpage);
}
 
+   /*
+* Setting MIGRATE_ISOLATE here ensures that the

[PATCH v2 1/2] mm: fix race on soft-offlining free huge pages

2018-07-16 Thread Naoya Horiguchi

There's a race condition between soft offline and hugetlb_fault which
causes unexpected process killing and/or hugetlb allocation failure.

The process killing is caused by the following flow:

  CPU 0   CPU 1  CPU 2

  soft offline
get_any_page
// find the hugetlb is free
  mmap a hugetlb file
  page fault
...
  hugetlb_fault
hugetlb_no_page
  alloc_huge_page
  // succeed
  soft_offline_free_page
  // set hwpoison flag
 mmap the hugetlb file
 page fault
   ...
 hugetlb_fault
   hugetlb_no_page
 find_lock_page
   return VM_FAULT_HWPOISON
   mm_fault_error
 do_sigbus
 // kill the process


The hugetlb allocation failure comes from the following flow:

  CPU 0  CPU 1

 mmap a hugetlb file
 // reserve all free page but don't fault-in
  soft offline
get_any_page
// find the hugetlb is free
  soft_offline_free_page
  // set hwpoison flag
dissolve_free_huge_page
// fail because all free hugepages are reserved
 page fault
   ...
 hugetlb_fault
   hugetlb_no_page
 alloc_huge_page
   ...
 dequeue_huge_page_node_exact
 // ignore hwpoisoned hugepage
 // and finally fail due to no-mem

The root cause of this is that current soft-offline code is written
based on an assumption that PageHWPoison flag should beset at first to
avoid accessing the corrupted data.  This makes sense for memory_failure()
or hard offline, but does not for soft offline because soft offline is
about corrected (not uncorrected) error and is safe from data lost.
This patch changes soft offline semantics where it sets PageHWPoison flag
only after containment of the error page completes successfully.

Reported-by: Xishi Qiu 
Suggested-by: Xishi Qiu 
Signed-off-by: Naoya Horiguchi 
---
changelog v1->v2:
- don't use set_hwpoison_free_buddy_page() (not defined yet)
- updated comment in soft_offline_huge_page()
---
 mm/hugetlb.c| 11 +--
 mm/memory-failure.c | 24 ++--
 mm/migrate.c|  2 --
 3 files changed, 23 insertions(+), 14 deletions(-)

diff --git v4.18-rc4-mmotm-2018-07-10-16-50/mm/hugetlb.c 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/hugetlb.c
index 430be42..937c142 100644
--- v4.18-rc4-mmotm-2018-07-10-16-50/mm/hugetlb.c
+++ v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/hugetlb.c
@@ -1479,22 +1479,20 @@ static int free_pool_huge_page(struct hstate *h, 
nodemask_t *nodes_allowed,
 /*
  * Dissolve a given free hugepage into free buddy pages. This function does
  * nothing for in-use (including surplus) hugepages. Returns -EBUSY if the
- * number of free hugepages would be reduced below the number of reserved
- * hugepages.
+ * dissolution fails because a give page is not a free hugepage, or because
+ * free hugepages are fully reserved.
  */
 int dissolve_free_huge_page(struct page *page)
 {
-   int rc = 0;
+   int rc = -EBUSY;
 
spin_lock(_lock);
if (PageHuge(page) && !page_count(page)) {
struct page *head = compound_head(page);
struct hstate *h = page_hstate(head);
int nid = page_to_nid(head);
-   if (h->free_huge_pages - h->resv_huge_pages == 0) {
-   rc = -EBUSY;
+   if (h->free_huge_pages - h->resv_huge_pages == 0)
goto out;
-   }
/*
 * Move PageHWPoison flag from head page to the raw error page,
 * which makes any subpages rather than the error page reusable.
@@ -1508,6 +1506,7 @@ int dissolve_free_huge_page(struct page *page)
h->free_huge_pages_node[nid]--;
h->max_huge_pages--;
update_and_free_page(h, head);
+   rc = 0;
}
 out:
spin_unlock(_lock);
diff --git v4.18-rc4-mmotm-2018-07-10-16-50/mm/memory-failure.c 
v4.18-rc4-mmotm-2018-07-10-16-50_patched/mm/memory-failure.c
index 9d142b9..9b77f85 100644
---

[PATCH v2 0/2] mm: soft-offline: fix race against page allocation

2018-07-16 Thread Naoya Horiguchi

I've updated the patchset based on feedbacks:

- updated comments (from Andrew),
- moved calling set_hwpoison_free_buddy_page() from mm/migrate.c to 
mm/memory-failure.c,
  which is necessary to check the return code of set_hwpoison_free_buddy_page(),
- lkp bot reported a build error when only 1/2 is applied.

  >mm/memory-failure.c: In function 'soft_offline_huge_page':
  > >> mm/memory-failure.c:1610:8: error: implicit declaration of function
  > 'set_hwpoison_free_buddy_page'; did you mean 'is_free_buddy_page'?
  > [-Werror=implicit-function-declaration]
  >if (set_hwpoison_free_buddy_page(page))
  >^~~~
  >is_free_buddy_page
  >cc1: some warnings being treated as errors

  set_hwpoison_free_buddy_page() is defined in 2/2, so we can't use it
  in 1/2. Simply doing s/set_hwpoison_free_buddy_page/!TestSetPageHWPoison/
  will fix this.

v1: https://lkml.org/lkml/2018/7/12/968

Thanks,
Naoya Horiguchi
---
Summary:

Naoya Horiguchi (2):
  mm: fix race on soft-offlining free huge pages
  mm: soft-offline: close the race against page allocation

 include/linux/page-flags.h |  5 +
 include/linux/swapops.h| 10 -
 mm/hugetlb.c   | 11 +-
 mm/memory-failure.c| 53 ++
 mm/migrate.c   | 11 --
 mm/page_alloc.c| 30 ++
 6 files changed, 84 insertions(+), 36 deletions(-)

Re: [PATCH -next] ipc/sem: prevent queue.status tearing in semop

2018-07-16 Thread Davidlohr Bueso


On Mon, 16 Jul 2018, Bueso wrote:


In order for load/store tearing to work, _all_ accesses to

   ^ prevention

Re: [PATCH -next] ipc/sem: prevent queue.status tearing in semop

2018-07-16 Thread Davidlohr Bueso


On Mon, 16 Jul 2018, Bueso wrote:


In order for load/store tearing to work, _all_ accesses to

   ^ prevention

[PATCH -next] ipc/sem: prevent queue.status tearing in semop

2018-07-16 Thread Davidlohr Bueso

In order for load/store tearing to work, _all_ accesses to
the variable in question need to be done around READ and
WRITE_ONCE() macros. Ensure everyone does so for q->status
variable for semtimedop().

Signed-off-by: Davidlohr Bueso 
---
 ipc/sem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index 6cbbf34a44ac..ccab4e51d351 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2125,7 +2125,7 @@ static long do_semtimedop(int semid, struct sembuf __user 
*tsops,
}
 
do {
-   queue.status = -EINTR;
+   WRITE_ONCE(queue.status, -EINTR);
queue.sleeper = current;
 
__set_current_state(TASK_INTERRUPTIBLE);
-- 
2.16.4

[PATCH -next] ipc/sem: prevent queue.status tearing in semop

2018-07-16 Thread Davidlohr Bueso

In order for load/store tearing to work, _all_ accesses to
the variable in question need to be done around READ and
WRITE_ONCE() macros. Ensure everyone does so for q->status
variable for semtimedop().

Signed-off-by: Davidlohr Bueso 
---
 ipc/sem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index 6cbbf34a44ac..ccab4e51d351 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2125,7 +2125,7 @@ static long do_semtimedop(int semid, struct sembuf __user 
*tsops,
}
 
do {
-   queue.status = -EINTR;
+   WRITE_ONCE(queue.status, -EINTR);
queue.sleeper = current;
 
__set_current_state(TASK_INTERRUPTIBLE);
-- 
2.16.4

[PATCH 3/4] ARM: dts: imx6sll-evk: make pfuze100 sw4 always on

2018-07-16 Thread Anson Huang

On i.MX6SLL EVK board, pfuze100 sw4 supplies
LPDDR3 which is critical for system, must be
always on.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6sll-evk.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx6sll-evk.dts 
b/arch/arm/boot/dts/imx6sll-evk.dts
index dc34da5..c8e1155 100644
--- a/arch/arm/boot/dts/imx6sll-evk.dts
+++ b/arch/arm/boot/dts/imx6sll-evk.dts
@@ -177,6 +177,7 @@
sw4_reg: sw4 {
regulator-min-microvolt = <80>;
regulator-max-microvolt = <330>;
+   regulator-always-on;
};
 
swbst_reg: swbst {
-- 
2.7.4

[PATCH 3/4] ARM: dts: imx6sll-evk: make pfuze100 sw4 always on

2018-07-16 Thread Anson Huang

On i.MX6SLL EVK board, pfuze100 sw4 supplies
LPDDR3 which is critical for system, must be
always on.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6sll-evk.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx6sll-evk.dts 
b/arch/arm/boot/dts/imx6sll-evk.dts
index dc34da5..c8e1155 100644
--- a/arch/arm/boot/dts/imx6sll-evk.dts
+++ b/arch/arm/boot/dts/imx6sll-evk.dts
@@ -177,6 +177,7 @@
sw4_reg: sw4 {
regulator-min-microvolt = <80>;
regulator-max-microvolt = <330>;
+   regulator-always-on;
};
 
swbst_reg: swbst {
-- 
2.7.4

[PATCH 4/4] ARM: dts: imx6sl-evk: make pfuze100 sw4 always on

2018-07-16 Thread Anson Huang

From: Robin Gong 

On i.MX6SL EVK board, pfuze100 sw4 supplies
LPDDR2 which is critical for system, must be
always on.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6sl-evk.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx6sl-evk.dts b/arch/arm/boot/dts/imx6sl-evk.dts
index 046cfab..679b448 100644
--- a/arch/arm/boot/dts/imx6sl-evk.dts
+++ b/arch/arm/boot/dts/imx6sl-evk.dts
@@ -201,6 +201,7 @@
sw4_reg: sw4 {
regulator-min-microvolt = <80>;
regulator-max-microvolt = <330>;
+   regulator-always-on;
};
 
swbst_reg: swbst {
-- 
2.7.4

[PATCH 4/4] ARM: dts: imx6sl-evk: make pfuze100 sw4 always on

2018-07-16 Thread Anson Huang

From: Robin Gong 

On i.MX6SL EVK board, pfuze100 sw4 supplies
LPDDR2 which is critical for system, must be
always on.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6sl-evk.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx6sl-evk.dts b/arch/arm/boot/dts/imx6sl-evk.dts
index 046cfab..679b448 100644
--- a/arch/arm/boot/dts/imx6sl-evk.dts
+++ b/arch/arm/boot/dts/imx6sl-evk.dts
@@ -201,6 +201,7 @@
sw4_reg: sw4 {
regulator-min-microvolt = <80>;
regulator-max-microvolt = <330>;
+   regulator-always-on;
};
 
swbst_reg: swbst {
-- 
2.7.4

[PATCH 2/4] ARM: dts: imx6sx-sdb-reva: make pfuze100 sw4 always on

2018-07-16 Thread Anson Huang

On i.MX6SX SDB Rev-A board, pfuze100 sw4 supplies
csi, audio codec and i2c etc., these modules do NOT
implement power domain control, so pfuze100 sw4
needs to be always on to make sure these modules
work normally.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6sx-sdb-reva.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx6sx-sdb-reva.dts 
b/arch/arm/boot/dts/imx6sx-sdb-reva.dts
index e3533e7..9cc6ff2 100644
--- a/arch/arm/boot/dts/imx6sx-sdb-reva.dts
+++ b/arch/arm/boot/dts/imx6sx-sdb-reva.dts
@@ -63,6 +63,7 @@
sw4_reg: sw4 {
regulator-min-microvolt = <80>;
regulator-max-microvolt = <330>;
+   regulator-always-on;
};
 
swbst_reg: swbst {
-- 
2.7.4

[PATCH 2/4] ARM: dts: imx6sx-sdb-reva: make pfuze100 sw4 always on

2018-07-16 Thread Anson Huang

On i.MX6SX SDB Rev-A board, pfuze100 sw4 supplies
csi, audio codec and i2c etc., these modules do NOT
implement power domain control, so pfuze100 sw4
needs to be always on to make sure these modules
work normally.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6sx-sdb-reva.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx6sx-sdb-reva.dts 
b/arch/arm/boot/dts/imx6sx-sdb-reva.dts
index e3533e7..9cc6ff2 100644
--- a/arch/arm/boot/dts/imx6sx-sdb-reva.dts
+++ b/arch/arm/boot/dts/imx6sx-sdb-reva.dts
@@ -63,6 +63,7 @@
sw4_reg: sw4 {
regulator-min-microvolt = <80>;
regulator-max-microvolt = <330>;
+   regulator-always-on;
};
 
swbst_reg: swbst {
-- 
2.7.4

[PATCH 1/4] ARM: dts: imx6qdl-sabresd: make pfuze100 sw4 always on

2018-07-16 Thread Anson Huang

On i.MX6QDL Sabre-SD board, pfuze100 sw4 supplies
GPS, touch and RGMII etc., these modules do NOT
implement power domain control, so pfuze100 sw4
needs to be always on to make sure these modules
work normally.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6qdl-sabresd.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx6qdl-sabresd.dtsi 
b/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
index 15744ad..6e46a19 100644
--- a/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
@@ -341,6 +341,7 @@
sw4_reg: sw4 {
regulator-min-microvolt = <80>;
regulator-max-microvolt = <330>;
+   regulator-always-on;
};
 
swbst_reg: swbst {
-- 
2.7.4

[PATCH 1/4] ARM: dts: imx6qdl-sabresd: make pfuze100 sw4 always on

2018-07-16 Thread Anson Huang

On i.MX6QDL Sabre-SD board, pfuze100 sw4 supplies
GPS, touch and RGMII etc., these modules do NOT
implement power domain control, so pfuze100 sw4
needs to be always on to make sure these modules
work normally.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx6qdl-sabresd.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx6qdl-sabresd.dtsi 
b/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
index 15744ad..6e46a19 100644
--- a/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-sabresd.dtsi
@@ -341,6 +341,7 @@
sw4_reg: sw4 {
regulator-min-microvolt = <80>;
regulator-max-microvolt = <330>;
+   regulator-always-on;
};
 
swbst_reg: swbst {
-- 
2.7.4

Re: [PATCH RFC 1/2] mmc: sdhci: Allow platform controlled voltage switching

2018-07-16 Thread Vijay Viswanath

On 7/10/2018 4:37 PM, Adrian Hunter wrote:

On 21/06/18 15:23, Vijay Viswanath wrote:

Some controllers can have internal mechanism to inform the SW that it
is ready for voltage switching. For such controllers, changing voltage
before the HW is ready can result in various issues.

Add a quirk, which can be used by drivers of such controllers.

Signed-off-by: Vijay Viswanath 
---
  drivers/mmc/host/sdhci.c | 20 +++-
  drivers/mmc/host/sdhci.h |  2 ++
  2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 1c828e0..f0346d4 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1615,7 +1615,8 @@ void sdhci_set_power_noreg(struct sdhci_host *host, 
unsigned char mode,
  void sdhci_set_power(struct sdhci_host *host, unsigned char mode,
 unsigned short vdd)
  {
-   if (IS_ERR(host->mmc->supply.vmmc))
+   if (IS_ERR(host->mmc->supply.vmmc) ||
+   (host->quirks2 & SDHCI_QUIRK2_INTERNAL_PWR_CTL))

I think you should provide your own ->set_power() instead of this

will do

sdhci_set_power_noreg(host, mode, vdd);
else
sdhci_set_power_reg(host, mode, vdd);
@@ -2009,7 +2010,9 @@ int sdhci_start_signal_voltage_switch(struct mmc_host 
*mmc,
ctrl &= ~SDHCI_CTRL_VDD_180;
sdhci_writew(host, ctrl, SDHCI_HOST_CONTROL2);

-		if (!IS_ERR(mmc->supply.vqmmc)) {

+   if (!IS_ERR(mmc->supply.vqmmc) &&
+   !(host->quirks2 &
+   SDHCI_QUIRK2_INTERNAL_PWR_CTL)) {

And your own ->start_signal_voltage_switch()

sdhci_msm_start_signal_voltage_switch() would be an exact copy of 
sdhci_start_signal_voltage_switch(). will incorporate this if not 
using quirk.

ret = mmc_regulator_set_vqmmc(mmc, ios);
if (ret) {
pr_warn("%s: Switching to 3.3V signalling voltage 
failed\n",
@@ -2032,7 +2035,8 @@ int sdhci_start_signal_voltage_switch(struct mmc_host 
*mmc,
case MMC_SIGNAL_VOLTAGE_180:
if (!(host->flags & SDHCI_SIGNALING_180))
return -EINVAL;
-   if (!IS_ERR(mmc->supply.vqmmc)) {
+   if (!IS_ERR(mmc->supply.vqmmc) &&
+   !(host->quirks2 & SDHCI_QUIRK2_INTERNAL_PWR_CTL)) {
ret = mmc_regulator_set_vqmmc(mmc, ios);
if (ret) {
pr_warn("%s: Switching to 1.8V signalling voltage 
failed\n",
@@ -3485,7 +3489,10 @@ int sdhci_setup_host(struct sdhci_host *host)
 * the host can take the appropriate action if regulators are not
 * available.
 */
-   ret = mmc_regulator_get_supply(mmc);
+   if (!(host->quirks2 & SDHCI_QUIRK2_INTERNAL_PWR_CTL))

Since we expect mmc_regulator_get_supply() to have been called, this could be:

if (!mmc->supply.vmmc) {
ret = mmc_regulator_get_supply(mmc);
enable_vqmmc = true;
} else {
ret = 0;
}
>> +  ret = mmc_regulator_get_supply(mmc);

+   else
+   ret = 0;
if (ret)
return ret;

@@ -3736,7 +3743,10 @@ int sdhci_setup_host(struct sdhci_host *host)

  	/* If vqmmc regulator and no 1.8V signalling, then there's no UHS */

if (!IS_ERR(mmc->supply.vqmmc)) {
-   ret = regulator_enable(mmc->supply.vqmmc);
+   if (!(host->quirks2 & SDHCI_QUIRK2_INTERNAL_PWR_CTL))

And this could be:

if (enable_vqmmc)
ret = regulator_enable(mmc->supply.vqmmc);
else
ret = 0;
 > However, you still need to ensure regulator_disable(mmc->supply.vqmmc) is
only called if regulator_enable() was called.

I missed this. Will cover it.

Also I missed one more place where we are doing regulator_disable. 
During sdhci-msm unbinding, we would end up doing an extra regulator 
disable (thanks Evan for pointing it out) in sdhci_remove_host.

To avoid the quirk( or having any flag), it would require copying the 
code of sdhci_start_signal_voltage_switch() and sdhci_remove_host() and 
creating 2 new functions in sdhci_msm layer which would do the exact 
same as above, with just the regulator parts removed.

This looks messy (considering any future changes to the 2 sdhci API will 
need to be copied to their duplicate sdhci_msm API) and a bit overkill 
to avoid quirk. At the same time, I don't know how useful such a quirk 
would be to other platform drivers.

Please let me know your view/suggestions.

+   ret = regulator_enable(mmc->supply.vqmmc);
+   else
+   ret = 0;
if (!regulator_is_supported_voltage(mmc->supply.vqmmc, 170,

Re: [PATCH RFC 1/2] mmc: sdhci: Allow platform controlled voltage switching

2018-07-16 Thread Vijay Viswanath

On 7/10/2018 4:37 PM, Adrian Hunter wrote:

On 21/06/18 15:23, Vijay Viswanath wrote:

Some controllers can have internal mechanism to inform the SW that it
is ready for voltage switching. For such controllers, changing voltage
before the HW is ready can result in various issues.

Add a quirk, which can be used by drivers of such controllers.

Signed-off-by: Vijay Viswanath 
---
  drivers/mmc/host/sdhci.c | 20 +++-
  drivers/mmc/host/sdhci.h |  2 ++
  2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 1c828e0..f0346d4 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1615,7 +1615,8 @@ void sdhci_set_power_noreg(struct sdhci_host *host, 
unsigned char mode,
  void sdhci_set_power(struct sdhci_host *host, unsigned char mode,
 unsigned short vdd)
  {
-   if (IS_ERR(host->mmc->supply.vmmc))
+   if (IS_ERR(host->mmc->supply.vmmc) ||
+   (host->quirks2 & SDHCI_QUIRK2_INTERNAL_PWR_CTL))

I think you should provide your own ->set_power() instead of this

will do

sdhci_set_power_noreg(host, mode, vdd);
else
sdhci_set_power_reg(host, mode, vdd);
@@ -2009,7 +2010,9 @@ int sdhci_start_signal_voltage_switch(struct mmc_host 
*mmc,
ctrl &= ~SDHCI_CTRL_VDD_180;
sdhci_writew(host, ctrl, SDHCI_HOST_CONTROL2);

-		if (!IS_ERR(mmc->supply.vqmmc)) {

+   if (!IS_ERR(mmc->supply.vqmmc) &&
+   !(host->quirks2 &
+   SDHCI_QUIRK2_INTERNAL_PWR_CTL)) {

And your own ->start_signal_voltage_switch()

sdhci_msm_start_signal_voltage_switch() would be an exact copy of 
sdhci_start_signal_voltage_switch(). will incorporate this if not 
using quirk.

ret = mmc_regulator_set_vqmmc(mmc, ios);
if (ret) {
pr_warn("%s: Switching to 3.3V signalling voltage 
failed\n",
@@ -2032,7 +2035,8 @@ int sdhci_start_signal_voltage_switch(struct mmc_host 
*mmc,
case MMC_SIGNAL_VOLTAGE_180:
if (!(host->flags & SDHCI_SIGNALING_180))
return -EINVAL;
-   if (!IS_ERR(mmc->supply.vqmmc)) {
+   if (!IS_ERR(mmc->supply.vqmmc) &&
+   !(host->quirks2 & SDHCI_QUIRK2_INTERNAL_PWR_CTL)) {
ret = mmc_regulator_set_vqmmc(mmc, ios);
if (ret) {
pr_warn("%s: Switching to 1.8V signalling voltage 
failed\n",
@@ -3485,7 +3489,10 @@ int sdhci_setup_host(struct sdhci_host *host)
 * the host can take the appropriate action if regulators are not
 * available.
 */
-   ret = mmc_regulator_get_supply(mmc);
+   if (!(host->quirks2 & SDHCI_QUIRK2_INTERNAL_PWR_CTL))

Since we expect mmc_regulator_get_supply() to have been called, this could be:

if (!mmc->supply.vmmc) {
ret = mmc_regulator_get_supply(mmc);
enable_vqmmc = true;
} else {
ret = 0;
}
>> +  ret = mmc_regulator_get_supply(mmc);

+   else
+   ret = 0;
if (ret)
return ret;

@@ -3736,7 +3743,10 @@ int sdhci_setup_host(struct sdhci_host *host)

  	/* If vqmmc regulator and no 1.8V signalling, then there's no UHS */

if (!IS_ERR(mmc->supply.vqmmc)) {
-   ret = regulator_enable(mmc->supply.vqmmc);
+   if (!(host->quirks2 & SDHCI_QUIRK2_INTERNAL_PWR_CTL))

And this could be:

if (enable_vqmmc)
ret = regulator_enable(mmc->supply.vqmmc);
else
ret = 0;
 > However, you still need to ensure regulator_disable(mmc->supply.vqmmc) is
only called if regulator_enable() was called.

I missed this. Will cover it.

Also I missed one more place where we are doing regulator_disable. 
During sdhci-msm unbinding, we would end up doing an extra regulator 
disable (thanks Evan for pointing it out) in sdhci_remove_host.

To avoid the quirk( or having any flag), it would require copying the 
code of sdhci_start_signal_voltage_switch() and sdhci_remove_host() and 
creating 2 new functions in sdhci_msm layer which would do the exact 
same as above, with just the regulator parts removed.

This looks messy (considering any future changes to the 2 sdhci API will 
need to be copied to their duplicate sdhci_msm API) and a bit overkill 
to avoid quirk. At the same time, I don't know how useful such a quirk 
would be to other platform drivers.

Please let me know your view/suggestions.

+   ret = regulator_enable(mmc->supply.vqmmc);
+   else
+   ret = 0;
if (!regulator_is_supported_voltage(mmc->supply.vqmmc, 170,

Re: [PATCH 1/5] x86/pti: check the return value ofpti_user_pagetable_walk_p4d

2018-07-16 Thread Thomas Gleixner

On Tue, 17 Jul 2018, jiang.bi...@zte.com.cn wrote:
> > On 07/15/2018 09:03 PM, Jiang Biao wrote:
> >> diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
> >> index 4d418e7..be9e5bc 100644
> >> --- a/arch/x86/mm/pti.c
> >> +++ b/arch/x86/mm/pti.c
> >> @@ -195,8 +195,10 @@ static p4d_t *pti_user_pagetable_walk_p4d(unsigned 
> >> long address)
> >>  static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
> >>  {
> >>  gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
> >> -p4d_t *p4d = pti_user_pagetable_walk_p4d(address);
> >>  pud_t *pud;
> >> +p4d_t *p4d = pti_user_pagetable_walk_p4d(address);
> >> +if (WARN_ON(!p4d))
> >> +return NULL;
> > 
> > First of all, I don't think we need the (new) warning here.
> > pti_user_pagetable_walk_p4d() only returns NULL if you try it on a
> > userspace _address_ or the page allocation fails.  It already warns on
> > the address case.
> > 
> > If you think the allocation path needs a warning, please do it as close
> > as possible to the _source_ of the warning (the failed allocation), not
> > in the caller.
> Hi,
> Just taking  pti_clone_pmds() as reference, which add warning for NULL 
> *target_pmd*. The warning does not really matter, I will remove the 
> *WARN_ON* here and add a warning close to the failed allocation. 
> Furthermore do you think we need to check  _NULL_ return value?  
> to avoid NULL pointer dereference when the return value is used later, 
> such as,
>   p4d_large(*p4d) //p4d is return value of pti_user_pagetable_walk_p4d
>   *user_p4d = *kernel_p4d;//user_p4d is return value of 
> pti_user_pagetable_walk_p4d

The Null pointer check makes sense.

Thanks,

tglx

Re: [PATCH 1/5] x86/pti: check the return value ofpti_user_pagetable_walk_p4d

2018-07-16 Thread Thomas Gleixner

On Tue, 17 Jul 2018, jiang.bi...@zte.com.cn wrote:
> > On 07/15/2018 09:03 PM, Jiang Biao wrote:
> >> diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
> >> index 4d418e7..be9e5bc 100644
> >> --- a/arch/x86/mm/pti.c
> >> +++ b/arch/x86/mm/pti.c
> >> @@ -195,8 +195,10 @@ static p4d_t *pti_user_pagetable_walk_p4d(unsigned 
> >> long address)
> >>  static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
> >>  {
> >>  gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
> >> -p4d_t *p4d = pti_user_pagetable_walk_p4d(address);
> >>  pud_t *pud;
> >> +p4d_t *p4d = pti_user_pagetable_walk_p4d(address);
> >> +if (WARN_ON(!p4d))
> >> +return NULL;
> > 
> > First of all, I don't think we need the (new) warning here.
> > pti_user_pagetable_walk_p4d() only returns NULL if you try it on a
> > userspace _address_ or the page allocation fails.  It already warns on
> > the address case.
> > 
> > If you think the allocation path needs a warning, please do it as close
> > as possible to the _source_ of the warning (the failed allocation), not
> > in the caller.
> Hi,
> Just taking  pti_clone_pmds() as reference, which add warning for NULL 
> *target_pmd*. The warning does not really matter, I will remove the 
> *WARN_ON* here and add a warning close to the failed allocation. 
> Furthermore do you think we need to check  _NULL_ return value?  
> to avoid NULL pointer dereference when the return value is used later, 
> such as,
>   p4d_large(*p4d) //p4d is return value of pti_user_pagetable_walk_p4d
>   *user_p4d = *kernel_p4d;//user_p4d is return value of 
> pti_user_pagetable_walk_p4d

The Null pointer check makes sense.

Thanks,

tglx

Re: [RFC PATCH v2 1/4] dt-bindings: misc: Add bindings for misc. BMC control fields

2018-07-16 Thread Benjamin Herrenschmidt

On Mon, 2018-07-16 at 07:55 -0600, Rob Herring wrote:
> If that data is one set per SoC, then i'm not that concerned having
> platform-specific data in the driver. That doesn't mean the driver is
> not "generic". It's still not clear to me in this thread, how much of
> this is board specific, but given that you've placed all the data in
> an SoC dtsi file it seems to be all per SoC.

So Rob, I think that's precisely where the disconnect is.

I think we all (well hopefully) agree that those few tunables don't fit
in any existing subystem and aren't likely to ever do (famous last
words...).

Where we disagree is we want to make this parametrized via the DT, and
you want us to hard wire the list in some kind of SoC driver for a
given SoC family/version.

The reason I think hard wiring the list in the driver is not a great
solution is that that list in itself is prone to variations, possibly
fairly often, between boards, vendors, versions of boards, etc...

We can't know for sure every SoC tunable (out of the gazillions in
those chips) are going to be needed for a given system. We know which
ones we do use for ours, and that's a couple of handfuls, but it could
be that Dell need a slightly different set, and so might Yadro, or so
might our next board revision for that matter.

Now, updating the device-tree in the board flash with whatever vendor
specific information is needed is a LOT easier than getting the kernel
driver constantly updated. The device-tree after all is there to
reflect among other things system specific ways in which the SoC is
wired and configured. This is rather close...

Cheers,
Ben.

Re: [RFC PATCH v2 1/4] dt-bindings: misc: Add bindings for misc. BMC control fields

2018-07-16 Thread Benjamin Herrenschmidt

On Mon, 2018-07-16 at 07:55 -0600, Rob Herring wrote:
> If that data is one set per SoC, then i'm not that concerned having
> platform-specific data in the driver. That doesn't mean the driver is
> not "generic". It's still not clear to me in this thread, how much of
> this is board specific, but given that you've placed all the data in
> an SoC dtsi file it seems to be all per SoC.

So Rob, I think that's precisely where the disconnect is.

I think we all (well hopefully) agree that those few tunables don't fit
in any existing subystem and aren't likely to ever do (famous last
words...).

Where we disagree is we want to make this parametrized via the DT, and
you want us to hard wire the list in some kind of SoC driver for a
given SoC family/version.

The reason I think hard wiring the list in the driver is not a great
solution is that that list in itself is prone to variations, possibly
fairly often, between boards, vendors, versions of boards, etc...

We can't know for sure every SoC tunable (out of the gazillions in
those chips) are going to be needed for a given system. We know which
ones we do use for ours, and that's a couple of handfuls, but it could
be that Dell need a slightly different set, and so might Yadro, or so
might our next board revision for that matter.

Now, updating the device-tree in the board flash with whatever vendor
specific information is needed is a LOT easier than getting the kernel
driver constantly updated. The device-tree after all is there to
reflect among other things system specific ways in which the SoC is
wired and configured. This is rather close...

Cheers,
Ben.

linux-next: build failure after merge of the integrity tree

2018-07-16 Thread Stephen Rothwell

Hi all,

After merging the integrity tree, today's linux-next build (x86_64
allmodconfig) failed like this:

security/integrity/ima/ima_main.c:549:5: error: redefinition of 'ima_load_data'
 int ima_load_data(enum kernel_load_data_id id)
 ^
security/integrity/ima/ima_main.c:506:5: note: previous definition of 
'ima_load_data' was here
 int ima_load_data(enum kernel_load_data_id id)
 ^

Caused by commit

  4995c7ac4490 ("ima: based on policy require signed kexec kernel images")

interacting with commit

  16c267aac86b ("ima: based on policy require signed kexec kernel images")

from the sceurity tree.

Basically the same set of patches are in the security tree and the
integrity tree as different commits.  The merge managed to add this
function twice.  I have added a patche to remove one copy of the function.

Please clean up the integrity tree relative to the security tree.

-- 
Cheers,
Stephen Rothwell


pgpcthCkrvoFR.pgp
Description: OpenPGP digital signature

linux-next: build failure after merge of the integrity tree

2018-07-16 Thread Stephen Rothwell

Hi all,

After merging the integrity tree, today's linux-next build (x86_64
allmodconfig) failed like this:

security/integrity/ima/ima_main.c:549:5: error: redefinition of 'ima_load_data'
 int ima_load_data(enum kernel_load_data_id id)
 ^
security/integrity/ima/ima_main.c:506:5: note: previous definition of 
'ima_load_data' was here
 int ima_load_data(enum kernel_load_data_id id)
 ^

Caused by commit

  4995c7ac4490 ("ima: based on policy require signed kexec kernel images")

interacting with commit

  16c267aac86b ("ima: based on policy require signed kexec kernel images")

from the sceurity tree.

Basically the same set of patches are in the security tree and the
integrity tree as different commits.  The merge managed to add this
function twice.  I have added a patche to remove one copy of the function.

Please clean up the integrity tree relative to the security tree.

-- 
Cheers,
Stephen Rothwell


pgpcthCkrvoFR.pgp
Description: OpenPGP digital signature

[PATCH v2] spi: spi-fsl-dspi: Fill actual_length when doing DMA transfer

2018-07-16 Thread Andrey Smirnov

Upper layer users of SPI device drivers may rely on 'actual_length',
so it is important that information is correctly reported. One such
example is spi_mem_exec_op() function that will fail if
'actual_length' of the data transferred is not what was requested. Add
necessary code to populate 'actual_length.

Cc: Mark Brown 
Cc: Sanchayan Maity 
Cc: Stefan Agner 
Cc: cphe...@gmail.com
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andrey Smirnov 
---

Changes since [v1]

- Patch rebase on for-next branch of SPI sybsytem's git tree

[v1] lkml.kernel.org/r/20180716062508.7726-1-andrew.smir...@gmail.com

 drivers/spi/spi-fsl-dspi.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 89a1e7a4fe5d..9e598642ca66 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -358,6 +358,7 @@ static int dspi_dma_xfer(struct fsl_dspi *dspi)
 {
struct fsl_dspi_dma *dma = dspi->dma;
struct device *dev = >pdev->dev;
+   struct spi_message *message = dspi->cur_msg;
int curr_remaining_bytes;
int bytes_per_buffer;
int ret = 0;
@@ -377,8 +378,10 @@ static int dspi_dma_xfer(struct fsl_dspi *dspi)
goto exit;
 
} else {
-   curr_remaining_bytes -= dma->curr_xfer_len
-   * dspi->bytes_per_word;
+   const int len =
+   dma->curr_xfer_len * dspi->bytes_per_word;
+   curr_remaining_bytes -= len;
+   message->actual_length += len;
if (curr_remaining_bytes < 0)
curr_remaining_bytes = 0;
}
-- 
2.17.1

[PATCH v2] spi: spi-fsl-dspi: Fill actual_length when doing DMA transfer

2018-07-16 Thread Andrey Smirnov

Upper layer users of SPI device drivers may rely on 'actual_length',
so it is important that information is correctly reported. One such
example is spi_mem_exec_op() function that will fail if
'actual_length' of the data transferred is not what was requested. Add
necessary code to populate 'actual_length.

Cc: Mark Brown 
Cc: Sanchayan Maity 
Cc: Stefan Agner 
Cc: cphe...@gmail.com
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andrey Smirnov 
---

Changes since [v1]

- Patch rebase on for-next branch of SPI sybsytem's git tree

[v1] lkml.kernel.org/r/20180716062508.7726-1-andrew.smir...@gmail.com

 drivers/spi/spi-fsl-dspi.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 89a1e7a4fe5d..9e598642ca66 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -358,6 +358,7 @@ static int dspi_dma_xfer(struct fsl_dspi *dspi)
 {
struct fsl_dspi_dma *dma = dspi->dma;
struct device *dev = >pdev->dev;
+   struct spi_message *message = dspi->cur_msg;
int curr_remaining_bytes;
int bytes_per_buffer;
int ret = 0;
@@ -377,8 +378,10 @@ static int dspi_dma_xfer(struct fsl_dspi *dspi)
goto exit;
 
} else {
-   curr_remaining_bytes -= dma->curr_xfer_len
-   * dspi->bytes_per_word;
+   const int len =
+   dma->curr_xfer_len * dspi->bytes_per_word;
+   curr_remaining_bytes -= len;
+   message->actual_length += len;
if (curr_remaining_bytes < 0)
curr_remaining_bytes = 0;
}
-- 
2.17.1

[lkp-robot] [afs] 5b86d4ff5d: BUG:unable_to_handle_kernel

2018-07-16 Thread kernel test robot



FYI, we noticed the following commit (built with gcc-7):

commit: 5b86d4ff5dce3271dff54119e06174dc22422903 ("afs: Implement network 
namespacing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -m 512M

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


+--+++
|  | 1588def91d 
| 5b86d4ff5d |
+--+++
| boot_successes   | 31 
| 6  |
| boot_failures| 4  
| 35 |
| invoked_oom-killer:gfp_mask=0x   | 4  
| 2  |
| Mem-Info | 4  
| 2  |
| Out_of_memory:Kill_process   | 2  
| 2  |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 2  
| 2  |
| RIP:copy_fpstate_to_sigframe | 0  
| 1  |
| BUG:unable_to_handle_kernel  | 0  
| 33 |
| Oops:#[##]   | 0  
| 33 |
| RIP:rxrpc_rcu_destroy_call   | 0  
| 33 |
| Kernel_panic-not_syncing:Fatal_exception_in_interrupt| 0  
| 33 |
+--+++



[   20.610136] BUG: unable to handle kernel paging request at 88de4068
[   20.610841] PGD 40da067 P4D 40da067 PUD 40db067 PMD 1bffa067 PTE 
80de4060
[   20.611550] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[   20.612039] Modules linked in: bochs_drm ttm drm_kms_helper evdev serio_raw 
drm drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea fb_sys_fops 
syscopyarea sysfillrect sysimgblt fb intel_agp intel_gtt button fbdev agpgart
[   20.613955] CPU: 0 PID: 12 Comm: rcuc/0 Not tainted 
4.17.0-rc5-00051-g5b86d4f #1
[   20.614655] RIP: 0010:rxrpc_rcu_destroy_call+0x39/0x50
[   20.615136] RSP: 0018:c912be08 EFLAGS: 00010286
[   20.615635] RAX: 8200 RBX: 88d59700 RCX: 0201
[   20.616304] RDX: 8201 RSI: 9f670a10 RDI: 
[   20.627094] RBP: 88de4000 R08: 686a9aac R09: 0001
[   20.627813] R10: 0001 R11:  R12: 88001bddfa00
[   20.628474] R13: 88001bddfa38 R14: 88d59700 R15: 8110333a
[   20.629137] FS:  () GS:88001bc0() 
knlGS:
[   20.629861] CS:  0010 DS:  ES:  CR0: 80050033
[   20.630383] CR2: 88de4068 CR3: 00d67000 CR4: 06f0
[   20.631031] DR0:  DR1:  DR2: 
[   20.631697] DR3:  DR6: fffe0ff0 DR7: 0400
[   20.632358] Call Trace:
[   20.632605]  ? rxrpc_process_call+0x820/0x820
[   20.633015]  rcu_do_batch+0x172/0x370
[   20.633432]  rcu_cpu_kthread+0x165/0x170
[   20.633805]  ? smpboot_thread_fn+0x23/0x280
[   20.634197]  ? smpboot_thread_fn+0x6b/0x280
[   20.634595]  smpboot_thread_fn+0x1d3/0x280
[   20.634981]  ? sort_range+0x20/0x20
[   20.635319]  kthread+0x115/0x130
[   20.635641]  ? kthread_create_on_node+0x60/0x60
[   20.636067]  ret_from_fork+0x35/0x40
[   20.636412] Code: 6e ef 00 00 48 8b bb c8 03 00 00 e8 d2 90 9f ff 48 8b bb 
d0 03 00 00 e8 c6 90 9f ff 48 8b 3d 67 94 91 02 48 89 de e8 97 91 9f ff <3e> ff 
4d 68 74 03 5b 5d c3 48 8d 7d 68 5b 5d e9 e3 0d 92 ff 0f 
[   20.648368] RIP: rxrpc_rcu_destroy_call+0x39/0x50 RSP: c912be08
[   20.648981] CR2: 88de4068
[   20.649307] ---[ end trace 3db688f433278894 ]---


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script # job-script is attached in this 
email



Thanks,
Rong, Chen
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.17.0-rc5 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16

[lkp-robot] [afs] 5b86d4ff5d: BUG:unable_to_handle_kernel

2018-07-16 Thread kernel test robot



FYI, we noticed the following commit (built with gcc-7):

commit: 5b86d4ff5dce3271dff54119e06174dc22422903 ("afs: Implement network 
namespacing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -m 512M

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


+--+++
|  | 1588def91d 
| 5b86d4ff5d |
+--+++
| boot_successes   | 31 
| 6  |
| boot_failures| 4  
| 35 |
| invoked_oom-killer:gfp_mask=0x   | 4  
| 2  |
| Mem-Info | 4  
| 2  |
| Out_of_memory:Kill_process   | 2  
| 2  |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 2  
| 2  |
| RIP:copy_fpstate_to_sigframe | 0  
| 1  |
| BUG:unable_to_handle_kernel  | 0  
| 33 |
| Oops:#[##]   | 0  
| 33 |
| RIP:rxrpc_rcu_destroy_call   | 0  
| 33 |
| Kernel_panic-not_syncing:Fatal_exception_in_interrupt| 0  
| 33 |
+--+++



[   20.610136] BUG: unable to handle kernel paging request at 88de4068
[   20.610841] PGD 40da067 P4D 40da067 PUD 40db067 PMD 1bffa067 PTE 
80de4060
[   20.611550] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[   20.612039] Modules linked in: bochs_drm ttm drm_kms_helper evdev serio_raw 
drm drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea fb_sys_fops 
syscopyarea sysfillrect sysimgblt fb intel_agp intel_gtt button fbdev agpgart
[   20.613955] CPU: 0 PID: 12 Comm: rcuc/0 Not tainted 
4.17.0-rc5-00051-g5b86d4f #1
[   20.614655] RIP: 0010:rxrpc_rcu_destroy_call+0x39/0x50
[   20.615136] RSP: 0018:c912be08 EFLAGS: 00010286
[   20.615635] RAX: 8200 RBX: 88d59700 RCX: 0201
[   20.616304] RDX: 8201 RSI: 9f670a10 RDI: 
[   20.627094] RBP: 88de4000 R08: 686a9aac R09: 0001
[   20.627813] R10: 0001 R11:  R12: 88001bddfa00
[   20.628474] R13: 88001bddfa38 R14: 88d59700 R15: 8110333a
[   20.629137] FS:  () GS:88001bc0() 
knlGS:
[   20.629861] CS:  0010 DS:  ES:  CR0: 80050033
[   20.630383] CR2: 88de4068 CR3: 00d67000 CR4: 06f0
[   20.631031] DR0:  DR1:  DR2: 
[   20.631697] DR3:  DR6: fffe0ff0 DR7: 0400
[   20.632358] Call Trace:
[   20.632605]  ? rxrpc_process_call+0x820/0x820
[   20.633015]  rcu_do_batch+0x172/0x370
[   20.633432]  rcu_cpu_kthread+0x165/0x170
[   20.633805]  ? smpboot_thread_fn+0x23/0x280
[   20.634197]  ? smpboot_thread_fn+0x6b/0x280
[   20.634595]  smpboot_thread_fn+0x1d3/0x280
[   20.634981]  ? sort_range+0x20/0x20
[   20.635319]  kthread+0x115/0x130
[   20.635641]  ? kthread_create_on_node+0x60/0x60
[   20.636067]  ret_from_fork+0x35/0x40
[   20.636412] Code: 6e ef 00 00 48 8b bb c8 03 00 00 e8 d2 90 9f ff 48 8b bb 
d0 03 00 00 e8 c6 90 9f ff 48 8b 3d 67 94 91 02 48 89 de e8 97 91 9f ff <3e> ff 
4d 68 74 03 5b 5d c3 48 8d 7d 68 5b 5d e9 e3 0d 92 ff 0f 
[   20.648368] RIP: rxrpc_rcu_destroy_call+0x39/0x50 RSP: c912be08
[   20.648981] CR2: 88de4068
[   20.649307] ---[ end trace 3db688f433278894 ]---


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script # job-script is attached in this 
email



Thanks,
Rong, Chen
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.17.0-rc5 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16

Re: [patch -mm] mm, oom: remove oom_lock from exit_mmap

2018-07-16 Thread David Rientjes

On Sat, 14 Jul 2018, Tetsuo Handa wrote:

> David is making changes using timeout based back off (in linux-next.git)
> which is inappropriately trying to use MMF_UNSTABLE for two purposes.
> 

If you believe there is a problem with the use of MMF_UNSTABLE as it sits 
in -mm, please follow up directly in the thread that proposed the patch.  
I have seen two replies to that thread from you: one that incorporates it 
into your work, and one that links to a verison of my patch in your 
patchset.  I haven't seen a concern raised about the use of MMF_UNSTABLE, 
but perhaps it's somewhere in the 10,000 other emails about the oom 
killer.

[PATCH v5 02/11] crypto: cbc: Remove VLA usage

2018-07-16 Thread Kees Cook

In the quest to remove all stack VLA usage from the kernel[1], this
uses the upper bounds on blocksize. Since this is always a cipher
blocksize, use the existing cipher max blocksize.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 include/crypto/cbc.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/crypto/cbc.h b/include/crypto/cbc.h
index f5b8bfc22e6d..47db0aac2ab9 100644
--- a/include/crypto/cbc.h
+++ b/include/crypto/cbc.h
@@ -113,7 +113,9 @@ static inline int crypto_cbc_decrypt_inplace(
unsigned int bsize = crypto_skcipher_blocksize(tfm);
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
-   u8 last_iv[bsize];
+   u8 last_iv[MAX_CIPHER_BLOCKSIZE];
+
+   BUG_ON(bsize > sizeof(last_iv));
 
/* Start of the last block. */
src += nbytes - (nbytes & (bsize - 1)) - bsize;
-- 
2.17.1

[PATCH v5 09/11] crypto: shash: Remove VLA usage in unaligned hashing

2018-07-16 Thread Kees Cook

In the quest to remove all stack VLA usage from the kernel[1], this uses
the newly defined max alignment to perform unaligned hashing to avoid
VLAs, and drops the helper function while adding sanity checks on the
resulting buffer sizes. Additionally, the __aligned_largest macro is
removed since this helper was the only user.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 crypto/shash.c   | 27 ---
 include/linux/compiler-gcc.h |  1 -
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/crypto/shash.c b/crypto/shash.c
index ab6902c6dae7..8d4746b14dd5 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -73,13 +73,6 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const u8 
*key,
 }
 EXPORT_SYMBOL_GPL(crypto_shash_setkey);
 
-static inline unsigned int shash_align_buffer_size(unsigned len,
-  unsigned long mask)
-{
-   typedef u8 __aligned_largest u8_aligned;
-   return len + (mask & ~(__alignof__(u8_aligned) - 1));
-}
-
 static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
  unsigned int len)
 {
@@ -88,11 +81,17 @@ static int shash_update_unaligned(struct shash_desc *desc, 
const u8 *data,
unsigned long alignmask = crypto_shash_alignmask(tfm);
unsigned int unaligned_len = alignmask + 1 -
 ((unsigned long)data & alignmask);
-   u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
-   __aligned_largest;
+   /*
+* We cannot count on __aligned() working for large values:
+* https://patchwork.kernel.org/patch/9507697/
+*/
+   u8 ubuf[MAX_ALGAPI_ALIGNMASK * 2];
u8 *buf = PTR_ALIGN([0], alignmask + 1);
int err;
 
+   if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
+   return -EINVAL;
+
if (unaligned_len > len)
unaligned_len = len;
 
@@ -124,11 +123,17 @@ static int shash_final_unaligned(struct shash_desc *desc, 
u8 *out)
unsigned long alignmask = crypto_shash_alignmask(tfm);
struct shash_alg *shash = crypto_shash_alg(tfm);
unsigned int ds = crypto_shash_digestsize(tfm);
-   u8 ubuf[shash_align_buffer_size(ds, alignmask)]
-   __aligned_largest;
+   /*
+* We cannot count on __aligned() working for large values:
+* https://patchwork.kernel.org/patch/9507697/
+*/
+   u8 ubuf[MAX_ALGAPI_ALIGNMASK + SHASH_MAX_DIGESTSIZE];
u8 *buf = PTR_ALIGN([0], alignmask + 1);
int err;
 
+   if (WARN_ON(buf + ds > ubuf + sizeof(ubuf)))
+   return -EINVAL;
+
err = shash->final(desc, buf);
if (err)
goto out;
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index f1a7492a5cc8..1f1cdef36a82 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -125,7 +125,6 @@
  */
 #define __pure __attribute__((pure))
 #define __aligned(x)   __attribute__((aligned(x)))
-#define __aligned_largest  __attribute__((aligned))
 #define __printf(a, b) __attribute__((format(printf, a, b)))
 #define __scanf(a, b)  __attribute__((format(scanf, a, b)))
 #define __attribute_const____attribute__((__const__))
-- 
2.17.1

Re: [patch -mm] mm, oom: remove oom_lock from exit_mmap

2018-07-16 Thread David Rientjes

On Sat, 14 Jul 2018, Tetsuo Handa wrote:

> David is making changes using timeout based back off (in linux-next.git)
> which is inappropriately trying to use MMF_UNSTABLE for two purposes.
> 

If you believe there is a problem with the use of MMF_UNSTABLE as it sits 
in -mm, please follow up directly in the thread that proposed the patch.  
I have seen two replies to that thread from you: one that incorporates it 
into your work, and one that links to a verison of my patch in your 
patchset.  I haven't seen a concern raised about the use of MMF_UNSTABLE, 
but perhaps it's somewhere in the 10,000 other emails about the oom 
killer.

[PATCH v5 02/11] crypto: cbc: Remove VLA usage

2018-07-16 Thread Kees Cook

In the quest to remove all stack VLA usage from the kernel[1], this
uses the upper bounds on blocksize. Since this is always a cipher
blocksize, use the existing cipher max blocksize.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 include/crypto/cbc.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/crypto/cbc.h b/include/crypto/cbc.h
index f5b8bfc22e6d..47db0aac2ab9 100644
--- a/include/crypto/cbc.h
+++ b/include/crypto/cbc.h
@@ -113,7 +113,9 @@ static inline int crypto_cbc_decrypt_inplace(
unsigned int bsize = crypto_skcipher_blocksize(tfm);
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
-   u8 last_iv[bsize];
+   u8 last_iv[MAX_CIPHER_BLOCKSIZE];
+
+   BUG_ON(bsize > sizeof(last_iv));
 
/* Start of the last block. */
src += nbytes - (nbytes & (bsize - 1)) - bsize;
-- 
2.17.1

[PATCH v5 09/11] crypto: shash: Remove VLA usage in unaligned hashing

2018-07-16 Thread Kees Cook

In the quest to remove all stack VLA usage from the kernel[1], this uses
the newly defined max alignment to perform unaligned hashing to avoid
VLAs, and drops the helper function while adding sanity checks on the
resulting buffer sizes. Additionally, the __aligned_largest macro is
removed since this helper was the only user.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 crypto/shash.c   | 27 ---
 include/linux/compiler-gcc.h |  1 -
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/crypto/shash.c b/crypto/shash.c
index ab6902c6dae7..8d4746b14dd5 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -73,13 +73,6 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const u8 
*key,
 }
 EXPORT_SYMBOL_GPL(crypto_shash_setkey);
 
-static inline unsigned int shash_align_buffer_size(unsigned len,
-  unsigned long mask)
-{
-   typedef u8 __aligned_largest u8_aligned;
-   return len + (mask & ~(__alignof__(u8_aligned) - 1));
-}
-
 static int shash_update_unaligned(struct shash_desc *desc, const u8 *data,
  unsigned int len)
 {
@@ -88,11 +81,17 @@ static int shash_update_unaligned(struct shash_desc *desc, 
const u8 *data,
unsigned long alignmask = crypto_shash_alignmask(tfm);
unsigned int unaligned_len = alignmask + 1 -
 ((unsigned long)data & alignmask);
-   u8 ubuf[shash_align_buffer_size(unaligned_len, alignmask)]
-   __aligned_largest;
+   /*
+* We cannot count on __aligned() working for large values:
+* https://patchwork.kernel.org/patch/9507697/
+*/
+   u8 ubuf[MAX_ALGAPI_ALIGNMASK * 2];
u8 *buf = PTR_ALIGN([0], alignmask + 1);
int err;
 
+   if (WARN_ON(buf + unaligned_len > ubuf + sizeof(ubuf)))
+   return -EINVAL;
+
if (unaligned_len > len)
unaligned_len = len;
 
@@ -124,11 +123,17 @@ static int shash_final_unaligned(struct shash_desc *desc, 
u8 *out)
unsigned long alignmask = crypto_shash_alignmask(tfm);
struct shash_alg *shash = crypto_shash_alg(tfm);
unsigned int ds = crypto_shash_digestsize(tfm);
-   u8 ubuf[shash_align_buffer_size(ds, alignmask)]
-   __aligned_largest;
+   /*
+* We cannot count on __aligned() working for large values:
+* https://patchwork.kernel.org/patch/9507697/
+*/
+   u8 ubuf[MAX_ALGAPI_ALIGNMASK + SHASH_MAX_DIGESTSIZE];
u8 *buf = PTR_ALIGN([0], alignmask + 1);
int err;
 
+   if (WARN_ON(buf + ds > ubuf + sizeof(ubuf)))
+   return -EINVAL;
+
err = shash->final(desc, buf);
if (err)
goto out;
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index f1a7492a5cc8..1f1cdef36a82 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -125,7 +125,6 @@
  */
 #define __pure __attribute__((pure))
 #define __aligned(x)   __attribute__((aligned(x)))
-#define __aligned_largest  __attribute__((aligned))
 #define __printf(a, b) __attribute__((format(printf, a, b)))
 #define __scanf(a, b)  __attribute__((format(scanf, a, b)))
 #define __attribute_const____attribute__((__const__))
-- 
2.17.1

[PATCH v5 01/11] crypto: xcbc: Remove VLA usage

2018-07-16 Thread Kees Cook

In the quest to remove all stack VLA usage from the kernel[1], this uses
the maximum blocksize and adds a sanity check. For xcbc, the blocksize
must always be 16, so use that, since it's already being enforced during
instantiation.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 crypto/xcbc.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/crypto/xcbc.c b/crypto/xcbc.c
index 25c75af50d3f..7aa03beed795 100644
--- a/crypto/xcbc.c
+++ b/crypto/xcbc.c
@@ -57,6 +57,8 @@ struct xcbc_desc_ctx {
u8 ctx[];
 };
 
+#define XCBC_BLOCKSIZE 16
+
 static int crypto_xcbc_digest_setkey(struct crypto_shash *parent,
 const u8 *inkey, unsigned int keylen)
 {
@@ -65,7 +67,10 @@ static int crypto_xcbc_digest_setkey(struct crypto_shash 
*parent,
int bs = crypto_shash_blocksize(parent);
u8 *consts = PTR_ALIGN(>ctx[0], alignmask + 1);
int err = 0;
-   u8 key1[bs];
+   u8 key1[XCBC_BLOCKSIZE];
+
+   if (WARN_ON(bs > sizeof(key1)))
+   return -EINVAL;
 
if ((err = crypto_cipher_setkey(ctx->child, inkey, keylen)))
return err;
@@ -212,7 +217,7 @@ static int xcbc_create(struct crypto_template *tmpl, struct 
rtattr **tb)
return PTR_ERR(alg);
 
switch(alg->cra_blocksize) {
-   case 16:
+   case XCBC_BLOCKSIZE:
break;
default:
goto out_put_alg;
-- 
2.17.1

[PATCH v5 01/11] crypto: xcbc: Remove VLA usage

2018-07-16 Thread Kees Cook

In the quest to remove all stack VLA usage from the kernel[1], this uses
the maximum blocksize and adds a sanity check. For xcbc, the blocksize
must always be 16, so use that, since it's already being enforced during
instantiation.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Signed-off-by: Kees Cook 
---
 crypto/xcbc.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/crypto/xcbc.c b/crypto/xcbc.c
index 25c75af50d3f..7aa03beed795 100644
--- a/crypto/xcbc.c
+++ b/crypto/xcbc.c
@@ -57,6 +57,8 @@ struct xcbc_desc_ctx {
u8 ctx[];
 };
 
+#define XCBC_BLOCKSIZE 16
+
 static int crypto_xcbc_digest_setkey(struct crypto_shash *parent,
 const u8 *inkey, unsigned int keylen)
 {
@@ -65,7 +67,10 @@ static int crypto_xcbc_digest_setkey(struct crypto_shash 
*parent,
int bs = crypto_shash_blocksize(parent);
u8 *consts = PTR_ALIGN(>ctx[0], alignmask + 1);
int err = 0;
-   u8 key1[bs];
+   u8 key1[XCBC_BLOCKSIZE];
+
+   if (WARN_ON(bs > sizeof(key1)))
+   return -EINVAL;
 
if ((err = crypto_cipher_setkey(ctx->child, inkey, keylen)))
return err;
@@ -212,7 +217,7 @@ static int xcbc_create(struct crypto_template *tmpl, struct 
rtattr **tb)
return PTR_ERR(alg);
 
switch(alg->cra_blocksize) {
-   case 16:
+   case XCBC_BLOCKSIZE:
break;
default:
goto out_put_alg;
-- 
2.17.1

Re: [patch -mm] mm, oom: remove oom_lock from exit_mmap

2018-07-16 Thread David Rientjes

On Fri, 13 Jul 2018, Michal Hocko wrote:

> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 0fe4087d5151..e6328cef090f 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -488,9 +488,11 @@ void __oom_reap_task_mm(struct mm_struct *mm)
> >  * Tell all users of get_user/copy_from_user etc... that the content
> >  * is no longer stable. No barriers really needed because unmapping
> >  * should imply barriers already and the reader would hit a page fault
> > -* if it stumbled over a reaped memory.
> > +* if it stumbled over a reaped memory. If MMF_UNSTABLE is already set,
> > +* reaping as already occurred so nothing left to do.
> >  */
> > -   set_bit(MMF_UNSTABLE, >flags);
> > +   if (test_and_set_bit(MMF_UNSTABLE, >flags))
> > +   return;
> 
> This could lead to pre mature oom victim selection
> oom_reaperexiting victim
> oom_reap_task exit_mmap
>   __oom_reap_task_mm__oom_reap_task_mm
>   test_and_set_bit(MMF_UNSTABLE) # wins the 
> race
>   test_and_set_bit(MMF_UNSTABLE)
> set_bit(MMF_OOM_SKIP) # new victim can be selected now.
> 

This is not the current state of the code in the -mm tree: MMF_OOM_SKIP 
only gets set by the oom reaper when the timeout has expired when the 
victim has failed to free memory in the exit path.

> Besides that, why should we back off in the first place. We can
> race the two without any problems AFAICS. We already do have proper
> synchronization between the two due to mmap_sem and MMF_OOM_SKIP.
> 

test_and_set_bit() here is not strictly required, I thought it was better 
since any unmapping done in this context is going to be handled by 
whichever thread set MMF_UNSTABLE.

Re: [patch -mm] mm, oom: remove oom_lock from exit_mmap

2018-07-16 Thread David Rientjes

On Fri, 13 Jul 2018, Michal Hocko wrote:

> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 0fe4087d5151..e6328cef090f 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -488,9 +488,11 @@ void __oom_reap_task_mm(struct mm_struct *mm)
> >  * Tell all users of get_user/copy_from_user etc... that the content
> >  * is no longer stable. No barriers really needed because unmapping
> >  * should imply barriers already and the reader would hit a page fault
> > -* if it stumbled over a reaped memory.
> > +* if it stumbled over a reaped memory. If MMF_UNSTABLE is already set,
> > +* reaping as already occurred so nothing left to do.
> >  */
> > -   set_bit(MMF_UNSTABLE, >flags);
> > +   if (test_and_set_bit(MMF_UNSTABLE, >flags))
> > +   return;
> 
> This could lead to pre mature oom victim selection
> oom_reaperexiting victim
> oom_reap_task exit_mmap
>   __oom_reap_task_mm__oom_reap_task_mm
>   test_and_set_bit(MMF_UNSTABLE) # wins the 
> race
>   test_and_set_bit(MMF_UNSTABLE)
> set_bit(MMF_OOM_SKIP) # new victim can be selected now.
> 

This is not the current state of the code in the -mm tree: MMF_OOM_SKIP 
only gets set by the oom reaper when the timeout has expired when the 
victim has failed to free memory in the exit path.

> Besides that, why should we back off in the first place. We can
> race the two without any problems AFAICS. We already do have proper
> synchronization between the two due to mmap_sem and MMF_OOM_SKIP.
> 

test_and_set_bit() here is not strictly required, I thought it was better 
since any unmapping done in this context is going to be handled by 
whichever thread set MMF_UNSTABLE.

Re: [PATCH v3 1/5] dt-bindings: fsi: Document binding for the fsi-master-ast-cf "device"

2018-07-16 Thread Benjamin Herrenschmidt

On Mon, 2018-07-16 at 09:33 -0600, Rob Herring wrote:
> On Thu, Jul 12, 2018 at 01:48:43PM +1000, Benjamin Herrenschmidt wrote:
> > This isn't per-se a real device, it's a pseudo-device that
> > represents the use of the Aspeed built-in ColdFire to
> > implement the FSI protocol by bitbanging the GPIOs instead
> > of doing it from the ARM core.
> > 
> > Thus it's a drop-in replacement for the existing
> > fsi-master-gpio pseudo-device for use on systems based
> > on the Aspeed chips. It has most of the same properties,
> > plus some more needed to operate the coprocessor.
> > 
> > Signed-off-by: Benjamin Herrenschmidt 
> > ---
> >  .../bindings/fsi/fsi-master-ast-cf.txt| 36 +++
> >  1 file changed, 36 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/fsi/fsi-master-ast-cf.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/fsi/fsi-master-ast-cf.txt 
> > b/Documentation/devicetree/bindings/fsi/fsi-master-ast-cf.txt
> > new file mode 100644
> > index ..431bf8a423ce
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/fsi/fsi-master-ast-cf.txt
> > @@ -0,0 +1,36 @@
> > +Device-tree bindings for ColdFire offloaded gpio-based FSI master driver
> > +
> > +
> > +Required properties:
> > + - compatible =
> > +   "aspeed,ast2400-cf-fsi-master" for an AST2400 based system
> > +   or
> > +   "aspeed,ast2500-cf-fsi-master" for an AST2500 based system
> > +
> > + - clock-gpios = ;: GPIO for FSI clock
> > + - data-gpios = ; : GPIO for FSI data signal
> > + - enable-gpios = ;   : GPIO for enable signal
> > + - trans-gpios = ;: GPIO for voltage translator 
> > enable
> > + - mux-gpios = ;  : GPIO for pin multiplexing with other
> > +  functions (eg, external FSI 
> > masters)
> > + - memory-region = ;  : Reference to the reserved 
> > memory for
> > +  the ColdFire. Must be 2M aligned 
> > on
> > + AST2400 and 1M aligned on AST2500
> > + - aspeed,sram = ;: Reference to the SRAM node.
> > + - aspeed,cvic = ;: Reference to the CVIC node.
> > +
> > +Examples:
> > +
> > +fsi-master {
> > +compatible = "aspeed,ast2500-cf-fsi-master", "fsi-master";
> > +
> > +   clock-gpios = < 0>;
> > +data-gpios = < 1>;
> > +enable-gpios = < 2>;
> > +trans-gpios = < 3>;
> > +mux-gpios = < 4>;
> > +
> > +   memory-region = <_memory>;
> > +   sram = <>;
> > +   cvic = <>;
> 
> Need to update the example. With that

Ah right, thanks. The next spin will have that fixed and will go into
the fsi tree.

> Reviewed-by: Rob Herring

Re: [PATCH v3 1/5] dt-bindings: fsi: Document binding for the fsi-master-ast-cf "device"

2018-07-16 Thread Benjamin Herrenschmidt

On Mon, 2018-07-16 at 09:33 -0600, Rob Herring wrote:
> On Thu, Jul 12, 2018 at 01:48:43PM +1000, Benjamin Herrenschmidt wrote:
> > This isn't per-se a real device, it's a pseudo-device that
> > represents the use of the Aspeed built-in ColdFire to
> > implement the FSI protocol by bitbanging the GPIOs instead
> > of doing it from the ARM core.
> > 
> > Thus it's a drop-in replacement for the existing
> > fsi-master-gpio pseudo-device for use on systems based
> > on the Aspeed chips. It has most of the same properties,
> > plus some more needed to operate the coprocessor.
> > 
> > Signed-off-by: Benjamin Herrenschmidt 
> > ---
> >  .../bindings/fsi/fsi-master-ast-cf.txt| 36 +++
> >  1 file changed, 36 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/fsi/fsi-master-ast-cf.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/fsi/fsi-master-ast-cf.txt 
> > b/Documentation/devicetree/bindings/fsi/fsi-master-ast-cf.txt
> > new file mode 100644
> > index ..431bf8a423ce
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/fsi/fsi-master-ast-cf.txt
> > @@ -0,0 +1,36 @@
> > +Device-tree bindings for ColdFire offloaded gpio-based FSI master driver
> > +
> > +
> > +Required properties:
> > + - compatible =
> > +   "aspeed,ast2400-cf-fsi-master" for an AST2400 based system
> > +   or
> > +   "aspeed,ast2500-cf-fsi-master" for an AST2500 based system
> > +
> > + - clock-gpios = ;: GPIO for FSI clock
> > + - data-gpios = ; : GPIO for FSI data signal
> > + - enable-gpios = ;   : GPIO for enable signal
> > + - trans-gpios = ;: GPIO for voltage translator 
> > enable
> > + - mux-gpios = ;  : GPIO for pin multiplexing with other
> > +  functions (eg, external FSI 
> > masters)
> > + - memory-region = ;  : Reference to the reserved 
> > memory for
> > +  the ColdFire. Must be 2M aligned 
> > on
> > + AST2400 and 1M aligned on AST2500
> > + - aspeed,sram = ;: Reference to the SRAM node.
> > + - aspeed,cvic = ;: Reference to the CVIC node.
> > +
> > +Examples:
> > +
> > +fsi-master {
> > +compatible = "aspeed,ast2500-cf-fsi-master", "fsi-master";
> > +
> > +   clock-gpios = < 0>;
> > +data-gpios = < 1>;
> > +enable-gpios = < 2>;
> > +trans-gpios = < 3>;
> > +mux-gpios = < 4>;
> > +
> > +   memory-region = <_memory>;
> > +   sram = <>;
> > +   cvic = <>;
> 
> Need to update the example. With that

Ah right, thanks. The next spin will have that fixed and will go into
the fsi tree.

> Reviewed-by: Rob Herring

Re: [PATCH] spi: spi-fsl-dspi: Fill actual_length when doing DMA transfer

2018-07-16 Thread Andrey Smirnov

On Mon, Jul 16, 2018 at 4:26 AM Mark Brown  wrote:
>
> On Sun, Jul 15, 2018 at 11:25:08PM -0700, Andrey Smirnov wrote:
> > Users of SPI device drivers may rely on 'actual_length', so it is
> > important that information is correctly reported. One example would be
> > spi_mem_exec_op() which will fail if 'actual_length' doesn't match
> > requested transfer length. To fix the problem, add necessary code to
> > populate 'actual_length'.
>
> This doesn't apply against current code, please check and resend.


Sorry about that. Will do in v2.

Thanks,
Andrey Smirnov

[PATCH v2] net: dsa: Remove VLA usage

2018-07-16 Thread Kees Cook

From: Salvatore Mesoraca 

We avoid 2 VLAs by using a pre-allocated field in dsa_switch. We also
try to avoid dynamic allocation whenever possible (when using fewer than
bits-per-long ports, which is the common case).

Link: 
http://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
Link: http://lkml.kernel.org/r/20180505185145.gb32...@lunn.ch
Signed-off-by: Salvatore Mesoraca 
[kees: tweak commit subject and message slightly]
Signed-off-by: Kees Cook 
---
 include/net/dsa.h |  3 +++
 net/dsa/dsa2.c| 14 ++
 net/dsa/switch.c  | 22 ++
 3 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index fdbd6082945d..461e8a7661b7 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -259,6 +259,9 @@ struct dsa_switch {
/* Number of switch port queues */
unsigned intnum_tx_queues;
 
+   unsigned long   *bitmap;
+   unsigned long   _bitmap;
+
/* Dynamically allocated ports, keep last */
size_t num_ports;
struct dsa_port ports[];
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index dc5d9af3dc80..a1917025e155 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -775,6 +775,20 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, 
size_t n)
if (!ds)
return NULL;
 
+   /* We avoid allocating memory outside dsa_switch
+* if it is not needed.
+*/
+   if (n <= sizeof(ds->_bitmap) * 8) {
+   ds->bitmap = >_bitmap;
+   } else {
+   ds->bitmap = devm_kcalloc(dev,
+ BITS_TO_LONGS(n),
+ sizeof(unsigned long),
+ GFP_KERNEL);
+   if (unlikely(!ds->bitmap))
+   return NULL;
+   }
+
ds->dev = dev;
ds->num_ports = n;
 
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index b93511726069..142b294d3446 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -136,21 +136,20 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
 {
const struct switchdev_obj_port_mdb *mdb = info->mdb;
struct switchdev_trans *trans = info->trans;
-   DECLARE_BITMAP(group, ds->num_ports);
int port;
 
/* Build a mask of Multicast group members */
-   bitmap_zero(group, ds->num_ports);
+   bitmap_zero(ds->bitmap, ds->num_ports);
if (ds->index == info->sw_index)
-   set_bit(info->port, group);
+   set_bit(info->port, ds->bitmap);
for (port = 0; port < ds->num_ports; port++)
if (dsa_is_dsa_port(ds, port))
-   set_bit(port, group);
+   set_bit(port, ds->bitmap);
 
if (switchdev_trans_ph_prepare(trans))
-   return dsa_switch_mdb_prepare_bitmap(ds, mdb, group);
+   return dsa_switch_mdb_prepare_bitmap(ds, mdb, ds->bitmap);
 
-   dsa_switch_mdb_add_bitmap(ds, mdb, group);
+   dsa_switch_mdb_add_bitmap(ds, mdb, ds->bitmap);
 
return 0;
 }
@@ -204,21 +203,20 @@ static int dsa_switch_vlan_add(struct dsa_switch *ds,
 {
const struct switchdev_obj_port_vlan *vlan = info->vlan;
struct switchdev_trans *trans = info->trans;
-   DECLARE_BITMAP(members, ds->num_ports);
int port;
 
/* Build a mask of VLAN members */
-   bitmap_zero(members, ds->num_ports);
+   bitmap_zero(ds->bitmap, ds->num_ports);
if (ds->index == info->sw_index)
-   set_bit(info->port, members);
+   set_bit(info->port, ds->bitmap);
for (port = 0; port < ds->num_ports; port++)
if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
-   set_bit(port, members);
+   set_bit(port, ds->bitmap);
 
if (switchdev_trans_ph_prepare(trans))
-   return dsa_switch_vlan_prepare_bitmap(ds, vlan, members);
+   return dsa_switch_vlan_prepare_bitmap(ds, vlan, ds->bitmap);
 
-   dsa_switch_vlan_add_bitmap(ds, vlan, members);
+   dsa_switch_vlan_add_bitmap(ds, vlan, ds->bitmap);
 
return 0;
 }
-- 
2.17.1


-- 
Kees Cook
Pixel Security

Re: [PATCH] spi: spi-fsl-dspi: Fill actual_length when doing DMA transfer

2018-07-16 Thread Andrey Smirnov

On Mon, Jul 16, 2018 at 4:26 AM Mark Brown  wrote:
>
> On Sun, Jul 15, 2018 at 11:25:08PM -0700, Andrey Smirnov wrote:
> > Users of SPI device drivers may rely on 'actual_length', so it is
> > important that information is correctly reported. One example would be
> > spi_mem_exec_op() which will fail if 'actual_length' doesn't match
> > requested transfer length. To fix the problem, add necessary code to
> > populate 'actual_length'.
>
> This doesn't apply against current code, please check and resend.


Sorry about that. Will do in v2.

Thanks,
Andrey Smirnov

[PATCH v2] net: dsa: Remove VLA usage

2018-07-16 Thread Kees Cook

From: Salvatore Mesoraca 

We avoid 2 VLAs by using a pre-allocated field in dsa_switch. We also
try to avoid dynamic allocation whenever possible (when using fewer than
bits-per-long ports, which is the common case).

Link: 
http://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
Link: http://lkml.kernel.org/r/20180505185145.gb32...@lunn.ch
Signed-off-by: Salvatore Mesoraca 
[kees: tweak commit subject and message slightly]
Signed-off-by: Kees Cook 
---
 include/net/dsa.h |  3 +++
 net/dsa/dsa2.c| 14 ++
 net/dsa/switch.c  | 22 ++
 3 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index fdbd6082945d..461e8a7661b7 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -259,6 +259,9 @@ struct dsa_switch {
/* Number of switch port queues */
unsigned intnum_tx_queues;
 
+   unsigned long   *bitmap;
+   unsigned long   _bitmap;
+
/* Dynamically allocated ports, keep last */
size_t num_ports;
struct dsa_port ports[];
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index dc5d9af3dc80..a1917025e155 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -775,6 +775,20 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, 
size_t n)
if (!ds)
return NULL;
 
+   /* We avoid allocating memory outside dsa_switch
+* if it is not needed.
+*/
+   if (n <= sizeof(ds->_bitmap) * 8) {
+   ds->bitmap = >_bitmap;
+   } else {
+   ds->bitmap = devm_kcalloc(dev,
+ BITS_TO_LONGS(n),
+ sizeof(unsigned long),
+ GFP_KERNEL);
+   if (unlikely(!ds->bitmap))
+   return NULL;
+   }
+
ds->dev = dev;
ds->num_ports = n;
 
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index b93511726069..142b294d3446 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -136,21 +136,20 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
 {
const struct switchdev_obj_port_mdb *mdb = info->mdb;
struct switchdev_trans *trans = info->trans;
-   DECLARE_BITMAP(group, ds->num_ports);
int port;
 
/* Build a mask of Multicast group members */
-   bitmap_zero(group, ds->num_ports);
+   bitmap_zero(ds->bitmap, ds->num_ports);
if (ds->index == info->sw_index)
-   set_bit(info->port, group);
+   set_bit(info->port, ds->bitmap);
for (port = 0; port < ds->num_ports; port++)
if (dsa_is_dsa_port(ds, port))
-   set_bit(port, group);
+   set_bit(port, ds->bitmap);
 
if (switchdev_trans_ph_prepare(trans))
-   return dsa_switch_mdb_prepare_bitmap(ds, mdb, group);
+   return dsa_switch_mdb_prepare_bitmap(ds, mdb, ds->bitmap);
 
-   dsa_switch_mdb_add_bitmap(ds, mdb, group);
+   dsa_switch_mdb_add_bitmap(ds, mdb, ds->bitmap);
 
return 0;
 }
@@ -204,21 +203,20 @@ static int dsa_switch_vlan_add(struct dsa_switch *ds,
 {
const struct switchdev_obj_port_vlan *vlan = info->vlan;
struct switchdev_trans *trans = info->trans;
-   DECLARE_BITMAP(members, ds->num_ports);
int port;
 
/* Build a mask of VLAN members */
-   bitmap_zero(members, ds->num_ports);
+   bitmap_zero(ds->bitmap, ds->num_ports);
if (ds->index == info->sw_index)
-   set_bit(info->port, members);
+   set_bit(info->port, ds->bitmap);
for (port = 0; port < ds->num_ports; port++)
if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
-   set_bit(port, members);
+   set_bit(port, ds->bitmap);
 
if (switchdev_trans_ph_prepare(trans))
-   return dsa_switch_vlan_prepare_bitmap(ds, vlan, members);
+   return dsa_switch_vlan_prepare_bitmap(ds, vlan, ds->bitmap);
 
-   dsa_switch_vlan_add_bitmap(ds, vlan, members);
+   dsa_switch_vlan_add_bitmap(ds, vlan, ds->bitmap);
 
return 0;
 }
-- 
2.17.1


-- 
Kees Cook
Pixel Security

Re: [tip:sched/core] sched/cputime: Ensure accurate utime and stime ratio in cputime_adjust()

2018-07-16 Thread Xunlei Pang

On 7/17/18 1:41 AM, Ingo Molnar wrote:
> 
> * Peter Zijlstra  wrote:
> 
>> On Sun, Jul 15, 2018 at 04:36:17PM -0700, tip-bot for Xunlei Pang wrote:
>>> Commit-ID:  8d4c00dc38a8aa30dae8402955e55e7b34e74bc8
>>> Gitweb: 
>>> https://git.kernel.org/tip/8d4c00dc38a8aa30dae8402955e55e7b34e74bc8
>>> Author: Xunlei Pang 
>>> AuthorDate: Mon, 9 Jul 2018 22:58:43 +0800
>>> Committer:  Ingo Molnar 
>>> CommitDate: Mon, 16 Jul 2018 00:28:31 +0200
>>>
>>> sched/cputime: Ensure accurate utime and stime ratio in cputime_adjust()
>>>
>>> If users access "/proc/pid/stat", the utime and stime ratio in the
>>> current SAMPLE period are excepted, but currently cputime_adjust()
>>> always calculates with the ratio of the WHOLE lifetime of the process.
>>>
>>> This results in inaccurate utime and stime in "/proc/pid/stat". For
>>> example,  a process runs for a while with "50% usr, 0% sys", then
>>> followed by "100% sys". For later while, the following is excepted:
>>>
>>>   0.0 usr,  100.0 sys
>>>
>>> but we get:
>>>
>>>   10.0 usr,  90.0 sys
>>>
>>> This patch uses the accurate ratio in cputime_adjust() to address the
>>> issue. A new 'task_cputime' type field is added in prev_cputime to record
>>> previous 'task_cputime' so that we can get the elapsed times as the accurate
>>> ratio.
>>
>> Ingo, please make this one go away.
> 
> Sure, I've removed it from sched/core.
> 

Hi Ingo, Peter, Frederic,

I captured some runtime data using trace to explain it, hope this can
illustrate the motive behind my patch. Anyone could help improve my
changelog is appreciated if you think so after reading.


Here are the simple trace_printk I added to capture the real data:

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 0796f938c4f0..b65c1f250941 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -611,6 +611,9 @@ void cputime_adjust(struct task_cputime *curr,
struct prev_cputime *prev,
stime = curr->stime;
utime = curr->utime;

+   if (!strncmp(current->comm, "cat", 3))
+   trace_printk("task tick-based utime %lld stime %lld,
scheduler rtime %lld\n", utime, stime, rtime);
+
/*
 * If either stime or utime are 0, assume all runtime is userspace.
 * Once a task gets some ticks, the monotonicy code at 'update:'
@@ -651,9 +654,14 @@ void cputime_adjust(struct task_cputime *curr,
struct prev_cputime *prev,
stime = rtime - utime;
}

+   if (!strncmp(current->comm, "cat", 3))
+   trace_printk("result: old utime %lld stime %lld, new
utime %lld stime %lld\n",
+   prev->utime, prev->stime, utime, stime);
+
prev->stime = stime;
prev->utime = utime;
 out:
*ut = prev->utime;
*st = prev->stime;
raw_spin_unlock_irqrestore(>lock, flags);


Using the reproducer I described in the changelog, it runs for a while
with "50% usr, 0% sys", then followed by "100% sys". A shell script
accesses its /proc/pid/stat at the interval of one second, and got:
50.0 usr,  0.0 sys
51.0 usr,  0.0 sys
50.0 usr,  0.0 sys
...
9.0 usr,  91.0 sys
9.0 usr,  91.0 sys


The trace data corresponds to the last sample period:
trace entry 1:
 cat-20755 [022] d...  1370.106496: cputime_adjust: task
tick-based utime 36256000 stime 255100, scheduler rtime 333060702626
 cat-20755 [022] d...  1370.106497: cputime_adjust: result:
old utime 330729718142 stime 2306983867, new utime 330733635372 stime
2327067254

trace entry 2:
 cat-20773 [005] d...  1371.109825: cputime_adjust: task
tick-based utime 36256700 stime 354700, scheduler rtime 334063718912
 cat-20773 [005] d...  1371.109826: cputime_adjust: result:
old utime 330733635372 stime 2327067254, new utime 330827229702 stime
3236489210

1) expected behaviour
Let's compare the last two trace entries(all the data below is in ns):
task tick-based utime: 36256000->36256700 increased 700
task tick-based stime: 255100  ->354700   increased 99600
scheduler rtime:   333060702626->334063718912 increased 1003016286

The application actually runs almost 100%sys at the moment, we can
use the task tick-based utime and stime increased to double check:
99600/(700+99600) > 99%sys

2) the current cputime_adjust() inaccurate result
But for the current cputime_adjust(), we get the following adjusted
utime and stime increase in this sample period:
adjusted utime: 330733635372->330827229702 increased 93594330
adjusted stime: 2327067254  ->3236489210   increased 909421956

so 909421956/(93594330+909421956)=91%sys as the shell script shows above.

3) root cause
The root cause of the issue is that the current cputime_adjust() always
passes the whole times to scale_stime() to split the whole utime and
stime. In this patch, we pass all the increased deltas in 1) within
user's sample period to scale_stime() instead and accumulate the
corresponding results to the

Re: [tip:sched/core] sched/cputime: Ensure accurate utime and stime ratio in cputime_adjust()

2018-07-16 Thread Xunlei Pang

On 7/17/18 1:41 AM, Ingo Molnar wrote:
> 
> * Peter Zijlstra  wrote:
> 
>> On Sun, Jul 15, 2018 at 04:36:17PM -0700, tip-bot for Xunlei Pang wrote:
>>> Commit-ID:  8d4c00dc38a8aa30dae8402955e55e7b34e74bc8
>>> Gitweb: 
>>> https://git.kernel.org/tip/8d4c00dc38a8aa30dae8402955e55e7b34e74bc8
>>> Author: Xunlei Pang 
>>> AuthorDate: Mon, 9 Jul 2018 22:58:43 +0800
>>> Committer:  Ingo Molnar 
>>> CommitDate: Mon, 16 Jul 2018 00:28:31 +0200
>>>
>>> sched/cputime: Ensure accurate utime and stime ratio in cputime_adjust()
>>>
>>> If users access "/proc/pid/stat", the utime and stime ratio in the
>>> current SAMPLE period are excepted, but currently cputime_adjust()
>>> always calculates with the ratio of the WHOLE lifetime of the process.
>>>
>>> This results in inaccurate utime and stime in "/proc/pid/stat". For
>>> example,  a process runs for a while with "50% usr, 0% sys", then
>>> followed by "100% sys". For later while, the following is excepted:
>>>
>>>   0.0 usr,  100.0 sys
>>>
>>> but we get:
>>>
>>>   10.0 usr,  90.0 sys
>>>
>>> This patch uses the accurate ratio in cputime_adjust() to address the
>>> issue. A new 'task_cputime' type field is added in prev_cputime to record
>>> previous 'task_cputime' so that we can get the elapsed times as the accurate
>>> ratio.
>>
>> Ingo, please make this one go away.
> 
> Sure, I've removed it from sched/core.
> 

Hi Ingo, Peter, Frederic,

I captured some runtime data using trace to explain it, hope this can
illustrate the motive behind my patch. Anyone could help improve my
changelog is appreciated if you think so after reading.


Here are the simple trace_printk I added to capture the real data:

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 0796f938c4f0..b65c1f250941 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -611,6 +611,9 @@ void cputime_adjust(struct task_cputime *curr,
struct prev_cputime *prev,
stime = curr->stime;
utime = curr->utime;

+   if (!strncmp(current->comm, "cat", 3))
+   trace_printk("task tick-based utime %lld stime %lld,
scheduler rtime %lld\n", utime, stime, rtime);
+
/*
 * If either stime or utime are 0, assume all runtime is userspace.
 * Once a task gets some ticks, the monotonicy code at 'update:'
@@ -651,9 +654,14 @@ void cputime_adjust(struct task_cputime *curr,
struct prev_cputime *prev,
stime = rtime - utime;
}

+   if (!strncmp(current->comm, "cat", 3))
+   trace_printk("result: old utime %lld stime %lld, new
utime %lld stime %lld\n",
+   prev->utime, prev->stime, utime, stime);
+
prev->stime = stime;
prev->utime = utime;
 out:
*ut = prev->utime;
*st = prev->stime;
raw_spin_unlock_irqrestore(>lock, flags);


Using the reproducer I described in the changelog, it runs for a while
with "50% usr, 0% sys", then followed by "100% sys". A shell script
accesses its /proc/pid/stat at the interval of one second, and got:
50.0 usr,  0.0 sys
51.0 usr,  0.0 sys
50.0 usr,  0.0 sys
...
9.0 usr,  91.0 sys
9.0 usr,  91.0 sys


The trace data corresponds to the last sample period:
trace entry 1:
 cat-20755 [022] d...  1370.106496: cputime_adjust: task
tick-based utime 36256000 stime 255100, scheduler rtime 333060702626
 cat-20755 [022] d...  1370.106497: cputime_adjust: result:
old utime 330729718142 stime 2306983867, new utime 330733635372 stime
2327067254

trace entry 2:
 cat-20773 [005] d...  1371.109825: cputime_adjust: task
tick-based utime 36256700 stime 354700, scheduler rtime 334063718912
 cat-20773 [005] d...  1371.109826: cputime_adjust: result:
old utime 330733635372 stime 2327067254, new utime 330827229702 stime
3236489210

1) expected behaviour
Let's compare the last two trace entries(all the data below is in ns):
task tick-based utime: 36256000->36256700 increased 700
task tick-based stime: 255100  ->354700   increased 99600
scheduler rtime:   333060702626->334063718912 increased 1003016286

The application actually runs almost 100%sys at the moment, we can
use the task tick-based utime and stime increased to double check:
99600/(700+99600) > 99%sys

2) the current cputime_adjust() inaccurate result
But for the current cputime_adjust(), we get the following adjusted
utime and stime increase in this sample period:
adjusted utime: 330733635372->330827229702 increased 93594330
adjusted stime: 2327067254  ->3236489210   increased 909421956

so 909421956/(93594330+909421956)=91%sys as the shell script shows above.

3) root cause
The root cause of the issue is that the current cputime_adjust() always
passes the whole times to scale_stime() to split the whole utime and
stime. In this patch, we pass all the increased deltas in 1) within
user's sample period to scale_stime() instead and accumulate the
corresponding results to the

[PATCH] ARM: dts: vf610: Add ZII SSMB SPU3 board

2018-07-16 Thread Andrey Smirnov

Add support for Zodiac Inflight Innovations SSMB SPU3
board (VF610-based).

Cc: Shawn Guo 
Cc: Fabio Estevam 
Cc: cphe...@gmail.com
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andrew Lunn 
Signed-off-by: Andrey Smirnov 
---
 arch/arm/boot/dts/Makefile|   3 +-
 arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts | 343 ++
 2 files changed, 345 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index bea41b129493..e331b2c16539 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -570,7 +570,8 @@ dtb-$(CONFIG_SOC_VF610) += \
vf610m4-cosmic.dtb \
vf610-twr.dtb \
vf610-zii-dev-rev-b.dtb \
-   vf610-zii-dev-rev-c.dtb
+   vf610-zii-dev-rev-c.dtb \
+   vf610-zii-ssmb-spu3.dtb
 dtb-$(CONFIG_ARCH_MXS) += \
imx23-evk.dtb \
imx23-olinuxino.dtb \
diff --git a/arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts 
b/arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts
new file mode 100644
index ..b692117d7839
--- /dev/null
+++ b/arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+
+/*
+ * Device tree file for ZII's SSMB SPU3 board
+ *
+ * SSMB - SPU3 Switch Management Board
+ * SPU - Seat Power Unit
+ *
+ * Copyright (C) 2015, 2016 Zodiac Inflight Innovations
+ *
+ * Based on an original 'vf610-twr.dts' which is Copyright 2015,
+ * Freescale Semiconductor, Inc.
+ */
+
+/dts-v1/;
+#include "vf610.dtsi"
+
+/ {
+   model = "ZII VF610 SSMB SPU3 Board";
+   compatible = "zii,vf610spu3", "zii,vf610dev", "fsl,vf610";
+
+   chosen {
+   stdout-path = 
+   };
+
+   memory {
+   reg = <0x8000 0x2000>;
+   };
+
+   gpio-leds {
+   compatible = "gpio-leds";
+   pinctrl-0 = <_leds_debug>;
+   pinctrl-names = "default";
+
+   led-debug {
+   label = "zii:green:debug1";
+   gpios = < 18 GPIO_ACTIVE_HIGH>;
+   linux,default-trigger = "heartbeat";
+   max-brightness = <1>;
+   };
+   };
+
+   reg_vcc_3v3_mcu: regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "vcc_3v3_mcu";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   };
+};
+
+ {
+   vref-supply = <_vcc_3v3_mcu>;
+   status = "okay";
+};
+
+ {
+   vref-supply = <_vcc_3v3_mcu>;
+   status = "okay";
+};
+
+ {
+   bus-num = <1>;
+   pinctrl-names = "default";
+   pinctrl-0 = <_dspi1>;
+   /*
+* Some SPU3s come with SPI-NOR chip DNPed, so we leave this
+* node disabled by default and rely on bootloader to enable
+* it when appropriate.
+*/
+   status = "disabled";
+
+   m25p128@0 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "m25p128", "jedec,spi-nor";
+   reg = <0>;
+   spi-max-frequency = <5000>;
+
+   partition@0 {
+   label = "m25p128-0";
+   reg = <0x0 0x0100>;
+   };
+   };
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_esdhc0>;
+   bus-width = <8>;
+   non-removable;
+   no-1-8-v;
+   keep-power-in-suspend;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_esdhc1>;
+   bus-width = <4>;
+   status = "okay";
+};
+
+ {
+   phy-mode = "rmii";
+   pinctrl-names = "default";
+   pinctrl-0 = <_fec1>;
+   status = "okay";
+
+   fixed-link {
+   speed = <100>;
+   full-duplex;
+   };
+
+   mdio1: mdio {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "okay";
+
+   switch0: switch0@0 {
+   compatible = "marvell,mv88e6190";
+   pinctrl-0 = <_gpio_switch0>;
+   pinctrl-names = "default";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0>;
+   eeprom-length = <65536>;
+   reset-gpios = < 11 GPIO_ACTIVE_LOW>;
+   interrupt-parent = <>;
+   interrupts = <2 IRQ_TYPE_LEVEL_LOW>;
+   interrupt-controller;
+   #interrupt-cells = <2>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+

[PATCH] ARM: dts: vf610: Add ZII SSMB SPU3 board

2018-07-16 Thread Andrey Smirnov

Add support for Zodiac Inflight Innovations SSMB SPU3
board (VF610-based).

Cc: Shawn Guo 
Cc: Fabio Estevam 
Cc: cphe...@gmail.com
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andrew Lunn 
Signed-off-by: Andrey Smirnov 
---
 arch/arm/boot/dts/Makefile|   3 +-
 arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts | 343 ++
 2 files changed, 345 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index bea41b129493..e331b2c16539 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -570,7 +570,8 @@ dtb-$(CONFIG_SOC_VF610) += \
vf610m4-cosmic.dtb \
vf610-twr.dtb \
vf610-zii-dev-rev-b.dtb \
-   vf610-zii-dev-rev-c.dtb
+   vf610-zii-dev-rev-c.dtb \
+   vf610-zii-ssmb-spu3.dtb
 dtb-$(CONFIG_ARCH_MXS) += \
imx23-evk.dtb \
imx23-olinuxino.dtb \
diff --git a/arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts 
b/arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts
new file mode 100644
index ..b692117d7839
--- /dev/null
+++ b/arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+
+/*
+ * Device tree file for ZII's SSMB SPU3 board
+ *
+ * SSMB - SPU3 Switch Management Board
+ * SPU - Seat Power Unit
+ *
+ * Copyright (C) 2015, 2016 Zodiac Inflight Innovations
+ *
+ * Based on an original 'vf610-twr.dts' which is Copyright 2015,
+ * Freescale Semiconductor, Inc.
+ */
+
+/dts-v1/;
+#include "vf610.dtsi"
+
+/ {
+   model = "ZII VF610 SSMB SPU3 Board";
+   compatible = "zii,vf610spu3", "zii,vf610dev", "fsl,vf610";
+
+   chosen {
+   stdout-path = 
+   };
+
+   memory {
+   reg = <0x8000 0x2000>;
+   };
+
+   gpio-leds {
+   compatible = "gpio-leds";
+   pinctrl-0 = <_leds_debug>;
+   pinctrl-names = "default";
+
+   led-debug {
+   label = "zii:green:debug1";
+   gpios = < 18 GPIO_ACTIVE_HIGH>;
+   linux,default-trigger = "heartbeat";
+   max-brightness = <1>;
+   };
+   };
+
+   reg_vcc_3v3_mcu: regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "vcc_3v3_mcu";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+   };
+};
+
+ {
+   vref-supply = <_vcc_3v3_mcu>;
+   status = "okay";
+};
+
+ {
+   vref-supply = <_vcc_3v3_mcu>;
+   status = "okay";
+};
+
+ {
+   bus-num = <1>;
+   pinctrl-names = "default";
+   pinctrl-0 = <_dspi1>;
+   /*
+* Some SPU3s come with SPI-NOR chip DNPed, so we leave this
+* node disabled by default and rely on bootloader to enable
+* it when appropriate.
+*/
+   status = "disabled";
+
+   m25p128@0 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "m25p128", "jedec,spi-nor";
+   reg = <0>;
+   spi-max-frequency = <5000>;
+
+   partition@0 {
+   label = "m25p128-0";
+   reg = <0x0 0x0100>;
+   };
+   };
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_esdhc0>;
+   bus-width = <8>;
+   non-removable;
+   no-1-8-v;
+   keep-power-in-suspend;
+   status = "okay";
+};
+
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_esdhc1>;
+   bus-width = <4>;
+   status = "okay";
+};
+
+ {
+   phy-mode = "rmii";
+   pinctrl-names = "default";
+   pinctrl-0 = <_fec1>;
+   status = "okay";
+
+   fixed-link {
+   speed = <100>;
+   full-duplex;
+   };
+
+   mdio1: mdio {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "okay";
+
+   switch0: switch0@0 {
+   compatible = "marvell,mv88e6190";
+   pinctrl-0 = <_gpio_switch0>;
+   pinctrl-names = "default";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   reg = <0>;
+   eeprom-length = <65536>;
+   reset-gpios = < 11 GPIO_ACTIVE_LOW>;
+   interrupt-parent = <>;
+   interrupts = <2 IRQ_TYPE_LEVEL_LOW>;
+   interrupt-controller;
+   #interrupt-cells = <2>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+

Re: [patch v3 -mm 3/6] mm, memcg: add hierarchical usage oom policy

2018-07-16 Thread David Rientjes

On Mon, 16 Jul 2018, Roman Gushchin wrote:

> Hello, David!
> 
> I think that there is an inconsistency in the memory.oom_policy definition.
> "none" and "cgroup" policies defining how the OOM scoped to this particular
> memory cgroup (or system, if set on root) is handled. And all sub-tree
> settings do not matter at all, right? Also, if a memory cgroup has no
> memory.max set, there is no meaning in setting memory.oom_policy.
> 

Hi Roman,

The effective oom policy is based on the mem cgroup that is oom.  That can 
occur when memory.max is set, yes.

If a mem cgroup does not become oom itself, its oom policy doesn't do 
anything until, well, it's oom :)

> And "tree" is different. It actually changes how the selection algorithm 
> works,
> and sub-tree settings do matter in this case.
> 

"Tree" is considering the entity as a single indivisible memory consumer, 
it is compared with siblings based on its hierarhical usage.  It has 
cgroup oom policy.

It would be possible to separate this out, if you'd prefer, to account 
an intermediate cgroup as the largest descendant or the sum of all 
descendants.  I hadn't found a usecase for that, however, but it doesn't 
mean there isn't one.  If you'd like, I can introduce another tunable.

Re: [patch v3 -mm 3/6] mm, memcg: add hierarchical usage oom policy

2018-07-16 Thread David Rientjes

On Mon, 16 Jul 2018, Roman Gushchin wrote:

> Hello, David!
> 
> I think that there is an inconsistency in the memory.oom_policy definition.
> "none" and "cgroup" policies defining how the OOM scoped to this particular
> memory cgroup (or system, if set on root) is handled. And all sub-tree
> settings do not matter at all, right? Also, if a memory cgroup has no
> memory.max set, there is no meaning in setting memory.oom_policy.
> 

Hi Roman,

The effective oom policy is based on the mem cgroup that is oom.  That can 
occur when memory.max is set, yes.

If a mem cgroup does not become oom itself, its oom policy doesn't do 
anything until, well, it's oom :)

> And "tree" is different. It actually changes how the selection algorithm 
> works,
> and sub-tree settings do matter in this case.
> 

"Tree" is considering the entity as a single indivisible memory consumer, 
it is compared with siblings based on its hierarhical usage.  It has 
cgroup oom policy.

It would be possible to separate this out, if you'd prefer, to account 
an intermediate cgroup as the largest descendant or the sum of all 
descendants.  I hadn't found a usecase for that, however, but it doesn't 
mean there isn't one.  If you'd like, I can introduce another tunable.

Re: [RFC PATCH rcu] rcutorture: err_segs_recorded can be static

2018-07-16 Thread Paul E. McKenney

On Tue, Jul 17, 2018 at 09:44:52AM +0800, kbuild test robot wrote:
> 
> Fixes: 07809ecaa895 ("rcutorture: Dump reader protection sequence if failures 
> or close calls")
> Signed-off-by: kbuild test robot 

Good catch!  I have folded this in with attribution.

Thanx, Paul

> ---
>  rcutorture.c |6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index ef85bde..a7a947e 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -209,9 +209,9 @@ struct rt_read_seg {
>   unsigned long rt_delay_us;
>   bool rt_preempted;
>  };
> -int err_segs_recorded;
> -struct rt_read_seg err_segs[RCUTORTURE_RDR_MAX_SEGS];
> -int rt_read_nsegs;
> +static int err_segs_recorded;
> +static struct rt_read_seg err_segs[RCUTORTURE_RDR_MAX_SEGS];
> +static int rt_read_nsegs;
> 
>  static const char *rcu_torture_writer_state_getname(void)
>  {
>

Re: [RFC PATCH rcu] rcutorture: err_segs_recorded can be static

2018-07-16 Thread Paul E. McKenney

On Tue, Jul 17, 2018 at 09:44:52AM +0800, kbuild test robot wrote:
> 
> Fixes: 07809ecaa895 ("rcutorture: Dump reader protection sequence if failures 
> or close calls")
> Signed-off-by: kbuild test robot 

Good catch!  I have folded this in with attribution.

Thanx, Paul

> ---
>  rcutorture.c |6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index ef85bde..a7a947e 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -209,9 +209,9 @@ struct rt_read_seg {
>   unsigned long rt_delay_us;
>   bool rt_preempted;
>  };
> -int err_segs_recorded;
> -struct rt_read_seg err_segs[RCUTORTURE_RDR_MAX_SEGS];
> -int rt_read_nsegs;
> +static int err_segs_recorded;
> +static struct rt_read_seg err_segs[RCUTORTURE_RDR_MAX_SEGS];
> +static int rt_read_nsegs;
> 
>  static const char *rcu_torture_writer_state_getname(void)
>  {
>

Re: [PATCH v13 2/2] Add oom victim's memcg to the oom context information

2018-07-16 Thread David Rientjes

On Sun, 15 Jul 2018, 禹舟键 wrote:

> Hi David
> Could I use use plain old %d? Just like this,
> pr_cont(",task=%s,pid=%d,uid=%d\n", p->comm, p->pid,
> from_kuid(_user_ns, task_uid(p)));
> 

Yes please!

Re: [PATCH v13 2/2] Add oom victim's memcg to the oom context information

2018-07-16 Thread David Rientjes

On Sun, 15 Jul 2018, 禹舟键 wrote:

> Hi David
> Could I use use plain old %d? Just like this,
> pr_cont(",task=%s,pid=%d,uid=%d\n", p->comm, p->pid,
> from_kuid(_user_ns, task_uid(p)));
> 

Yes please!

Re: [PATCH 1/2] ARM: mxs_defconfig: use MXSFB DRM driver

2018-07-16 Thread Shawn Guo

On Thu, Jul 12, 2018 at 12:59:41PM +0200, Stefan Agner wrote:
> Use the the DRM driver for MXSFB LCD controller (used in i.MX23/
> i.MX28/i.MX6SX or i.MX7). Remove CONFIG_FB_MXS which will soon be
> removed.
> 
> Note that this does not remove CONFIG_FB. CONFIG_FB gets selected
> implicity by CONFIG_DRM/CONFIG_DRM_KMS_FB_HELPER.
> 
> Signed-off-by: Stefan Agner 

Applied both, thanks.

Re: [PATCH 1/2] ARM: mxs_defconfig: use MXSFB DRM driver

2018-07-16 Thread Shawn Guo

On Thu, Jul 12, 2018 at 12:59:41PM +0200, Stefan Agner wrote:
> Use the the DRM driver for MXSFB LCD controller (used in i.MX23/
> i.MX28/i.MX6SX or i.MX7). Remove CONFIG_FB_MXS which will soon be
> removed.
> 
> Note that this does not remove CONFIG_FB. CONFIG_FB gets selected
> implicity by CONFIG_DRM/CONFIG_DRM_KMS_FB_HELPER.
> 
> Signed-off-by: Stefan Agner 

Applied both, thanks.

Re: [PATCH 1/2] ata: libahci: Correct setting of DEVSLP register

2018-07-16 Thread Srinivas Pandruvada

On Tue, 2018-07-17 at 11:26 +0800, AceLan Kao wrote:
> Tested-by: AceLan Kao 

Thanks Kao for the test.

-Srinivas

> 
> The patches help the power consumption a little bit on my test
> system,
> and no obvious issue I can observe. 
> 
> 2018-07-03 3:01 GMT+08:00 Srinivas Pandruvada  nux.intel.com>:
> > We have seen that on some platforms, SATA device never show any
> > DEVSLP
> > residency. This prevent power gating of SATA IP, which prevent
> > system
> > to transition to low power mode in systems with SLP_S0 aka modern
> > standby systems. The PHY logic is off only in DEVSLP not in
> > slumber.
> > Reference:
> > https://www.intel.com/content/dam/www/public/us/en/documents/datash
> > eets
> > /332995-skylake-i-o-platform-datasheet-volume-1.pdf
> > Section 28.7.6.1
> > 
> > Here driver is trying to do read-modify-write the devslp register.
> > But
> > not resetting the bits for which this driver will modify values
> > (DITO,
> > MDAT and DETO). So simply reset those bits before updating to new
> > values.
> > 
> > Signed-off-by: Srinivas Pandruvada  > .com>
> > Reviewed-by: Rafael J. Wysocki 
> > ---
> >  drivers/ata/libahci.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> > index 965842a08743..f6795d261869 100644
> > --- a/drivers/ata/libahci.c
> > +++ b/drivers/ata/libahci.c
> > @@ -2159,6 +2159,8 @@ static void ahci_set_aggressive_devslp(struct
> > ata_port *ap, bool sleep)
> > deto = 20;
> > }
> > 
> > +   /* Make dito, mdat, deto bits to 0s */
> > +   devslp &= ~GENMASK_ULL(24, 2);
> > devslp |= ((dito << PORT_DEVSLP_DITO_OFFSET) |
> >(mdat << PORT_DEVSLP_MDAT_OFFSET) |
> >(deto << PORT_DEVSLP_DETO_OFFSET) |
> > -- 
> > 2.17.1
> > 
> 
>

Re: [PATCH 1/2] ata: libahci: Correct setting of DEVSLP register

2018-07-16 Thread Srinivas Pandruvada

On Tue, 2018-07-17 at 11:26 +0800, AceLan Kao wrote:
> Tested-by: AceLan Kao 

Thanks Kao for the test.

-Srinivas

> 
> The patches help the power consumption a little bit on my test
> system,
> and no obvious issue I can observe. 
> 
> 2018-07-03 3:01 GMT+08:00 Srinivas Pandruvada  nux.intel.com>:
> > We have seen that on some platforms, SATA device never show any
> > DEVSLP
> > residency. This prevent power gating of SATA IP, which prevent
> > system
> > to transition to low power mode in systems with SLP_S0 aka modern
> > standby systems. The PHY logic is off only in DEVSLP not in
> > slumber.
> > Reference:
> > https://www.intel.com/content/dam/www/public/us/en/documents/datash
> > eets
> > /332995-skylake-i-o-platform-datasheet-volume-1.pdf
> > Section 28.7.6.1
> > 
> > Here driver is trying to do read-modify-write the devslp register.
> > But
> > not resetting the bits for which this driver will modify values
> > (DITO,
> > MDAT and DETO). So simply reset those bits before updating to new
> > values.
> > 
> > Signed-off-by: Srinivas Pandruvada  > .com>
> > Reviewed-by: Rafael J. Wysocki 
> > ---
> >  drivers/ata/libahci.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
> > index 965842a08743..f6795d261869 100644
> > --- a/drivers/ata/libahci.c
> > +++ b/drivers/ata/libahci.c
> > @@ -2159,6 +2159,8 @@ static void ahci_set_aggressive_devslp(struct
> > ata_port *ap, bool sleep)
> > deto = 20;
> > }
> > 
> > +   /* Make dito, mdat, deto bits to 0s */
> > +   devslp &= ~GENMASK_ULL(24, 2);
> > devslp |= ((dito << PORT_DEVSLP_DITO_OFFSET) |
> >(mdat << PORT_DEVSLP_MDAT_OFFSET) |
> >(deto << PORT_DEVSLP_DETO_OFFSET) |
> > -- 
> > 2.17.1
> > 
> 
>

Re: [PATCH v2 0/2] Add support for the ConnectCore 6UL SOM and SBC Express

2018-07-16 Thread Shawn Guo

On Thu, Jul 12, 2018 at 10:20:34AM +0200, Alex Gonzalez wrote:
> Alex Gonzalez (2):
>   ARM: dts: imx6ul: Add DTS for ConnectCore 6UL System-On-Module (SOM)
>   ARM: dts: imx6ul: Add DTS for ConnectCore 6UL SBC Express

Applied both, thanks.

Re: [PATCH v2 0/2] Add support for the ConnectCore 6UL SOM and SBC Express

2018-07-16 Thread Shawn Guo

On Thu, Jul 12, 2018 at 10:20:34AM +0200, Alex Gonzalez wrote:
> Alex Gonzalez (2):
>   ARM: dts: imx6ul: Add DTS for ConnectCore 6UL System-On-Module (SOM)
>   ARM: dts: imx6ul: Add DTS for ConnectCore 6UL SBC Express

Applied both, thanks.

[PATCH V2] soc: imx: gpc: restrict register range for regmap access

2018-07-16 Thread Anson Huang

GPC registers are NOT continuous, some registers are
reserved and accessing them from userspace will trigger
external abort, add regmap register access table to
avoid below abort:

root@imx6slevk:~# cat /sys/kernel/debug/regmap/20dc000.gpc/registers
[  108.480477] Unhandled fault: imprecise external abort (0x1406) at 0xb6db5004
[  108.487985] pgd = 42b54bfd
[  108.490741] [b6db5004] *pgd=ba1b7831
[  108.494386] Internal error: : 1406 [#1] SMP ARM
[  108.498943] Modules linked in:
[  108.502043] CPU: 0 PID: 389 Comm: cat Not tainted 
4.18.0-rc1-00074-gc9f1f60-dirty #482
[  108.509982] Hardware name: Freescale i.MX6 SoloLite (Device Tree)
[  108.516123] PC is at regmap_mmio_read32le+0x20/0x24
[  108.521031] LR is at regmap_mmio_read+0x40/0x60
[  108.525586] pc : []lr : []psr: 20060093
[  108.531875] sp : eccf1d98  ip : eccf1da8  fp : eccf1da4
[  108.537122] r10: ec2d3800  r9 : eccf1f60  r8 : ecfc
[  108.542370] r7 : eccf1e2c  r6 : eccf1e2c  r5 : 0028  r4 : ec338e00
[  108.548920] r3 :   r2 : eccf1e2c  r1 : f0980028  r0 : 
[  108.555474] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment 
none
[  108.562720] Control: 10c5387d  Table: acf4004a  DAC: 0051
[  108.568491] Process cat (pid: 389, stack limit = 0xd4318a65)
[  108.574174] Stack: (0xeccf1d98 to 0xeccf2000)

Fixes: 721cabf6c660 ("soc: imx: move PGC handling to a new GPC driver")
Signed-off-by: Anson Huang 
Reviewed-by: Fabio Estevam 
---
change since V1:
add fix tag.
 drivers/soc/imx/gpc.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/soc/imx/gpc.c b/drivers/soc/imx/gpc.c
index 32f0748..0097a93 100644
--- a/drivers/soc/imx/gpc.c
+++ b/drivers/soc/imx/gpc.c
@@ -27,9 +27,16 @@
 #define GPC_PGC_SW2ISO_SHIFT   0x8
 #define GPC_PGC_SW_SHIFT   0x0
 
+#define GPC_PGC_PCI_PDN0x200
+#define GPC_PGC_PCI_SR 0x20c
+
 #define GPC_PGC_GPU_PDN0x260
 #define GPC_PGC_GPU_PUPSCR 0x264
 #define GPC_PGC_GPU_PDNSCR 0x268
+#define GPC_PGC_GPU_SR 0x26c
+
+#define GPC_PGC_DISP_PDN   0x240
+#define GPC_PGC_DISP_SR0x24c
 
 #define GPU_VPU_PUP_REQBIT(1)
 #define GPU_VPU_PDN_REQBIT(0)
@@ -318,10 +325,24 @@ static const struct of_device_id imx_gpc_dt_ids[] = {
{ }
 };
 
+static const struct regmap_range yes_ranges[] = {
+   regmap_reg_range(GPC_CNTR, GPC_CNTR),
+   regmap_reg_range(GPC_PGC_PCI_PDN, GPC_PGC_PCI_SR),
+   regmap_reg_range(GPC_PGC_GPU_PDN, GPC_PGC_GPU_SR),
+   regmap_reg_range(GPC_PGC_DISP_PDN, GPC_PGC_DISP_SR),
+};
+
+static const struct regmap_access_table access_table = {
+   .yes_ranges = yes_ranges,
+   .n_yes_ranges   = ARRAY_SIZE(yes_ranges),
+};
+
 static const struct regmap_config imx_gpc_regmap_config = {
.reg_bits = 32,
.val_bits = 32,
.reg_stride = 4,
+   .rd_table = _table,
+   .wr_table = _table,
.max_register = 0x2ac,
 };
 
-- 
2.7.4

[PATCH V2] soc: imx: gpc: restrict register range for regmap access

2018-07-16 Thread Anson Huang

GPC registers are NOT continuous, some registers are
reserved and accessing them from userspace will trigger
external abort, add regmap register access table to
avoid below abort:

root@imx6slevk:~# cat /sys/kernel/debug/regmap/20dc000.gpc/registers
[  108.480477] Unhandled fault: imprecise external abort (0x1406) at 0xb6db5004
[  108.487985] pgd = 42b54bfd
[  108.490741] [b6db5004] *pgd=ba1b7831
[  108.494386] Internal error: : 1406 [#1] SMP ARM
[  108.498943] Modules linked in:
[  108.502043] CPU: 0 PID: 389 Comm: cat Not tainted 
4.18.0-rc1-00074-gc9f1f60-dirty #482
[  108.509982] Hardware name: Freescale i.MX6 SoloLite (Device Tree)
[  108.516123] PC is at regmap_mmio_read32le+0x20/0x24
[  108.521031] LR is at regmap_mmio_read+0x40/0x60
[  108.525586] pc : []lr : []psr: 20060093
[  108.531875] sp : eccf1d98  ip : eccf1da8  fp : eccf1da4
[  108.537122] r10: ec2d3800  r9 : eccf1f60  r8 : ecfc
[  108.542370] r7 : eccf1e2c  r6 : eccf1e2c  r5 : 0028  r4 : ec338e00
[  108.548920] r3 :   r2 : eccf1e2c  r1 : f0980028  r0 : 
[  108.555474] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment 
none
[  108.562720] Control: 10c5387d  Table: acf4004a  DAC: 0051
[  108.568491] Process cat (pid: 389, stack limit = 0xd4318a65)
[  108.574174] Stack: (0xeccf1d98 to 0xeccf2000)

Fixes: 721cabf6c660 ("soc: imx: move PGC handling to a new GPC driver")
Signed-off-by: Anson Huang 
Reviewed-by: Fabio Estevam 
---
change since V1:
add fix tag.
 drivers/soc/imx/gpc.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/soc/imx/gpc.c b/drivers/soc/imx/gpc.c
index 32f0748..0097a93 100644
--- a/drivers/soc/imx/gpc.c
+++ b/drivers/soc/imx/gpc.c
@@ -27,9 +27,16 @@
 #define GPC_PGC_SW2ISO_SHIFT   0x8
 #define GPC_PGC_SW_SHIFT   0x0
 
+#define GPC_PGC_PCI_PDN0x200
+#define GPC_PGC_PCI_SR 0x20c
+
 #define GPC_PGC_GPU_PDN0x260
 #define GPC_PGC_GPU_PUPSCR 0x264
 #define GPC_PGC_GPU_PDNSCR 0x268
+#define GPC_PGC_GPU_SR 0x26c
+
+#define GPC_PGC_DISP_PDN   0x240
+#define GPC_PGC_DISP_SR0x24c
 
 #define GPU_VPU_PUP_REQBIT(1)
 #define GPU_VPU_PDN_REQBIT(0)
@@ -318,10 +325,24 @@ static const struct of_device_id imx_gpc_dt_ids[] = {
{ }
 };
 
+static const struct regmap_range yes_ranges[] = {
+   regmap_reg_range(GPC_CNTR, GPC_CNTR),
+   regmap_reg_range(GPC_PGC_PCI_PDN, GPC_PGC_PCI_SR),
+   regmap_reg_range(GPC_PGC_GPU_PDN, GPC_PGC_GPU_SR),
+   regmap_reg_range(GPC_PGC_DISP_PDN, GPC_PGC_DISP_SR),
+};
+
+static const struct regmap_access_table access_table = {
+   .yes_ranges = yes_ranges,
+   .n_yes_ranges   = ARRAY_SIZE(yes_ranges),
+};
+
 static const struct regmap_config imx_gpc_regmap_config = {
.reg_bits = 32,
.val_bits = 32,
.reg_stride = 4,
+   .rd_table = _table,
+   .wr_table = _table,
.max_register = 0x2ac,
 };
 
-- 
2.7.4

Re: [PATCH v2] tg: show the sum wait time of an task group

2018-07-16 Thread 王贇


Hi, folks

On 2018/7/4 上午11:27, 王贇 wrote:

Although we can rely on cpuacct to present the cpu usage of task
group, it is hard to tell how intense the competition is between
these groups on cpu resources.

Monitoring the wait time of each process or sched_debug could cost
too much, and there is no good way to accurately represent the
conflict with these info, we need the wait time on group dimension.

Thus we introduced group's wait_sum represent the conflict between
task groups, which is simply sum the wait time of group's cfs_rq.

The 'cpu.stat' is modified to show the statistic, like:

   nr_periods 0
   nr_throttled 0
   throttled_time 0
   wait_sum 2035098795584

Now we can monitor the changing on wait_sum to tell how suffering
a task group is in the fight of cpu resources.

For example:
   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%

means the task group paid X percentage of period on waiting
for the cpu.


Any comments please?

Regards,
Michael Wang




Signed-off-by: Michael Wang 
---

Since v1:
   Use schedstat_val to avoid compile error
   Check and skip root_task_group

  kernel/sched/core.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8fac..80ab995 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6781,6 +6781,8 @@ static int __cfs_schedulable(struct task_group *tg, u64 
period, u64 quota)

  static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
  {
+    int i;
+    u64 ws = 0;
  struct task_group *tg = css_tg(seq_css(sf));
  struct cfs_bandwidth *cfs_b = >cfs_bandwidth;

@@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void 
*v)
  seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
  seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);

+    if (schedstat_enabled() && tg != _task_group) {
+    for_each_possible_cpu(i)
+    ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+    seq_printf(sf, "wait_sum %llu\n", ws);
+    }
+
  return 0;
  }
  #endif /* CONFIG_CFS_BANDWIDTH */

Re: [PATCH v2] tg: show the sum wait time of an task group

2018-07-16 Thread 王贇


Hi, folks

On 2018/7/4 上午11:27, 王贇 wrote:

Although we can rely on cpuacct to present the cpu usage of task
group, it is hard to tell how intense the competition is between
these groups on cpu resources.

Monitoring the wait time of each process or sched_debug could cost
too much, and there is no good way to accurately represent the
conflict with these info, we need the wait time on group dimension.

Thus we introduced group's wait_sum represent the conflict between
task groups, which is simply sum the wait time of group's cfs_rq.

The 'cpu.stat' is modified to show the statistic, like:

   nr_periods 0
   nr_throttled 0
   throttled_time 0
   wait_sum 2035098795584

Now we can monitor the changing on wait_sum to tell how suffering
a task group is in the fight of cpu resources.

For example:
   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%

means the task group paid X percentage of period on waiting
for the cpu.


Any comments please?

Regards,
Michael Wang




Signed-off-by: Michael Wang 
---

Since v1:
   Use schedstat_val to avoid compile error
   Check and skip root_task_group

  kernel/sched/core.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8fac..80ab995 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6781,6 +6781,8 @@ static int __cfs_schedulable(struct task_group *tg, u64 
period, u64 quota)

  static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
  {
+    int i;
+    u64 ws = 0;
  struct task_group *tg = css_tg(seq_css(sf));
  struct cfs_bandwidth *cfs_b = >cfs_bandwidth;

@@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void 
*v)
  seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
  seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);

+    if (schedstat_enabled() && tg != _task_group) {
+    for_each_possible_cpu(i)
+    ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+    seq_printf(sf, "wait_sum %llu\n", ws);
+    }
+
  return 0;
  }
  #endif /* CONFIG_CFS_BANDWIDTH */

[PATCH 0/4] spi: introduce SPI_CS_WORD mode flag

2018-07-16 Thread David Lechner

This series introduces a new SPI mode flag, SPI_CS_WORD, that indicates that
the chip select line should be toggled after each word sent. This series
includes examples of how this can be implemented for both an SPI controller
and an SPI device.

The motivation here is to take advantage of DMA transfers with an analog/digital
convert chip. This chip requires that the chip select line be toggled after
each word send in order to drive the internal circuitry of the chip.

The way we were accomplishing this was, for example, to read 16 channels, we
would create 16 SPI _transfer_s in one SPI _message_. Although this works, it
uses quite a bit of CPU because we have to do work at the end of each transfer
to get the next transfer ready. When you are polling the chip at 100Hz, this
CPU usage adds up.

The SPI controller being used has DMA support, but only on the _transfer_ level
and not on the _message_ level. So, to take advantage of DMA, we need to read
all of the A/DC channels in a single _transfer_. The SPI controller is capable
automatically toggling the chip select line during a DMA transfer, so we are
adding a new SPI flag in order to take advantage of this.

Dependency: the iio patch applies on top of "iio: adc: ti-ads7950: allow
simultaneous use of buffer and direct mode"[1]

[1]: https://lore.kernel.org/lkml/20180716233550.6449-1-da...@lechnology.com/

David Lechner (4):
  spi: spi-bitbang: change flags from u8 to u16
  spi: add new SPI_CS_WORD flag
  spi: spi-davinci: Add support for SPI_CS_WORD
  iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage

 drivers/iio/adc/ti-ads7950.c| 53 +++--
 drivers/spi/spi-davinci.c   | 11 +--
 include/linux/spi/spi.h |  2 +-
 include/linux/spi/spi_bitbang.h |  2 +-
 4 files changed, 41 insertions(+), 27 deletions(-)

-- 
2.17.1

[PATCH 0/4] spi: introduce SPI_CS_WORD mode flag

2018-07-16 Thread David Lechner

This series introduces a new SPI mode flag, SPI_CS_WORD, that indicates that
the chip select line should be toggled after each word sent. This series
includes examples of how this can be implemented for both an SPI controller
and an SPI device.

The motivation here is to take advantage of DMA transfers with an analog/digital
convert chip. This chip requires that the chip select line be toggled after
each word send in order to drive the internal circuitry of the chip.

The way we were accomplishing this was, for example, to read 16 channels, we
would create 16 SPI _transfer_s in one SPI _message_. Although this works, it
uses quite a bit of CPU because we have to do work at the end of each transfer
to get the next transfer ready. When you are polling the chip at 100Hz, this
CPU usage adds up.

The SPI controller being used has DMA support, but only on the _transfer_ level
and not on the _message_ level. So, to take advantage of DMA, we need to read
all of the A/DC channels in a single _transfer_. The SPI controller is capable
automatically toggling the chip select line during a DMA transfer, so we are
adding a new SPI flag in order to take advantage of this.

Dependency: the iio patch applies on top of "iio: adc: ti-ads7950: allow
simultaneous use of buffer and direct mode"[1]

[1]: https://lore.kernel.org/lkml/20180716233550.6449-1-da...@lechnology.com/

David Lechner (4):
  spi: spi-bitbang: change flags from u8 to u16
  spi: add new SPI_CS_WORD flag
  spi: spi-davinci: Add support for SPI_CS_WORD
  iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage

 drivers/iio/adc/ti-ads7950.c| 53 +++--
 drivers/spi/spi-davinci.c   | 11 +--
 include/linux/spi/spi.h |  2 +-
 include/linux/spi/spi_bitbang.h |  2 +-
 4 files changed, 41 insertions(+), 27 deletions(-)

-- 
2.17.1

[PATCH 1/4] spi: spi-bitbang: change flags from u8 to u16

2018-07-16 Thread David Lechner

This changes the data type of the flags field in struct spi_bitbang from
u8 to u16. This matches the size of the mode field of struct spi_device
where these flags are also used.

This is done in preparation of adding a new SPI mode flag that will be
used with this field that would otherwise not fit in 8 bits.

Signed-off-by: David Lechner 
---
 include/linux/spi/spi_bitbang.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/spi/spi_bitbang.h b/include/linux/spi/spi_bitbang.h
index 51d8c060e513..c1e61641a7f1 100644
--- a/include/linux/spi/spi_bitbang.h
+++ b/include/linux/spi/spi_bitbang.h
@@ -8,7 +8,7 @@ struct spi_bitbang {
struct mutexlock;
u8  busy;
u8  use_dma;
-   u8  flags;  /* extra spi->mode support */
+   u16 flags;  /* extra spi->mode support */
 
struct spi_master   *master;
 
-- 
2.17.1

[PATCH 1/4] spi: spi-bitbang: change flags from u8 to u16

2018-07-16 Thread David Lechner

This changes the data type of the flags field in struct spi_bitbang from
u8 to u16. This matches the size of the mode field of struct spi_device
where these flags are also used.

This is done in preparation of adding a new SPI mode flag that will be
used with this field that would otherwise not fit in 8 bits.

Signed-off-by: David Lechner 
---
 include/linux/spi/spi_bitbang.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/spi/spi_bitbang.h b/include/linux/spi/spi_bitbang.h
index 51d8c060e513..c1e61641a7f1 100644
--- a/include/linux/spi/spi_bitbang.h
+++ b/include/linux/spi/spi_bitbang.h
@@ -8,7 +8,7 @@ struct spi_bitbang {
struct mutexlock;
u8  busy;
u8  use_dma;
-   u8  flags;  /* extra spi->mode support */
+   u16 flags;  /* extra spi->mode support */
 
struct spi_master   *master;
 
-- 
2.17.1

[PATCH 4/4] iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage

2018-07-16 Thread David Lechner

This changes how the SPI message for the triggered buffer is setup in
the TI ADS7950 A/DC driver. By using the SPI_CS_WORD flag, we can read
multiple samples in a single SPI transfer. If the SPI controller
supports DMA transfers, we can see a significant reduction in CPU usage.

For example, on an ARM9 system running at 456MHz reading just 4 channels
at 100Hz: before this change, top shows the CPU usage of the IRQ thread
of this driver to be ~7.7%. After this change, the CPU usage drops to
~3.8%.

Signed-off-by: David Lechner 
---

Dependency: this patch applies on top of "iio: adc: ti-ads7950: allow
simultaneous use of buffer and direct mode"[1]

[1]: https://lore.kernel.org/lkml/20180716233550.6449-1-da...@lechnology.com/


 drivers/iio/adc/ti-ads7950.c | 53 +---
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/drivers/iio/adc/ti-ads7950.c b/drivers/iio/adc/ti-ads7950.c
index ba7e5a027490..60de4cbbd5fc 100644
--- a/drivers/iio/adc/ti-ads7950.c
+++ b/drivers/iio/adc/ti-ads7950.c
@@ -60,7 +60,7 @@
 struct ti_ads7950_state {
struct iio_dev  *indio_dev;
struct spi_device   *spi;
-   struct spi_transfer ring_xfer[TI_ADS7950_MAX_CHAN + 2];
+   struct spi_transfer ring_xfer;
struct spi_transfer scan_single_xfer[3];
struct spi_message  ring_msg;
struct spi_message  scan_single_msg;
@@ -69,16 +69,16 @@ struct ti_ads7950_state {
unsigned intvref_mv;
 
unsigned intsettings;
-   __be16  single_tx;
-   __be16  single_rx;
+   u16 single_tx;
+   u16 single_rx;
 
/*
 * DMA (thus cache coherency maintenance) requires the
 * transfer buffers to live in their own cache lines.
 */
-   __be16  rx_buf[TI_ADS7950_MAX_CHAN + TI_ADS7950_TIMESTAMP_SIZE]
+   u16 rx_buf[TI_ADS7950_MAX_CHAN + 2 + TI_ADS7950_TIMESTAMP_SIZE]
cacheline_aligned;
-   __be16  tx_buf[TI_ADS7950_MAX_CHAN];
+   u16 tx_buf[TI_ADS7950_MAX_CHAN + 2];
 };
 
 struct ti_ads7950_chip_info {
@@ -116,7 +116,7 @@ enum ti_ads7950_id {
.realbits = bits,   \
.storagebits = 16,  \
.shift = 12 - (bits),   \
-   .endianness = IIO_BE,   \
+   .endianness = IIO_CPU,  \
},  \
 }
 
@@ -257,23 +257,14 @@ static int ti_ads7950_update_scan_mode(struct iio_dev 
*indio_dev,
len = 0;
for_each_set_bit(i, active_scan_mask, indio_dev->num_channels) {
cmd = TI_ADS7950_CR_WRITE | TI_ADS7950_CR_CHAN(i) | 
st->settings;
-   st->tx_buf[len++] = cpu_to_be16(cmd);
+   st->tx_buf[len++] = cmd;
}
 
/* Data for the 1st channel is not returned until the 3rd transfer */
-   len += 2;
-   for (i = 0; i < len; i++) {
-   if ((i + 2) < len)
-   st->ring_xfer[i].tx_buf = >tx_buf[i];
-   if (i >= 2)
-   st->ring_xfer[i].rx_buf = >rx_buf[i - 2];
-   st->ring_xfer[i].len = 2;
-   st->ring_xfer[i].cs_change = 1;
-   }
-   /* make sure last transfer's cs_change is not set */
-   st->ring_xfer[len - 1].cs_change = 0;
+   st->tx_buf[len++] = 0;
+   st->tx_buf[len++] = 0;
 
-   spi_message_init_with_transfers(>ring_msg, st->ring_xfer, len);
+   st->ring_xfer.len = len * 2;
 
return 0;
 }
@@ -289,7 +280,7 @@ static irqreturn_t ti_ads7950_trigger_handler(int irq, void 
*p)
if (ret < 0)
goto out;
 
-   iio_push_to_buffers_with_timestamp(indio_dev, st->rx_buf,
+   iio_push_to_buffers_with_timestamp(indio_dev, >rx_buf[2],
   iio_get_time_ns(indio_dev));
 
 out:
@@ -305,13 +296,13 @@ static int ti_ads7950_scan_direct(struct ti_ads7950_state 
*st, unsigned int ch)
mutex_lock(>indio_dev->mlock);
 
cmd = TI_ADS7950_CR_WRITE | TI_ADS7950_CR_CHAN(ch) | st->settings;
-   st->single_tx = cpu_to_be16(cmd);
+   st->single_tx = cmd;
 
ret = spi_sync(st->spi, >scan_single_msg);
if (ret)
goto out;
 
-   ret = be16_to_cpu(st->single_rx);
+   ret = st->single_rx;
 
 out:
mutex_unlock(>indio_dev->mlock);
@@ -385,6 +376,14 @@ static int ti_ads7950_probe(struct spi_device *spi)
const struct ti_ads7950_chip_info *info;
int ret;
 
+   spi->bits_per_word = 16;
+   spi->mode |= SPI_CS_WORD;
+   ret = spi_setup(spi);
+   if (ret < 0) {
+   dev_err(>dev, "Error in spi setup\n");
+   return ret;
+   }
+
indio_dev =

[PATCH 2/4] spi: add new SPI_CS_WORD flag

2018-07-16 Thread David Lechner

This adds a new SPI mode flag, SPI_CS_WORD, that is used to indicate
that a SPI device requires the chip select to be toggled after each
word that is transferred.

Signed-off-by: David Lechner 
---
 include/linux/spi/spi.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index a64235e05321..7cc1466111f5 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -163,6 +163,7 @@ struct spi_device {
 #defineSPI_TX_QUAD 0x200   /* transmit with 4 
wires */
 #defineSPI_RX_DUAL 0x400   /* receive with 2 wires 
*/
 #defineSPI_RX_QUAD 0x800   /* receive with 4 wires 
*/
+#define SPI_CS_WORD0x1000  /* toggle cs after each word */
int irq;
void*controller_state;
void*controller_data;
@@ -177,7 +178,6 @@ struct spi_device {
 * the controller talks to each chip, like:
 *  - memory packing (12 bit samples into low bits, others zeroed)
 *  - priority
-*  - drop chipselect after each word
 *  - chipselect delays
 *  - ...
 */
-- 
2.17.1

[PATCH 3/4] spi: spi-davinci: Add support for SPI_CS_WORD

2018-07-16 Thread David Lechner

This adds support for the SPI_CS_WORD flag to the TI DaVinci SPI
driver. This mode can be used as long as we are using the hardware
chip select and not a GPIO chip select.

Signed-off-by: David Lechner 
---
 drivers/spi/spi-davinci.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
index 577084bb911b..8e4b869b50fd 100644
--- a/drivers/spi/spi-davinci.c
+++ b/drivers/spi/spi-davinci.c
@@ -232,7 +232,8 @@ static void davinci_spi_chipselect(struct spi_device *spi, 
int value)
!(spi->mode & SPI_CS_HIGH));
} else {
if (value == BITBANG_CS_ACTIVE) {
-   spidat1 |= SPIDAT1_CSHOLD_MASK;
+   if (!(spi->mode & SPI_CS_WORD))
+   spidat1 |= SPIDAT1_CSHOLD_MASK;
spidat1 &= ~(0x1 << chip_sel);
}
}
@@ -449,8 +450,12 @@ static int davinci_spi_setup(struct spi_device *spi)
return retval;
}
 
-   if (internal_cs)
+   if (internal_cs) {
set_io_bits(dspi->base + SPIPC0, 1 << spi->chip_select);
+   } else if (spi->mode & SPI_CS_WORD) {
+   dev_err(>dev, "SPI_CS_WORD can't be use with GPIO 
CS\n");
+   return -EINVAL;
+   }
}
 
if (spi->mode & SPI_READY)
@@ -985,7 +990,7 @@ static int davinci_spi_probe(struct platform_device *pdev)
dspi->prescaler_limit = pdata->prescaler_limit;
dspi->version = pdata->version;
 
-   dspi->bitbang.flags = SPI_NO_CS | SPI_LSB_FIRST | SPI_LOOP;
+   dspi->bitbang.flags = SPI_NO_CS | SPI_LSB_FIRST | SPI_LOOP | 
SPI_CS_WORD;
if (dspi->version == SPI_VERSION_2)
dspi->bitbang.flags |= SPI_READY;
 
-- 
2.17.1

[PATCH 4/4] iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage

2018-07-16 Thread David Lechner

This changes how the SPI message for the triggered buffer is setup in
the TI ADS7950 A/DC driver. By using the SPI_CS_WORD flag, we can read
multiple samples in a single SPI transfer. If the SPI controller
supports DMA transfers, we can see a significant reduction in CPU usage.

For example, on an ARM9 system running at 456MHz reading just 4 channels
at 100Hz: before this change, top shows the CPU usage of the IRQ thread
of this driver to be ~7.7%. After this change, the CPU usage drops to
~3.8%.

Signed-off-by: David Lechner 
---

Dependency: this patch applies on top of "iio: adc: ti-ads7950: allow
simultaneous use of buffer and direct mode"[1]

[1]: https://lore.kernel.org/lkml/20180716233550.6449-1-da...@lechnology.com/


 drivers/iio/adc/ti-ads7950.c | 53 +---
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/drivers/iio/adc/ti-ads7950.c b/drivers/iio/adc/ti-ads7950.c
index ba7e5a027490..60de4cbbd5fc 100644
--- a/drivers/iio/adc/ti-ads7950.c
+++ b/drivers/iio/adc/ti-ads7950.c
@@ -60,7 +60,7 @@
 struct ti_ads7950_state {
struct iio_dev  *indio_dev;
struct spi_device   *spi;
-   struct spi_transfer ring_xfer[TI_ADS7950_MAX_CHAN + 2];
+   struct spi_transfer ring_xfer;
struct spi_transfer scan_single_xfer[3];
struct spi_message  ring_msg;
struct spi_message  scan_single_msg;
@@ -69,16 +69,16 @@ struct ti_ads7950_state {
unsigned intvref_mv;
 
unsigned intsettings;
-   __be16  single_tx;
-   __be16  single_rx;
+   u16 single_tx;
+   u16 single_rx;
 
/*
 * DMA (thus cache coherency maintenance) requires the
 * transfer buffers to live in their own cache lines.
 */
-   __be16  rx_buf[TI_ADS7950_MAX_CHAN + TI_ADS7950_TIMESTAMP_SIZE]
+   u16 rx_buf[TI_ADS7950_MAX_CHAN + 2 + TI_ADS7950_TIMESTAMP_SIZE]
cacheline_aligned;
-   __be16  tx_buf[TI_ADS7950_MAX_CHAN];
+   u16 tx_buf[TI_ADS7950_MAX_CHAN + 2];
 };
 
 struct ti_ads7950_chip_info {
@@ -116,7 +116,7 @@ enum ti_ads7950_id {
.realbits = bits,   \
.storagebits = 16,  \
.shift = 12 - (bits),   \
-   .endianness = IIO_BE,   \
+   .endianness = IIO_CPU,  \
},  \
 }
 
@@ -257,23 +257,14 @@ static int ti_ads7950_update_scan_mode(struct iio_dev 
*indio_dev,
len = 0;
for_each_set_bit(i, active_scan_mask, indio_dev->num_channels) {
cmd = TI_ADS7950_CR_WRITE | TI_ADS7950_CR_CHAN(i) | 
st->settings;
-   st->tx_buf[len++] = cpu_to_be16(cmd);
+   st->tx_buf[len++] = cmd;
}
 
/* Data for the 1st channel is not returned until the 3rd transfer */
-   len += 2;
-   for (i = 0; i < len; i++) {
-   if ((i + 2) < len)
-   st->ring_xfer[i].tx_buf = >tx_buf[i];
-   if (i >= 2)
-   st->ring_xfer[i].rx_buf = >rx_buf[i - 2];
-   st->ring_xfer[i].len = 2;
-   st->ring_xfer[i].cs_change = 1;
-   }
-   /* make sure last transfer's cs_change is not set */
-   st->ring_xfer[len - 1].cs_change = 0;
+   st->tx_buf[len++] = 0;
+   st->tx_buf[len++] = 0;
 
-   spi_message_init_with_transfers(>ring_msg, st->ring_xfer, len);
+   st->ring_xfer.len = len * 2;
 
return 0;
 }
@@ -289,7 +280,7 @@ static irqreturn_t ti_ads7950_trigger_handler(int irq, void 
*p)
if (ret < 0)
goto out;
 
-   iio_push_to_buffers_with_timestamp(indio_dev, st->rx_buf,
+   iio_push_to_buffers_with_timestamp(indio_dev, >rx_buf[2],
   iio_get_time_ns(indio_dev));
 
 out:
@@ -305,13 +296,13 @@ static int ti_ads7950_scan_direct(struct ti_ads7950_state 
*st, unsigned int ch)
mutex_lock(>indio_dev->mlock);
 
cmd = TI_ADS7950_CR_WRITE | TI_ADS7950_CR_CHAN(ch) | st->settings;
-   st->single_tx = cpu_to_be16(cmd);
+   st->single_tx = cmd;
 
ret = spi_sync(st->spi, >scan_single_msg);
if (ret)
goto out;
 
-   ret = be16_to_cpu(st->single_rx);
+   ret = st->single_rx;
 
 out:
mutex_unlock(>indio_dev->mlock);
@@ -385,6 +376,14 @@ static int ti_ads7950_probe(struct spi_device *spi)
const struct ti_ads7950_chip_info *info;
int ret;
 
+   spi->bits_per_word = 16;
+   spi->mode |= SPI_CS_WORD;
+   ret = spi_setup(spi);
+   if (ret < 0) {
+   dev_err(>dev, "Error in spi setup\n");
+   return ret;
+   }
+
indio_dev =

[PATCH 2/4] spi: add new SPI_CS_WORD flag

2018-07-16 Thread David Lechner

This adds a new SPI mode flag, SPI_CS_WORD, that is used to indicate
that a SPI device requires the chip select to be toggled after each
word that is transferred.

Signed-off-by: David Lechner 
---
 include/linux/spi/spi.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index a64235e05321..7cc1466111f5 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -163,6 +163,7 @@ struct spi_device {
 #defineSPI_TX_QUAD 0x200   /* transmit with 4 
wires */
 #defineSPI_RX_DUAL 0x400   /* receive with 2 wires 
*/
 #defineSPI_RX_QUAD 0x800   /* receive with 4 wires 
*/
+#define SPI_CS_WORD0x1000  /* toggle cs after each word */
int irq;
void*controller_state;
void*controller_data;
@@ -177,7 +178,6 @@ struct spi_device {
 * the controller talks to each chip, like:
 *  - memory packing (12 bit samples into low bits, others zeroed)
 *  - priority
-*  - drop chipselect after each word
 *  - chipselect delays
 *  - ...
 */
-- 
2.17.1

[PATCH 3/4] spi: spi-davinci: Add support for SPI_CS_WORD

2018-07-16 Thread David Lechner

This adds support for the SPI_CS_WORD flag to the TI DaVinci SPI
driver. This mode can be used as long as we are using the hardware
chip select and not a GPIO chip select.

Signed-off-by: David Lechner 
---
 drivers/spi/spi-davinci.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
index 577084bb911b..8e4b869b50fd 100644
--- a/drivers/spi/spi-davinci.c
+++ b/drivers/spi/spi-davinci.c
@@ -232,7 +232,8 @@ static void davinci_spi_chipselect(struct spi_device *spi, 
int value)
!(spi->mode & SPI_CS_HIGH));
} else {
if (value == BITBANG_CS_ACTIVE) {
-   spidat1 |= SPIDAT1_CSHOLD_MASK;
+   if (!(spi->mode & SPI_CS_WORD))
+   spidat1 |= SPIDAT1_CSHOLD_MASK;
spidat1 &= ~(0x1 << chip_sel);
}
}
@@ -449,8 +450,12 @@ static int davinci_spi_setup(struct spi_device *spi)
return retval;
}
 
-   if (internal_cs)
+   if (internal_cs) {
set_io_bits(dspi->base + SPIPC0, 1 << spi->chip_select);
+   } else if (spi->mode & SPI_CS_WORD) {
+   dev_err(>dev, "SPI_CS_WORD can't be use with GPIO 
CS\n");
+   return -EINVAL;
+   }
}
 
if (spi->mode & SPI_READY)
@@ -985,7 +990,7 @@ static int davinci_spi_probe(struct platform_device *pdev)
dspi->prescaler_limit = pdata->prescaler_limit;
dspi->version = pdata->version;
 
-   dspi->bitbang.flags = SPI_NO_CS | SPI_LSB_FIRST | SPI_LOOP;
+   dspi->bitbang.flags = SPI_NO_CS | SPI_LSB_FIRST | SPI_LOOP | 
SPI_CS_WORD;
if (dspi->version == SPI_VERSION_2)
dspi->bitbang.flags |= SPI_READY;
 
-- 
2.17.1

[PATCH] module: Add the print format of Elf_Addr for 32/64-bit compatibly

2018-07-16 Thread Zong Li

Use a similar way like fixed width integer types in inttypes.h.

For the ELF, the Elf32_Addr is int type and Elf64_Addr is long long
type. It is opposite to definition of inttypes.h, so the Elf_Addr cannot
re-use the header.

In many architectures, the module loader only print the message of
module name and relocation type except the address information. We can
print the address correctly without compile warning by using PRIdEA,
PRIxEA and so on.

Signed-off-by: Zong Li 
---
 arch/riscv/kernel/module.c   | 13 +++--
 include/asm-generic/module.h |  9 +
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 3303ed2cd419..d839528e491a 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -16,11 +16,12 @@
 #include 
 #include 
 #include 
+#include 
 
 static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
 {
if (v != (u32)v) {
-   pr_err("%s: value %016llx out of range for 32-bit field\n",
+   pr_err("%s: value %016" PRIxEA "out of range for 32-bit 
field\n",
   me->name, v);
return -EINVAL;
}
@@ -101,7 +102,7 @@ static int apply_r_riscv_pcrel_hi20_rela(struct module *me, 
u32 *location,
 
if (offset != (s32)offset) {
pr_err(
- "%s: target %016llx can not be addressed by the 32-bit offset 
from PC = %p\n",
+ "%s: target %016" PRIxEA "can not be addressed by the 32-bit 
offset from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
@@ -143,7 +144,7 @@ static int apply_r_riscv_hi20_rela(struct module *me, u32 
*location,
 
if (IS_ENABLED(CMODEL_MEDLOW)) {
pr_err(
- "%s: target %016llx can not be addressed by the 32-bit offset 
from PC = %p\n",
+ "%s: target %016" PRIxEA "can not be addressed by the 32-bit 
offset from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
@@ -187,7 +188,7 @@ static int apply_r_riscv_got_hi20_rela(struct module *me, 
u32 *location,
offset = (void *)offset - (void *)location;
} else {
pr_err(
- "%s: can not generate the GOT entry for symbol = %016llx from 
PC = %p\n",
+ "%s: can not generate the GOT entry for symbol = %016" PRIxEA 
"from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
@@ -211,7 +212,7 @@ static int apply_r_riscv_call_plt_rela(struct module *me, 
u32 *location,
offset = (void *)offset - (void *)location;
} else {
pr_err(
- "%s: target %016llx can not be addressed by the 
32-bit offset from PC = %p\n",
+ "%s: target %016" PRIxEA "can not be addressed by the 
32-bit offset from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
@@ -233,7 +234,7 @@ static int apply_r_riscv_call_rela(struct module *me, u32 
*location,
 
if (offset != fill_v) {
pr_err(
- "%s: target %016llx can not be addressed by the 32-bit offset 
from PC = %p\n",
+ "%s: target %016" PRIxEA "can not be addressed by the 32-bit 
offset from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
diff --git a/include/asm-generic/module.h b/include/asm-generic/module.h
index 98e1541b72b7..d358f2414e5e 100644
--- a/include/asm-generic/module.h
+++ b/include/asm-generic/module.h
@@ -27,6 +27,7 @@ struct mod_arch_specific
 #endif
 #define ELF_R_TYPE(X)  ELF64_R_TYPE(X)
 #define ELF_R_SYM(X)   ELF64_R_SYM(X)
+#define ELF_ADDR_PREFIX "ll"
 
 #else /* CONFIG_64BIT */
 
@@ -44,6 +45,14 @@ struct mod_arch_specific
 #endif
 #define ELF_R_TYPE(X)  ELF32_R_TYPE(X)
 #define ELF_R_SYM(X)   ELF32_R_SYM(X)
+#define ELF_ADDR_PREFIX
 #endif
 
+#define PRIdEA ELF_ADDR_PREFIX "d" /* signed decimal */
+#define PRIiEA ELF_ADDR_PREFIX "i" /* signed decimal */
+#define PRIuEA ELF_ADDR_PREFIX "u" /* unsigned decimal */
+#define PRIoEA ELF_ADDR_PREFIX "o" /* unsigned octal */
+#define PRIxEA ELF_ADDR_PREFIX "x" /* unsigned lowercase hexadecimal */
+#define PRIXEA ELF_ADDR_PREFIX "X" /* unsigned uppercase hexadecimal */
+
 #endif /* __ASM_GENERIC_MODULE_H */
-- 
2.16.1

Re: [PATCH v2 2/2] pinctrl: meson-g12a: add pinctrl driver support

2018-07-16 Thread Yixun Lan



HI Jerome


On 07/16/18 17:54, Jerome Brunet wrote:
> 
 +/* uart_ao_a_ee */
 +static const unsigned int uart_ao_rx_a_c2_pins[]= { GPIOC_2 };
 +static const unsigned int uart_ao_tx_a_c3_pins[]= { GPIOC_3 };
>>>
>>> Same comment as Martin about naming consistency ... drop c2 and c3 here.
>>>
>>
>> there is already uart_ao_rx_a_pins[]  uart_ao_tx_a_pins[] , see
>>
>>  794 static const unsigned int uart_ao_tx_a_pins[]   = {
>> GPIOAO_0 };
>>  795 static const unsigned int uart_ao_rx_a_pins[]   = {
>> GPIOAO_1 };
>>
>> in the G12A ASIC design, some AO device (from function perspective)
>> route the pin to EE domain, for maximize pin mux utilization.
>>
>> if you don't like this naming scheme, I could rename it into
>>   uart_ao_rx_a_ee_pins[]
>>   uart_ao_tx_a_ee_pins[]
>>
> 
> What we are asking when requesting consistency is to respect a scheme.
> 
> 1) If the pin function is available only once:
> ${FUNCTION}_${PINFUNC}
> 2) If the pin function is available on the several banks
> ${FUNCTION}_${PINFUNC}_${BANK}
> 3)  If the pin function is available on the several pins of the same bank
> ${FUNCTION}_${PINfFUNC}_${BANK}${PINNUN}
> 

to be more accurate, I extend the syntax to

${FUNCTION}_${DOMAIN}_${PORT}_${PINFUNC}_${BANK}${PINNUM}

take " uart_ao_a_tx_c" as an example
FUNCTION = uart
DOMAIN= ao (may omit if it's belong to EE domain)
PORT=a (may omit if only one port)
PINFUNC = tx
BANK = C (may omit if only one BANK)
PINNUM = ? (only if two more same function in one BANK)


previous in AXG driver we follow scheme
${FUNCTION}_${DOMAIN}_${PINFUNC}_${PORT}_${BANK}${PINNUM}

which may bring confusion, since both PORT and BANK may use alphabet
character, it's hard to tell which is PORT or BANK at first glance.
even worse, sometimes either of PORT or BANK may be omitted..


> Either your function is uart_ao_a_ee and it is available only once then
> you should drop  c2 and c3
> 
> uart_ao_a_ee_rx and uart_ao_a_ee_tx
> 
> or the function is uart_ao_a which is available on ao and c bank then name
> should be 
> 
> uart_ao_a_rx_c, uart_ao_a_tx_c,
> 
we will take this way, has a note says it in the BANK-C sounds more
accurate,

> 
>> which mean uart_ao rx pin at port A route to EE domain's physical pin.
>>
> 
> [...]
> 
>>
 c const unsigned int pwm_f_h_pins[]= { GPIOH_5 };
 +
 +/* cec_ao_ee */
 +static const unsigned int cec_ao_a_ee_pins[]= { GPIOH_3 };
 +static const unsigned int cec_ao_b_ee_pins[]= { GPIOH_3 };
>>>
>>> Naming consistency : cec_ao_ee_a ? cec_ao_ee_b ?
>>>
>>
>> I'd prefer the original version, which mean cec_ao controller at port a
>> route to EE domain's physical pin.
>>
>> I would check this driver to see if there is inconsistency.
> 
> Then the function is CEC_AO not CEC_AO_EE.
> 
> Either the function is cec_ao_ee of cell A and B then name should be 
> 
> cec_ao_ee_a and cec_ao_ee_b
> 
> or function is cec_ao on bank H (also available on bank ao)
> 
> Then name should be cec_ao_a_h, cec_ao_b_h
> 
Ok, we will take this way


> Please choose.
> 
>>
>>
 +
 +/* jtag_b */
 +static const unsigned int jtag_b_tdo_pins[] = { GPIOC_0 };
 +static const unsigned int jtag_b_tdi_pins[] = { GPIOC_1 };
 +static const unsigned int jtag_b_clk_pins[] = { GPIOC_4 };
 +static const unsigned int jtag_b_tms_pins[] = { GPIOC_5 };
 +
 +/* bt565 */
 +static const unsigned int bt565_a_vs_pins[] = { GPIOZ_0 };
 +static const unsigned int bt565_a_hs_pins[] = { GPIOZ_1 };
 +static const unsigned int bt565_a_clk_pins[]= { GPIOZ_3 };
 +static const unsigned int bt565_a_din0_pins[]   = { GPIOZ_4 };
 +static const unsigned int bt565_a_din1_pins[]   = { GPIOZ_5 };
 +static const unsigned int bt565_a_din2_pins[]   = { GPIOZ_6 };
 +static const unsigned int bt565_a_din3_pins[]   = { GPIOZ_7 };
 +static const unsigned int bt565_a_din4_pins[]   = { GPIOZ_8 };
 +static const unsigned int bt565_a_din5_pins[]   = { GPIOZ_9 };
 +static const unsigned int bt565_a_din6_pins[]   = { GPIOZ_10 
 };
 +static const unsigned int bt565_a_din7_pins[]   = { GPIOZ_11 
 };
>>>
>>> Why bt565_a and no bt565 only ?
>>>
>>
>> After talking to Xingyu, this naming is actually taken from the pin mux
>> documentation, it's BT565_A there.
>>
>> I'm not sure if you insist to drop the _a suffix, personally I'd just
>> leave it as it is, for better consistence with documentation.
> 
> Then function name should be bt565_a
> 
sure, will fix this
>>
>>
 +
 +/* tsin_a */
 +static const unsigned int tsin_a_valid_pins[]   = { GPIOX_2 };
 +static const unsigned int tsin_a_sop_pins[] = { GPIOX_1 };
 +static const unsigned int tsin_a_din0_pins[]= { GPIOX_0 };

[PATCH] module: Add the print format of Elf_Addr for 32/64-bit compatibly

2018-07-16 Thread Zong Li

Use a similar way like fixed width integer types in inttypes.h.

For the ELF, the Elf32_Addr is int type and Elf64_Addr is long long
type. It is opposite to definition of inttypes.h, so the Elf_Addr cannot
re-use the header.

In many architectures, the module loader only print the message of
module name and relocation type except the address information. We can
print the address correctly without compile warning by using PRIdEA,
PRIxEA and so on.

Signed-off-by: Zong Li 
---
 arch/riscv/kernel/module.c   | 13 +++--
 include/asm-generic/module.h |  9 +
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 3303ed2cd419..d839528e491a 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -16,11 +16,12 @@
 #include 
 #include 
 #include 
+#include 
 
 static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
 {
if (v != (u32)v) {
-   pr_err("%s: value %016llx out of range for 32-bit field\n",
+   pr_err("%s: value %016" PRIxEA "out of range for 32-bit 
field\n",
   me->name, v);
return -EINVAL;
}
@@ -101,7 +102,7 @@ static int apply_r_riscv_pcrel_hi20_rela(struct module *me, 
u32 *location,
 
if (offset != (s32)offset) {
pr_err(
- "%s: target %016llx can not be addressed by the 32-bit offset 
from PC = %p\n",
+ "%s: target %016" PRIxEA "can not be addressed by the 32-bit 
offset from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
@@ -143,7 +144,7 @@ static int apply_r_riscv_hi20_rela(struct module *me, u32 
*location,
 
if (IS_ENABLED(CMODEL_MEDLOW)) {
pr_err(
- "%s: target %016llx can not be addressed by the 32-bit offset 
from PC = %p\n",
+ "%s: target %016" PRIxEA "can not be addressed by the 32-bit 
offset from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
@@ -187,7 +188,7 @@ static int apply_r_riscv_got_hi20_rela(struct module *me, 
u32 *location,
offset = (void *)offset - (void *)location;
} else {
pr_err(
- "%s: can not generate the GOT entry for symbol = %016llx from 
PC = %p\n",
+ "%s: can not generate the GOT entry for symbol = %016" PRIxEA 
"from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
@@ -211,7 +212,7 @@ static int apply_r_riscv_call_plt_rela(struct module *me, 
u32 *location,
offset = (void *)offset - (void *)location;
} else {
pr_err(
- "%s: target %016llx can not be addressed by the 
32-bit offset from PC = %p\n",
+ "%s: target %016" PRIxEA "can not be addressed by the 
32-bit offset from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
@@ -233,7 +234,7 @@ static int apply_r_riscv_call_rela(struct module *me, u32 
*location,
 
if (offset != fill_v) {
pr_err(
- "%s: target %016llx can not be addressed by the 32-bit offset 
from PC = %p\n",
+ "%s: target %016" PRIxEA "can not be addressed by the 32-bit 
offset from PC = %p\n",
  me->name, v, location);
return -EINVAL;
}
diff --git a/include/asm-generic/module.h b/include/asm-generic/module.h
index 98e1541b72b7..d358f2414e5e 100644
--- a/include/asm-generic/module.h
+++ b/include/asm-generic/module.h
@@ -27,6 +27,7 @@ struct mod_arch_specific
 #endif
 #define ELF_R_TYPE(X)  ELF64_R_TYPE(X)
 #define ELF_R_SYM(X)   ELF64_R_SYM(X)
+#define ELF_ADDR_PREFIX "ll"
 
 #else /* CONFIG_64BIT */
 
@@ -44,6 +45,14 @@ struct mod_arch_specific
 #endif
 #define ELF_R_TYPE(X)  ELF32_R_TYPE(X)
 #define ELF_R_SYM(X)   ELF32_R_SYM(X)
+#define ELF_ADDR_PREFIX
 #endif
 
+#define PRIdEA ELF_ADDR_PREFIX "d" /* signed decimal */
+#define PRIiEA ELF_ADDR_PREFIX "i" /* signed decimal */
+#define PRIuEA ELF_ADDR_PREFIX "u" /* unsigned decimal */
+#define PRIoEA ELF_ADDR_PREFIX "o" /* unsigned octal */
+#define PRIxEA ELF_ADDR_PREFIX "x" /* unsigned lowercase hexadecimal */
+#define PRIXEA ELF_ADDR_PREFIX "X" /* unsigned uppercase hexadecimal */
+
 #endif /* __ASM_GENERIC_MODULE_H */
-- 
2.16.1

Re: [PATCH v2 2/2] pinctrl: meson-g12a: add pinctrl driver support

2018-07-16 Thread Yixun Lan



HI Jerome


On 07/16/18 17:54, Jerome Brunet wrote:
> 
 +/* uart_ao_a_ee */
 +static const unsigned int uart_ao_rx_a_c2_pins[]= { GPIOC_2 };
 +static const unsigned int uart_ao_tx_a_c3_pins[]= { GPIOC_3 };
>>>
>>> Same comment as Martin about naming consistency ... drop c2 and c3 here.
>>>
>>
>> there is already uart_ao_rx_a_pins[]  uart_ao_tx_a_pins[] , see
>>
>>  794 static const unsigned int uart_ao_tx_a_pins[]   = {
>> GPIOAO_0 };
>>  795 static const unsigned int uart_ao_rx_a_pins[]   = {
>> GPIOAO_1 };
>>
>> in the G12A ASIC design, some AO device (from function perspective)
>> route the pin to EE domain, for maximize pin mux utilization.
>>
>> if you don't like this naming scheme, I could rename it into
>>   uart_ao_rx_a_ee_pins[]
>>   uart_ao_tx_a_ee_pins[]
>>
> 
> What we are asking when requesting consistency is to respect a scheme.
> 
> 1) If the pin function is available only once:
> ${FUNCTION}_${PINFUNC}
> 2) If the pin function is available on the several banks
> ${FUNCTION}_${PINFUNC}_${BANK}
> 3)  If the pin function is available on the several pins of the same bank
> ${FUNCTION}_${PINfFUNC}_${BANK}${PINNUN}
> 

to be more accurate, I extend the syntax to

${FUNCTION}_${DOMAIN}_${PORT}_${PINFUNC}_${BANK}${PINNUM}

take " uart_ao_a_tx_c" as an example
FUNCTION = uart
DOMAIN= ao (may omit if it's belong to EE domain)
PORT=a (may omit if only one port)
PINFUNC = tx
BANK = C (may omit if only one BANK)
PINNUM = ? (only if two more same function in one BANK)


previous in AXG driver we follow scheme
${FUNCTION}_${DOMAIN}_${PINFUNC}_${PORT}_${BANK}${PINNUM}

which may bring confusion, since both PORT and BANK may use alphabet
character, it's hard to tell which is PORT or BANK at first glance.
even worse, sometimes either of PORT or BANK may be omitted..


> Either your function is uart_ao_a_ee and it is available only once then
> you should drop  c2 and c3
> 
> uart_ao_a_ee_rx and uart_ao_a_ee_tx
> 
> or the function is uart_ao_a which is available on ao and c bank then name
> should be 
> 
> uart_ao_a_rx_c, uart_ao_a_tx_c,
> 
we will take this way, has a note says it in the BANK-C sounds more
accurate,

> 
>> which mean uart_ao rx pin at port A route to EE domain's physical pin.
>>
> 
> [...]
> 
>>
 c const unsigned int pwm_f_h_pins[]= { GPIOH_5 };
 +
 +/* cec_ao_ee */
 +static const unsigned int cec_ao_a_ee_pins[]= { GPIOH_3 };
 +static const unsigned int cec_ao_b_ee_pins[]= { GPIOH_3 };
>>>
>>> Naming consistency : cec_ao_ee_a ? cec_ao_ee_b ?
>>>
>>
>> I'd prefer the original version, which mean cec_ao controller at port a
>> route to EE domain's physical pin.
>>
>> I would check this driver to see if there is inconsistency.
> 
> Then the function is CEC_AO not CEC_AO_EE.
> 
> Either the function is cec_ao_ee of cell A and B then name should be 
> 
> cec_ao_ee_a and cec_ao_ee_b
> 
> or function is cec_ao on bank H (also available on bank ao)
> 
> Then name should be cec_ao_a_h, cec_ao_b_h
> 
Ok, we will take this way


> Please choose.
> 
>>
>>
 +
 +/* jtag_b */
 +static const unsigned int jtag_b_tdo_pins[] = { GPIOC_0 };
 +static const unsigned int jtag_b_tdi_pins[] = { GPIOC_1 };
 +static const unsigned int jtag_b_clk_pins[] = { GPIOC_4 };
 +static const unsigned int jtag_b_tms_pins[] = { GPIOC_5 };
 +
 +/* bt565 */
 +static const unsigned int bt565_a_vs_pins[] = { GPIOZ_0 };
 +static const unsigned int bt565_a_hs_pins[] = { GPIOZ_1 };
 +static const unsigned int bt565_a_clk_pins[]= { GPIOZ_3 };
 +static const unsigned int bt565_a_din0_pins[]   = { GPIOZ_4 };
 +static const unsigned int bt565_a_din1_pins[]   = { GPIOZ_5 };
 +static const unsigned int bt565_a_din2_pins[]   = { GPIOZ_6 };
 +static const unsigned int bt565_a_din3_pins[]   = { GPIOZ_7 };
 +static const unsigned int bt565_a_din4_pins[]   = { GPIOZ_8 };
 +static const unsigned int bt565_a_din5_pins[]   = { GPIOZ_9 };
 +static const unsigned int bt565_a_din6_pins[]   = { GPIOZ_10 
 };
 +static const unsigned int bt565_a_din7_pins[]   = { GPIOZ_11 
 };
>>>
>>> Why bt565_a and no bt565 only ?
>>>
>>
>> After talking to Xingyu, this naming is actually taken from the pin mux
>> documentation, it's BT565_A there.
>>
>> I'm not sure if you insist to drop the _a suffix, personally I'd just
>> leave it as it is, for better consistence with documentation.
> 
> Then function name should be bt565_a
> 
sure, will fix this
>>
>>
 +
 +/* tsin_a */
 +static const unsigned int tsin_a_valid_pins[]   = { GPIOX_2 };
 +static const unsigned int tsin_a_sop_pins[] = { GPIOX_1 };
 +static const unsigned int tsin_a_din0_pins[]= { GPIOX_0 };

Re: [PATCH v5 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

2018-07-16 Thread Taniya Das





On 7/16/2018 10:42 AM, Viresh Kumar wrote:

On 12-07-18, 23:35, Taniya Das wrote:

+static int qcom_cpu_resources_init(struct platform_device *pdev,
+  struct device_node *np, unsigned int cpu)
+{
+   struct cpufreq_qcom *c;
+   struct resource res;
+   struct device *dev = >dev;
+   unsigned int offset, cpu_r;
+   int ret;
+
+   c = devm_kzalloc(dev, sizeof(*c), GFP_KERNEL);
+   if (!c)
+   return -ENOMEM;
+
+   c->reg_offset = of_device_get_match_data(>dev);
+   if (!c->reg_offset)
+   return -EINVAL;
+
+   if (of_address_to_resource(np, 0, ))
+   return -ENOMEM;
+
+   c->base = devm_ioremap(dev, res.start, resource_size());
+   if (!c->base) {
+   dev_err(dev, "Unable to map %s base\n", np->name);
+   return -ENOMEM;
+   }
+
+   offset = c->reg_offset[REG_ENABLE];
+
+   /* HW should be in enabled state to proceed */
+   if (!(readl_relaxed(c->base + offset) & 0x1)) {
+   dev_err(dev, "%s cpufreq hardware not enabled\n", np->name);
+   return -ENODEV;
+   }
+
+   ret = qcom_get_related_cpus(np, >related_cpus);
+   if (ret) {
+   dev_err(dev, "%s failed to get related CPUs\n", np->name);
+   return ret;
+   }
+
+   c->max_cores = cpumask_weight(>related_cpus);
+   if (!c->max_cores)
+   return -ENOENT;
+
+   ret = qcom_read_lut(pdev, c);
+   if (ret) {
+   dev_err(dev, "%s failed to read LUT\n", np->name);
+   return ret;
+   }
+
+   qcom_freq_domain_map[cpu] = c;
+
+   /* Related CPUs to keep a single copy */
+   cpu_r = cpumask_first(>related_cpus);
+   if (cpu != cpu_r) {
+   qcom_freq_domain_map[cpu] = qcom_freq_domain_map[cpu_r];
+   devm_kfree(dev, c);
+   }


Sorry about missing this, you have actually worked on my comments.

But I think this isn't the clever way of doing it. You allocate the
structures, fill everything and then finally free them because we were
related. Why can't we have similar check at the top of this routine
and skip everything then ?



Thanks Viresh, I actually replied to all the reviewers with the changes 
I have made in v5 series, sorry for not replying individually to all 
reviewers. Yes, I agree I should have put this check at the beginning of 
the function.


Please help review of the new series[v5] which takes care of the below.

- Remove mapping different register regions of perf/lut/enable,
  instead map the entire HW region.
- Add reg_offset/cpufreq_qcom_std_offsets to be supplied as device data.
- Check of src == 0 during lut read.
- Add of_node_put(cpu_np) in qcom_get_related_cpus
- Update the qcom_cpu_resources_init for register offset data,
  and cleanup the related cpus to keep a single copy of CPUfreq.
- Replace FW with HW, update Kconfig, rename filename qcom-cpufreq-hw.c

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation.

--

Re: [PATCH v5 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

2018-07-16 Thread Taniya Das





On 7/16/2018 10:42 AM, Viresh Kumar wrote:

On 12-07-18, 23:35, Taniya Das wrote:

+static int qcom_cpu_resources_init(struct platform_device *pdev,
+  struct device_node *np, unsigned int cpu)
+{
+   struct cpufreq_qcom *c;
+   struct resource res;
+   struct device *dev = >dev;
+   unsigned int offset, cpu_r;
+   int ret;
+
+   c = devm_kzalloc(dev, sizeof(*c), GFP_KERNEL);
+   if (!c)
+   return -ENOMEM;
+
+   c->reg_offset = of_device_get_match_data(>dev);
+   if (!c->reg_offset)
+   return -EINVAL;
+
+   if (of_address_to_resource(np, 0, ))
+   return -ENOMEM;
+
+   c->base = devm_ioremap(dev, res.start, resource_size());
+   if (!c->base) {
+   dev_err(dev, "Unable to map %s base\n", np->name);
+   return -ENOMEM;
+   }
+
+   offset = c->reg_offset[REG_ENABLE];
+
+   /* HW should be in enabled state to proceed */
+   if (!(readl_relaxed(c->base + offset) & 0x1)) {
+   dev_err(dev, "%s cpufreq hardware not enabled\n", np->name);
+   return -ENODEV;
+   }
+
+   ret = qcom_get_related_cpus(np, >related_cpus);
+   if (ret) {
+   dev_err(dev, "%s failed to get related CPUs\n", np->name);
+   return ret;
+   }
+
+   c->max_cores = cpumask_weight(>related_cpus);
+   if (!c->max_cores)
+   return -ENOENT;
+
+   ret = qcom_read_lut(pdev, c);
+   if (ret) {
+   dev_err(dev, "%s failed to read LUT\n", np->name);
+   return ret;
+   }
+
+   qcom_freq_domain_map[cpu] = c;
+
+   /* Related CPUs to keep a single copy */
+   cpu_r = cpumask_first(>related_cpus);
+   if (cpu != cpu_r) {
+   qcom_freq_domain_map[cpu] = qcom_freq_domain_map[cpu_r];
+   devm_kfree(dev, c);
+   }


Sorry about missing this, you have actually worked on my comments.

But I think this isn't the clever way of doing it. You allocate the
structures, fill everything and then finally free them because we were
related. Why can't we have similar check at the top of this routine
and skip everything then ?



Thanks Viresh, I actually replied to all the reviewers with the changes 
I have made in v5 series, sorry for not replying individually to all 
reviewers. Yes, I agree I should have put this check at the beginning of 
the function.


Please help review of the new series[v5] which takes care of the below.

- Remove mapping different register regions of perf/lut/enable,
  instead map the entire HW region.
- Add reg_offset/cpufreq_qcom_std_offsets to be supplied as device data.
- Check of src == 0 during lut read.
- Add of_node_put(cpu_np) in qcom_get_related_cpus
- Update the qcom_cpu_resources_init for register offset data,
  and cleanup the related cpus to keep a single copy of CPUfreq.
- Replace FW with HW, update Kconfig, rename filename qcom-cpufreq-hw.c

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation.

--

Re: [PATCH v5 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

2018-07-16 Thread Taniya Das


Hello Evan,

Thank you for testing the patch.

On 7/17/2018 4:32 AM, Evan Green wrote:

Hi Taniya,
On Thu, Jul 12, 2018 at 11:06 AM Taniya Das  wrote:


The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
for changing the frequency of CPUs. The driver implements the cpufreq
driver interface for this hardware engine.

Signed-off-by: Saravana Kannan 
Signed-off-by: Taniya Das 
---
  drivers/cpufreq/Kconfig.arm   |  10 ++
  drivers/cpufreq/Makefile  |   1 +
  drivers/cpufreq/qcom-cpufreq-hw.c | 344 ++
  3 files changed, 355 insertions(+)
  create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c


...

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c 
b/drivers/cpufreq/qcom-cpufreq-hw.c
new file mode 100644
index 000..fa25a95
--- /dev/null
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c

...

+static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
+{
+   struct cpufreq_qcom *c;
+
+   c = qcom_freq_domain_map[policy->cpu];
+   if (!c) {
+   pr_err("No scaling support for CPU%d\n", policy->cpu);
+   return -ENODEV;
+   }
+
+   cpumask_copy(policy->cpus, >related_cpus);
+
+   policy->fast_switch_possible = true;
+   policy->freq_table = c->table;
+   policy->driver_data = c;
+
+   return 0;

I haven't looked at this driver in detail, but I have tested it.
Instead of the line above, I needed:

return cpufreq_table_validate_and_show(policy, c->table);

Without this the framework thinks that the min and max frequencies are
zero, and then you get a warning about an invalid table. I also
removed "policy->freq_table = c->table", since validate_and_show does
this.
-Evan



The API is no longer supported,

https://patchwork.kernel.org/patch/10320897/

If you are using the 4.14 kernel, you do need to replace

-policy->freq_table = c->table;
+cpufreq_table_validate_and_show(policy, c->table);

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation.

--

Re: [PATCH v5 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

2018-07-16 Thread Taniya Das


Hello Evan,

Thank you for testing the patch.

On 7/17/2018 4:32 AM, Evan Green wrote:

Hi Taniya,
On Thu, Jul 12, 2018 at 11:06 AM Taniya Das  wrote:


The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
for changing the frequency of CPUs. The driver implements the cpufreq
driver interface for this hardware engine.

Signed-off-by: Saravana Kannan 
Signed-off-by: Taniya Das 
---
  drivers/cpufreq/Kconfig.arm   |  10 ++
  drivers/cpufreq/Makefile  |   1 +
  drivers/cpufreq/qcom-cpufreq-hw.c | 344 ++
  3 files changed, 355 insertions(+)
  create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c


...

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c 
b/drivers/cpufreq/qcom-cpufreq-hw.c
new file mode 100644
index 000..fa25a95
--- /dev/null
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c

...

+static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
+{
+   struct cpufreq_qcom *c;
+
+   c = qcom_freq_domain_map[policy->cpu];
+   if (!c) {
+   pr_err("No scaling support for CPU%d\n", policy->cpu);
+   return -ENODEV;
+   }
+
+   cpumask_copy(policy->cpus, >related_cpus);
+
+   policy->fast_switch_possible = true;
+   policy->freq_table = c->table;
+   policy->driver_data = c;
+
+   return 0;

I haven't looked at this driver in detail, but I have tested it.
Instead of the line above, I needed:

return cpufreq_table_validate_and_show(policy, c->table);

Without this the framework thinks that the min and max frequencies are
zero, and then you get a warning about an invalid table. I also
removed "policy->freq_table = c->table", since validate_and_show does
this.
-Evan



The API is no longer supported,

https://patchwork.kernel.org/patch/10320897/

If you are using the 4.14 kernel, you do need to replace

-policy->freq_table = c->table;
+cpufreq_table_validate_and_show(policy, c->table);

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation.

--

Re: [PATCH V2] ARM: dts: make pfuze switch always-on for imx platforms

2018-07-16 Thread Shawn Guo

On Fri, Jul 13, 2018 at 02:16:55AM +, Anson Huang wrote:
> Hi, Shawn
>   Although the commit 5fe156f1cab4 ("regulator: pfuze100: add 
> enable/disable for switch") is reverted to avoid the boot failure on some 
> i.MX platforms, but adding the "regulator-always-on" property for those 
> pfuze's critical switches are the right way and making sense, no matter how 
> the pfuze regulator's switch ON/OFF function will be implemented, below 
> patches should can be applied anyway?
> 
> ARM: dts: imx6sll-evk: make pfuze100 sw4 always on
> ARM: dts: make pfuze switch always-on for imx platforms
> ARM: dts: imx6sl-evk: keep sw4 always on
> 
>   Let me know your thoughts, thanks!

Okay, resend them as a series with commit log updated, since the commit
log doesn't reflect the current situation any longer.

Shawn

Re: [PATCH V2] ARM: dts: make pfuze switch always-on for imx platforms

2018-07-16 Thread Shawn Guo

On Fri, Jul 13, 2018 at 02:16:55AM +, Anson Huang wrote:
> Hi, Shawn
>   Although the commit 5fe156f1cab4 ("regulator: pfuze100: add 
> enable/disable for switch") is reverted to avoid the boot failure on some 
> i.MX platforms, but adding the "regulator-always-on" property for those 
> pfuze's critical switches are the right way and making sense, no matter how 
> the pfuze regulator's switch ON/OFF function will be implemented, below 
> patches should can be applied anyway?
> 
> ARM: dts: imx6sll-evk: make pfuze100 sw4 always on
> ARM: dts: make pfuze switch always-on for imx platforms
> ARM: dts: imx6sl-evk: keep sw4 always on
> 
>   Let me know your thoughts, thanks!

Okay, resend them as a series with commit log updated, since the commit
log doesn't reflect the current situation any longer.

Shawn

Re: [PATCH 00/39 v7] PTI support for x86-32

2018-07-16 Thread David H. Gutteridge

On Wed, 2018-07-11 at 13:29 +0200, Joerg Roedel wrote:
> Hi,
> 
> here is version 7 of my patches to enable PTI on x86-32.
> Changes to the previous version are:
> 
>   * Rebased to v4.18-rc4
> 
>   * Introduced pti_finalize() which is called after
> mark_readonly() and used to update the kernel
> mappings in the user page-table after RO/NX
> protections are in place.
> 
> The patches need the vmalloc/ioremap fixes in tip/x86/mm to
> work correctly, because this enablement makes the issues
> fixed there more likely to happen.

Hi Joerg & *,

I redid my testing on bare metal and in a VM (as with my previous
testing
efforts: https://lkml.org/lkml/2018/2/19/844, same setups
and coverage,
plus CONFIG_X86_DEBUG_ENTRY_CR3 enabled too) with the
pti-x32-v7 branch,
and I didn't encounter any issues. The two DRM
drivers that were
triggering bugs in some of the prior iterations
are both behaving
properly for me.

Regards,

Dave

Re: [PATCH 00/39 v7] PTI support for x86-32

2018-07-16 Thread David H. Gutteridge

On Wed, 2018-07-11 at 13:29 +0200, Joerg Roedel wrote:
> Hi,
> 
> here is version 7 of my patches to enable PTI on x86-32.
> Changes to the previous version are:
> 
>   * Rebased to v4.18-rc4
> 
>   * Introduced pti_finalize() which is called after
> mark_readonly() and used to update the kernel
> mappings in the user page-table after RO/NX
> protections are in place.
> 
> The patches need the vmalloc/ioremap fixes in tip/x86/mm to
> work correctly, because this enablement makes the issues
> fixed there more likely to happen.

Hi Joerg & *,

I redid my testing on bare metal and in a VM (as with my previous
testing
efforts: https://lkml.org/lkml/2018/2/19/844, same setups
and coverage,
plus CONFIG_X86_DEBUG_ENTRY_CR3 enabled too) with the
pti-x32-v7 branch,
and I didn't encounter any issues. The two DRM
drivers that were
triggering bugs in some of the prior iterations
are both behaving
properly for me.

Regards,

Dave

[PATCH 02/32] staging: gasket: fix typo in apex_enter_reset

2018-07-16 Thread Todd Poynor

From: Todd Poynor 

Fix typo in log message.

Signed-off-by: Zhongze Hu 
Signed-off-by: Todd Poynor 
---
 drivers/staging/gasket/apex_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/gasket/apex_driver.c 
b/drivers/staging/gasket/apex_driver.c
index cca4cf491a583..a31937dfff836 100644
--- a/drivers/staging/gasket/apex_driver.c
+++ b/drivers/staging/gasket/apex_driver.c
@@ -488,7 +488,7 @@ static int apex_enter_reset(struct gasket_dev *gasket_dev, 
uint type)
APEX_BAR2_REG_USER_HIB_DMA_PAUSED, 1, 1,
APEX_RESET_DELAY, APEX_RESET_RETRY)) {
gasket_log_error(gasket_dev,
-"DMAs did not quiece within timeout (%d ms)",
+"DMAs did not quiesce within timeout (%d ms)",
 APEX_RESET_RETRY * APEX_RESET_DELAY);
return -EINVAL;
}
-- 
2.18.0.203.gfac676dfb9-goog

[PATCH 02/32] staging: gasket: fix typo in apex_enter_reset

2018-07-16 Thread Todd Poynor

From: Todd Poynor 

Fix typo in log message.

Signed-off-by: Zhongze Hu 
Signed-off-by: Todd Poynor 
---
 drivers/staging/gasket/apex_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/gasket/apex_driver.c 
b/drivers/staging/gasket/apex_driver.c
index cca4cf491a583..a31937dfff836 100644
--- a/drivers/staging/gasket/apex_driver.c
+++ b/drivers/staging/gasket/apex_driver.c
@@ -488,7 +488,7 @@ static int apex_enter_reset(struct gasket_dev *gasket_dev, 
uint type)
APEX_BAR2_REG_USER_HIB_DMA_PAUSED, 1, 1,
APEX_RESET_DELAY, APEX_RESET_RETRY)) {
gasket_log_error(gasket_dev,
-"DMAs did not quiece within timeout (%d ms)",
+"DMAs did not quiesce within timeout (%d ms)",
 APEX_RESET_RETRY * APEX_RESET_DELAY);
return -EINVAL;
}
-- 
2.18.0.203.gfac676dfb9-goog

[PATCH 10/32] staging: gasket: fix gasket_wait_with_reschedule timeout return code

2018-07-16 Thread Todd Poynor

From: Todd Poynor 

Return -ETIMEDOUT, not -EINVAL, on timeout, including callers.

Reported-by: Dmitry Torokhov 
Signed-off-by: Zhongze Hu 
Signed-off-by: Todd Poynor 
---
 drivers/staging/gasket/apex_driver.c | 8 
 drivers/staging/gasket/gasket_core.c | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/gasket/apex_driver.c 
b/drivers/staging/gasket/apex_driver.c
index a31937dfff836..3a83c3d4d5561 100644
--- a/drivers/staging/gasket/apex_driver.c
+++ b/drivers/staging/gasket/apex_driver.c
@@ -490,7 +490,7 @@ static int apex_enter_reset(struct gasket_dev *gasket_dev, 
uint type)
gasket_log_error(gasket_dev,
 "DMAs did not quiesce within timeout (%d ms)",
 APEX_RESET_RETRY * APEX_RESET_DELAY);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
 
/*  - Enable GCB reset (0x1 to rg_rst_gcb) */
@@ -513,7 +513,7 @@ static int apex_enter_reset(struct gasket_dev *gasket_dev, 
uint type)
gasket_dev,
"RAM did not shut down within timeout (%d ms)",
APEX_RESET_RETRY * APEX_RESET_DELAY);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
 
return 0;
@@ -562,7 +562,7 @@ static int apex_quit_reset(struct gasket_dev *gasket_dev, 
uint type)
gasket_dev,
"RAM did not enable within timeout (%d ms)",
APEX_RESET_RETRY * APEX_RESET_DELAY);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
 
/*- Wait for Reset complete. */
@@ -574,7 +574,7 @@ static int apex_quit_reset(struct gasket_dev *gasket_dev, 
uint type)
gasket_dev,
"GCB did not leave reset within timeout (%d ms)",
APEX_RESET_RETRY * APEX_RESET_DELAY);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
 
if (!allow_hw_clock_gating) {
diff --git a/drivers/staging/gasket/gasket_core.c 
b/drivers/staging/gasket/gasket_core.c
index 248d717e1df65..803566229bfcb 100644
--- a/drivers/staging/gasket/gasket_core.c
+++ b/drivers/staging/gasket/gasket_core.c
@@ -2106,7 +2106,7 @@ int gasket_wait_with_reschedule(
"%s timeout: reg %llx timeout (%llu ms)",
__func__,
offset, max_retries * delay_ms);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
return 0;
 }
-- 
2.18.0.203.gfac676dfb9-goog

[PATCH 07/32] staging: gasket: Return EBUSY on mapping create when already in use

2018-07-16 Thread Todd Poynor

From: Todd Poynor 

gasket_sysfs_create_mapping() return EBUSY if sysfs mapping already in
use.

Signed-off-by: Zhongze Hu 
Signed-off-by: Todd Poynor 
---
 drivers/staging/gasket/gasket_sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/gasket/gasket_sysfs.c 
b/drivers/staging/gasket/gasket_sysfs.c
index e3d770630961b..dd4d3aaa57e2f 100644
--- a/drivers/staging/gasket/gasket_sysfs.c
+++ b/drivers/staging/gasket/gasket_sysfs.c
@@ -194,7 +194,7 @@ int gasket_sysfs_create_mapping(
"0x%p.", device);
put_mapping(mapping);
mutex_unlock(_mutex);
-   return -EINVAL;
+   return -EBUSY;
}
 
/* Find the first empty entry in the array. */
-- 
2.18.0.203.gfac676dfb9-goog

[PATCH 05/32] staging: gasket: remove driver registration on class creation failure

2018-07-16 Thread Todd Poynor

From: Todd Poynor 

If class_create() fails, remove the gasket driver from the global
registration table.

Signed-off-by: Zhongze Hu 
Signed-off-by: Todd Poynor 
---
 drivers/staging/gasket/gasket_core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/gasket/gasket_core.c 
b/drivers/staging/gasket/gasket_core.c
index b69b630f1b79b..cbadab7544c81 100644
--- a/drivers/staging/gasket/gasket_core.c
+++ b/drivers/staging/gasket/gasket_core.c
@@ -335,7 +335,8 @@ int gasket_register_device(const struct gasket_driver_desc 
*driver_desc)
if (IS_ERR_OR_NULL(internal->class)) {
gasket_nodev_error("Cannot register %s class [ret=%ld]",
   driver_desc->name, PTR_ERR(internal->class));
-   return PTR_ERR(internal->class);
+   ret = PTR_ERR(internal->class);
+   goto unregister_gasket_driver;
}
 
/*
@@ -369,6 +370,7 @@ int gasket_register_device(const struct gasket_driver_desc 
*driver_desc)
 fail1:
class_destroy(internal->class);
 
+unregister_gasket_driver:
g_descs[desc_idx].driver_desc = NULL;
return ret;
 }
-- 
2.18.0.203.gfac676dfb9-goog

[PATCH 10/32] staging: gasket: fix gasket_wait_with_reschedule timeout return code

2018-07-16 Thread Todd Poynor

From: Todd Poynor 

Return -ETIMEDOUT, not -EINVAL, on timeout, including callers.

Reported-by: Dmitry Torokhov 
Signed-off-by: Zhongze Hu 
Signed-off-by: Todd Poynor 
---
 drivers/staging/gasket/apex_driver.c | 8 
 drivers/staging/gasket/gasket_core.c | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/gasket/apex_driver.c 
b/drivers/staging/gasket/apex_driver.c
index a31937dfff836..3a83c3d4d5561 100644
--- a/drivers/staging/gasket/apex_driver.c
+++ b/drivers/staging/gasket/apex_driver.c
@@ -490,7 +490,7 @@ static int apex_enter_reset(struct gasket_dev *gasket_dev, 
uint type)
gasket_log_error(gasket_dev,
 "DMAs did not quiesce within timeout (%d ms)",
 APEX_RESET_RETRY * APEX_RESET_DELAY);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
 
/*  - Enable GCB reset (0x1 to rg_rst_gcb) */
@@ -513,7 +513,7 @@ static int apex_enter_reset(struct gasket_dev *gasket_dev, 
uint type)
gasket_dev,
"RAM did not shut down within timeout (%d ms)",
APEX_RESET_RETRY * APEX_RESET_DELAY);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
 
return 0;
@@ -562,7 +562,7 @@ static int apex_quit_reset(struct gasket_dev *gasket_dev, 
uint type)
gasket_dev,
"RAM did not enable within timeout (%d ms)",
APEX_RESET_RETRY * APEX_RESET_DELAY);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
 
/*- Wait for Reset complete. */
@@ -574,7 +574,7 @@ static int apex_quit_reset(struct gasket_dev *gasket_dev, 
uint type)
gasket_dev,
"GCB did not leave reset within timeout (%d ms)",
APEX_RESET_RETRY * APEX_RESET_DELAY);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
 
if (!allow_hw_clock_gating) {
diff --git a/drivers/staging/gasket/gasket_core.c 
b/drivers/staging/gasket/gasket_core.c
index 248d717e1df65..803566229bfcb 100644
--- a/drivers/staging/gasket/gasket_core.c
+++ b/drivers/staging/gasket/gasket_core.c
@@ -2106,7 +2106,7 @@ int gasket_wait_with_reschedule(
"%s timeout: reg %llx timeout (%llu ms)",
__func__,
offset, max_retries * delay_ms);
-   return -EINVAL;
+   return -ETIMEDOUT;
}
return 0;
 }
-- 
2.18.0.203.gfac676dfb9-goog

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1772 matches

Mail list logo