Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

2018-12-03 Thread Mike Rapoport
Hi John,

Thanks for having documentation as a part of the patch. Some kernel-doc
nits below.

On Mon, Dec 03, 2018 at 04:17:19PM -0800, john.hubb...@gmail.com wrote:
> From: John Hubbard 
> 
> Introduces put_user_page(), which simply calls put_page().
> This provides a way to update all get_user_pages*() callers,
> so that they call put_user_page(), instead of put_page().
> 
> Also introduces put_user_pages(), and a few dirty/locked variations,
> as a replacement for release_pages(), and also as a replacement
> for open-coded loops that release multiple pages.
> These may be used for subsequent performance improvements,
> via batching of pages to be released.
> 
> This is the first step of fixing the problem described in [1]. The steps
> are:
> 
> 1) (This patch): provide put_user_page*() routines, intended to be used
>    for releasing pages that were pinned via get_user_pages*().
> 
> 2) Convert all of the call sites for get_user_pages*(), to
>    invoke put_user_page*(), instead of put_page(). This involves dozens of
>    call sites, and will take some time.
> 
> 3) After (2) is complete, use get_user_pages*() and put_user_page*() to
>    implement tracking of these pages. This tracking will be separate from
>    the existing struct page refcounting.
> 
> 4) Use the tracking and identification of these pages, to implement
>    special handling (especially in writeback paths) when the pages are
>    backed by a filesystem. Again, [1] provides details as to why that is
>    desirable.
> 
> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
> 
> Reviewed-by: Jan Kara 
> 
> Cc: Matthew Wilcox 
> Cc: Michal Hocko 
> Cc: Christopher Lameter 
> Cc: Jason Gunthorpe 
> Cc: Dan Williams 
> Cc: Jan Kara 
> Cc: Al Viro 
> Cc: Jerome Glisse 
> Cc: Christoph Hellwig 
> Cc: Ralph Campbell 
> Signed-off-by: John Hubbard 
> ---
>  include/linux/mm.h | 20 
>  mm/swap.c  | 80 ++
>  2 files changed, 100 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5411de93a363..09fbb2c81aba 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -963,6 +963,26 @@ static inline void put_page(struct page *page)
>   __put_page(page);
>  }
> 
> +/*
> + * put_user_page() - release a page that had previously been acquired via
> + * a call to one of the get_user_pages*() functions.

Please add a description for the @page parameter, otherwise kernel-doc is unhappy.

> + *
> + * Pages that were pinned via get_user_pages*() must be released via
> + * either put_user_page(), or one of the put_user_pages*() routines
> + * below. This is so that eventually, pages that are pinned via
> + * get_user_pages*() can be separately tracked and uniquely handled. In
> + * particular, interactions with RDMA and filesystems need special
> + * handling.
> + */
> +static inline void put_user_page(struct page *page)
> +{
> + put_page(page);
> +}
> +
> +void put_user_pages_dirty(struct page **pages, unsigned long npages);
> +void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
> +void put_user_pages(struct page **pages, unsigned long npages);
> +
>  #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
>  #define SECTION_IN_PAGE_FLAGS
>  #endif
> diff --git a/mm/swap.c b/mm/swap.c
> index aa483719922e..bb8c32595e5f 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -133,6 +133,86 @@ void put_pages_list(struct list_head *pages)
>  }
>  EXPORT_SYMBOL(put_pages_list);
> 
> +typedef int (*set_dirty_func)(struct page *page);
> +
> +static void __put_user_pages_dirty(struct page **pages,
> +unsigned long npages,
> +set_dirty_func sdf)
> +{
> + unsigned long index;
> +
> + for (index = 0; index < npages; index++) {
> + struct page *page = compound_head(pages[index]);
> +
> + if (!PageDirty(page))
> + sdf(page);
> +
> + put_user_page(page);
> + }
> +}
> +
> +/*
> + * put_user_pages_dirty() - for each page in the @pages array, make
> + * that page (or its head page, if a compound page) dirty, if it was
> + * previously listed as clean. Then, release the page using
> + * put_user_page().
> + *
> + * Please see the put_user_page() documentation for details.
> + *
> + * set_page_dirty(), which does not lock the page, is used here.
> + * Therefore, it is the caller's responsibility to ensure that this is
> + * safe. If not, then put_user_pages_dirty_lock() should be called instead.
> + *
> + * @pages:  array of pages to be marked dirty and released.
> + * @npages: number of pages in the @pages array.

Please put the parameter descriptions next to the brief function
description, as described in [1]

[1] 
https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation
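
As a rough sketch of the layout the guide asks for (brief description
first, then the @parameter lines, then the longer text), the comment for
put_user_pages_dirty() might look like the following. The struct page stub
and the function body are userspace stand-ins so that the fragment compiles
on its own; only the comment layout is the point. Note that kernel-doc
comments open with /** rather than /*.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct page (assumption, demo only). */
struct page {
	int refcount;
	bool dirty;
};

/**
 * put_user_pages_dirty() - release and dirty an array of gup-pinned pages
 * @pages:  array of pages to be marked dirty and released.
 * @npages: number of pages in the @pages array.
 *
 * For each page in the @pages array, make that page dirty if it was
 * previously listed as clean, then release it.
 */
void put_user_pages_dirty(struct page **pages, unsigned long npages)
{
	unsigned long index;

	assert(pages != NULL || npages == 0);

	for (index = 0; index < npages; index++) {
		/* Stub behaviour: the real helper calls set_page_dirty(). */
		if (!pages[index]->dirty)
			pages[index]->dirty = true;
		/* Stub behaviour: the real helper calls put_user_page(). */
		pages[index]->refcount--;
	}
}
```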


> + *
> + */
> +void put_user_pages_dirty(struct page **pages, unsigned long npages)
> +{
> 

RE: rcu_preempt caused oom

2018-12-03 Thread He, Bo
Hi, Paul:
Enclosed is the log that triggers the 120s hung_task_panic without the other debug 
patches. The hung task is blocked at __wait_rcu_gp, which suggests that the RCU CPU 
stall detection does not catch this scenario, even with these settings:
echo 1 > /proc/sys/kernel/panic_on_rcu_stall
echo 7 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout
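
As a side note on the 7-second value: Paul mentions further down in this
thread that timeouts below three seconds are silently treated as three
seconds. A hypothetical sketch of that lower clamp follows. The helper name
is made up and the kernel's actual code is not reproduced here.

```c
/* Lower bound quoted in this thread: values below 3 s are treated as 3 s. */
#define STALL_TIMEOUT_MIN_SECS 3

/*
 * Hypothetical helper (not the kernel's function): returns the stall
 * timeout, in seconds, that would actually take effect for a request.
 */
static int effective_stall_timeout(int requested_secs)
{
	if (requested_secs < STALL_TIMEOUT_MIN_SECS)
		return STALL_TIMEOUT_MIN_SECS;
	return requested_secs;
}
```

So a request of 7 is used as given, while a request of 1 would behave like 3.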


-----Original Message-----
From: Paul E. McKenney  
Sent: Monday, December 3, 2018 9:57 PM
To: He, Bo 
Cc: Steven Rostedt ; linux-kernel@vger.kernel.org; 
j...@joshtriplett.org; mathieu.desnoy...@efficios.com; jiangshan...@gmail.com; 
Zhang, Jun ; Xiao, Jin ; Zhang, Yanmin 

Subject: Re: rcu_preempt caused oom

On Mon, Dec 03, 2018 at 07:44:03AM +, He, Bo wrote:
> Thanks, we ran the test for the whole weekend and did not reproduce the 
> issue, so we can confirm that CONFIG_RCU_BOOST fixes the issue.

Very good, that is encouraging.  Perhaps I should think about making 
CONFIG_RCU_BOOST=y the default for CONFIG_PREEMPT in mainline, at least for 
architectures for which rt_mutexes are implemented.

> We have enabled rcupdate.rcu_cpu_stall_timeout=7 and also set panic on 
> rcu stall, and will see if we can get the panic. We will keep you posted 
> with the test results.
> echo 1 > /proc/sys/kernel/panic_on_rcu_stall

Looking forward to seeing what is going on!  Of course, to reproduce, you will 
need to again build with CONFIG_RCU_BOOST=n.

Thanx, Paul

> -----Original Message-----
> From: Paul E. McKenney 
> Sent: Saturday, December 1, 2018 12:49 AM
> To: He, Bo 
> Cc: Steven Rostedt ; 
> linux-kernel@vger.kernel.org; j...@joshtriplett.org; 
> mathieu.desnoy...@efficios.com; jiangshan...@gmail.com; Zhang, Jun 
> ; Xiao, Jin ; Zhang, Yanmin 
> 
> Subject: Re: rcu_preempt caused oom
> 
> On Fri, Nov 30, 2018 at 03:18:58PM +, He, Bo wrote:
> > Here is the kernel cmdline:
> 
> Thank you!
> 
> > Kernel command line: androidboot.acpio_idx=0
> > androidboot.bootloader=efiwrapper-02_03-userdebug_kernelflinger-06_0
> > 3- userdebug androidboot.diskbus=00.0 
> > androidboot.verifiedbootstate=green
> > androidboot.bootreason=power-on androidboot.serialno=R1J56L6006a7bb
> > g_ffs.iSerialNumber=R1J56L6006a7bb no_timer_check noxsaves 
> > reboot_panic=p,w i915.hpd_sense_invert=0x7 mem=2G nokaslr nopti 
> > ftrace_dump_on_oops trace_buf_size=1024K intel_iommu=off gpt
> > loglevel=4 androidboot.hardware=gordon_peak 
> > firmware_class.path=/vendor/firmware relative_sleep_states=1
> > enforcing=0 androidboot.selinux=permissive cpu_init_udelay=10 
> > androidboot.android_dt_dir=/sys/bus/platform/devices/ANDR0001:00/pro
> > pe rties/android/ pstore.backend=ramoops memmap=0x140$0x5000
> > ramoops.mem_address=0x5000 ramoops.mem_size=0x140
> > ramoops.record_size=0x4000 ramoops.console_size=0x100
> > ramoops.ftrace_size=0x1 ramoops.dump_oops=1 vga=current
> > i915.modeset=1 drm.atomic=1 i915.nuclear_pageflip=1 
> > drm.vblankoffdelay=
> 
> And no sign of any suppression of RCU CPU stall warnings.  Hmmm...
> It does take more than 21 seconds to OOM?  Or do things happen faster than 
> that?  If they do happen faster than that, then one approach would be to add 
> something like this to the kernel command line:
> 
>   rcupdate.rcu_cpu_stall_timeout=7
> 
> This would set the stall timeout to seven seconds.  Note that timeouts less 
> than three seconds are silently interpreted as three seconds.
> 
>   Thanx, Paul
> 
> > -----Original Message-----
> > From: Steven Rostedt 
> > Sent: Friday, November 30, 2018 11:17 PM
> > To: Paul E. McKenney 
> > Cc: He, Bo ; linux-kernel@vger.kernel.org; 
> > j...@joshtriplett.org; mathieu.desnoy...@efficios.com; 
> > jiangshan...@gmail.com; Zhang, Jun ; Xiao, Jin 
> > ; Zhang, Yanmin 
> > Subject: Re: rcu_preempt caused oom
> > 
> > On Fri, 30 Nov 2018 06:43:17 -0800
> > "Paul E. McKenney"  wrote:
> > 
> > > Could you please send me your list of kernel boot parameters?  
> > > They usually appear near the start of your console output.
> > 
> > Or just: cat /proc/cmdline
> > 
> > -- Steve
> > 
> 





[PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

2018-12-03 Thread john . hubbard
From: John Hubbard 

Introduces put_user_page(), which simply calls put_page().
This provides a way to update all get_user_pages*() callers,
so that they call put_user_page(), instead of put_page().

Also introduces put_user_pages(), and a few dirty/locked variations,
as a replacement for release_pages(), and also as a replacement
for open-coded loops that release multiple pages.
These may be used for subsequent performance improvements,
via batching of pages to be released.
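
To illustrate the intended conversion, here is a minimal userspace sketch
of an open-coded release loop next to its put_user_pages() equivalent. The
struct page stub, the refcount model, and the release_*() caller names are
assumptions for the demo; the body of put_user_pages() is the assumed
placeholder shape (one put_user_page() per entry), not a statement about
the final implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for struct page; only a refcount is modelled. */
struct page {
	int refcount;
};

/* Stand-in for the kernel's put_page(): drop one reference. */
static void put_page(struct page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Placeholder version, as in this patch: simply calls put_page(). */
static void put_user_page(struct page *page)
{
	put_page(page);
}

/*
 * Assumed shape of the array helper: one put_user_page() per entry.
 * A later version could batch the releases instead.
 */
static void put_user_pages(struct page **pages, unsigned long npages)
{
	unsigned long index;

	for (index = 0; index < npages; index++)
		put_user_page(pages[index]);
}

/* Before: an open-coded release loop in a get_user_pages*() caller. */
static void release_open_coded(struct page **pages, unsigned long npages)
{
	unsigned long i;

	for (i = 0; i < npages; i++)
		put_page(pages[i]);
}

/* After: the same caller, converted to the new helper. */
static void release_converted(struct page **pages, unsigned long npages)
{
	put_user_pages(pages, npages);
}
```

With the placeholder versions, both callers leave the pages in the same
state, so the conversion should be behaviour-neutral at this stage.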

This is the first step of fixing the problem described in [1]. The steps
are:

1) (This patch): provide put_user_page*() routines, intended to be used
   for releasing pages that were pinned via get_user_pages*().

2) Convert all of the call sites for get_user_pages*(), to
   invoke put_user_page*(), instead of put_page(). This involves dozens of
   call sites, and will take some time.

3) After (2) is complete, use get_user_pages*() and put_user_page*() to
   implement tracking of these pages. This tracking will be separate from
   the existing struct page refcounting.

4) Use the tracking and identification of these pages, to implement
   special handling (especially in writeback paths) when the pages are
   backed by a filesystem. Again, [1] provides details as to why that is
   desirable.

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

Reviewed-by: Jan Kara 

Cc: Matthew Wilcox 
Cc: Michal Hocko 
Cc: Christopher Lameter 
Cc: Jason Gunthorpe 
Cc: Dan Williams 
Cc: Jan Kara 
Cc: Al Viro 
Cc: Jerome Glisse 
Cc: Christoph Hellwig 
Cc: Ralph Campbell 
Signed-off-by: John Hubbard 
---
 include/linux/mm.h | 20 
 mm/swap.c  | 80 ++
 2 files changed, 100 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5411de93a363..09fbb2c81aba 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -963,6 +963,26 @@ static inline void put_page(struct page *page)
__put_page(page);
 }
 
+/*
+ * put_user_page() - release a page that had previously been acquired via
+ * a call to one of the get_user_pages*() functions.
+ *
+ * Pages that were pinned via get_user_pages*() must be released via
+ * either put_user_page(), or one of the put_user_pages*() routines
+ * below. This is so that eventually, pages that are pinned via
+ * get_user_pages*() can be separately tracked and uniquely handled. In
+ * particular, interactions with RDMA and filesystems need special
+ * handling.
+ */
+static inline void put_user_page(struct page *page)
+{
+   put_page(page);
+}
+
+void put_user_pages_dirty(struct page **pages, unsigned long npages);
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
+void put_user_pages(struct page **pages, unsigned long npages);
+
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define SECTION_IN_PAGE_FLAGS
 #endif
diff --git a/mm/swap.c b/mm/swap.c
index aa483719922e..bb8c32595e5f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -133,6 +133,86 @@ void put_pages_list(struct list_head *pages)
 }
 EXPORT_SYMBOL(put_pages_list);
 
+typedef int (*set_dirty_func)(struct page *page);
+
+static void __put_user_pages_dirty(struct page **pages,
+  unsigned long npages,
+  set_dirty_func sdf)
+{
+   unsigned long index;
+
+   for (index = 0; index < npages; index++) {
+   struct page *page = compound_head(pages[index]);
+
+   if (!PageDirty(page))
+   sdf(page);
+
+   put_user_page(page);
+   }
+}
+
+/*
+ * put_user_pages_dirty() - for each page in the @pages array, make
+ * that page (or its head page, if a compound page) dirty, if it was
+ * previously listed as clean. Then, release the page using
+ * put_user_page().
+ *
+ * Please see the put_user_page() documentation for details.
+ *
+ * set_page_dirty(), which does not lock the page, is used here.
+ * Therefore, it is the caller's responsibility to ensure that this is
+ * safe. If not, then put_user_pages_dirty_lock() should be called instead.
+ *
+ * @pages:  array of pages to be marked dirty and released.
+ * @npages: number of pages in the @pages array.
+ *
+ */
+void put_user_pages_dirty(struct page **pages, unsigned long npages)
+{
+   __put_user_pages_dirty(pages, npages, set_page_dirty);
+}
+EXPORT_SYMBOL(put_user_pages_dirty);
+
+/*
+ * put_user_pages_dirty_lock() - for each page in the @pages array, make
+ * that page (or its head page, if a compound page) dirty, if it was
+ * previously listed as clean. Then, release the page using
+ * put_user_page().
+ *
+ * Please see the put_user_page() documentation for details.
+ *
+ * This is just like put_user_pages_dirty(), except that it invokes
+ * set_page_dirty_lock(), instead of set_page_dirty().
+ *
+ * @pages:  array of pages to be marked dirty and released.
+ * @npages: number of pages in the @pages array.
+ *
+ */
+void 

[PATCH 2/2] infiniband/mm: convert put_page() to put_user_page*()

2018-12-03 Thread john . hubbard
From: John Hubbard 

For infiniband code that retains pages via get_user_pages*(),
release those pages via the new put_user_page(), or
put_user_pages*(), instead of put_page().
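
As a sanity check on the single-page case in umem.c, here is a small
userspace model comparing the old and new loop bodies of
__ib_umem_release(). Everything in it is a stand-in: struct page only has a
refcount and a dirty bit, compound pages are ignored, and the locking done
by set_page_dirty_lock() is not modelled. The release_old(), release_new()
and conversions_agree() names are made up for the demo.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for struct page (demo only). */
struct page {
	int refcount;
	bool dirty;
};

static void put_page(struct page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

static void put_user_page(struct page *page)
{
	put_page(page);	/* placeholder version from patch 1/2 */
}

/* Modelled on __put_user_pages_dirty(); compound pages are ignored here. */
static void put_user_pages_dirty_lock(struct page **pages, unsigned long npages)
{
	unsigned long index;

	for (index = 0; index < npages; index++) {
		if (!pages[index]->dirty)
			pages[index]->dirty = true;	/* set_page_dirty_lock() */
		put_user_page(pages[index]);
	}
}

/* Old loop body. */
static void release_old(struct page *page, bool writable, bool dirty)
{
	if (!page->dirty && writable && dirty)
		page->dirty = true;	/* set_page_dirty_lock() */
	put_page(page);
}

/* New loop body after this patch. */
static void release_new(struct page *page, bool writable, bool dirty)
{
	if (writable && dirty)
		put_user_pages_dirty_lock(&page, 1);
	else
		put_user_page(page);
}

/* True if both versions leave a page in the same final state. */
static bool conversions_agree(bool page_dirty, bool writable, bool dirty)
{
	struct page a = { .refcount = 1, .dirty = page_dirty };
	struct page b = a;

	release_old(&a, writable, dirty);
	release_new(&b, writable, dirty);
	return a.refcount == b.refcount && a.dirty == b.dirty;
}
```

The PageDirty() test moves from the caller into the helper, so in this
model the two versions agree for every combination of the three flags.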

This is a tiny part of the second step of fixing the problem described
in [1]. The steps are:

1) Provide put_user_page*() routines, intended to be used
   for releasing pages that were pinned via get_user_pages*().

2) Convert all of the call sites for get_user_pages*(), to
   invoke put_user_page*(), instead of put_page(). This involves dozens of
   call sites, and will take some time.

3) After (2) is complete, use get_user_pages*() and put_user_page*() to
   implement tracking of these pages. This tracking will be separate from
   the existing struct page refcounting.

4) Use the tracking and identification of these pages, to implement
   special handling (especially in writeback paths) when the pages are
   backed by a filesystem. Again, [1] provides details as to why that is
   desirable.

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

Reviewed-by: Jan Kara 
Reviewed-by: Dennis Dalessandro 
Acked-by: Jason Gunthorpe 

Cc: Doug Ledford 
Cc: Jason Gunthorpe 
Cc: Mike Marciniszyn 
Cc: Dennis Dalessandro 
Cc: Christian Benvenuti 
Signed-off-by: John Hubbard 
---
 drivers/infiniband/core/umem.c  |  7 ---
 drivers/infiniband/core/umem_odp.c  |  2 +-
 drivers/infiniband/hw/hfi1/user_pages.c | 11 ---
 drivers/infiniband/hw/mthca/mthca_memfree.c |  6 +++---
 drivers/infiniband/hw/qib/qib_user_pages.c  | 11 ---
 drivers/infiniband/hw/qib/qib_user_sdma.c   |  6 +++---
 drivers/infiniband/hw/usnic/usnic_uiom.c|  7 ---
 7 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index c6144df47ea4..c2898bc7b3b2 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -58,9 +58,10 @@ static void __ib_umem_release(struct ib_device *dev, struct 
ib_umem *umem, int d
for_each_sg(umem->sg_head.sgl, sg, umem->npages, i) {
 
page = sg_page(sg);
-   if (!PageDirty(page) && umem->writable && dirty)
-   set_page_dirty_lock(page);
-   put_page(page);
+   if (umem->writable && dirty)
+   put_user_pages_dirty_lock(&page, 1);
+   else
+   put_user_page(page);
}
 
sg_free_table(&umem->sg_head);
diff --git a/drivers/infiniband/core/umem_odp.c 
b/drivers/infiniband/core/umem_odp.c
index 676c1fd1119d..99715049cd3b 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -659,7 +659,7 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, 
u64 user_virt,
ret = -EFAULT;
break;
}
-   put_page(local_page_list[j]);
+   put_user_page(local_page_list[j]);
continue;
}
 
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c 
b/drivers/infiniband/hw/hfi1/user_pages.c
index e341e6dcc388..99ccc0483711 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -121,13 +121,10 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, 
unsigned long vaddr, size_t np
 void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
 size_t npages, bool dirty)
 {
-   size_t i;
-
-   for (i = 0; i < npages; i++) {
-   if (dirty)
-   set_page_dirty_lock(p[i]);
-   put_page(p[i]);
-   }
+   if (dirty)
+   put_user_pages_dirty_lock(p, npages);
+   else
+   put_user_pages(p, npages);
 
if (mm) { /* during close after signal, mm can be NULL */
down_write(&mm->mmap_sem);
diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c 
b/drivers/infiniband/hw/mthca/mthca_memfree.c
index cc9c0c8ccba3..b8b12effd009 100644
--- a/drivers/infiniband/hw/mthca/mthca_memfree.c
+++ b/drivers/infiniband/hw/mthca/mthca_memfree.c
@@ -481,7 +481,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct 
mthca_uar *uar,
 
ret = pci_map_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
if (ret < 0) {
-   put_page(pages[0]);
+   put_user_page(pages[0]);
goto out;
}
 
@@ -489,7 +489,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct 
mthca_uar *uar,
 mthca_uarc_virt(dev, uar, i));
if (ret) {
pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, 
PCI_DMA_TODEVICE);
-   put_page(sg_page(&db_tab->page[i].mem));
+   put_user_page(sg_page(&db_tab->page[i].mem));
goto out;
}
 
@@ -555,7 +555,7 @@ void 

[PATCH 0/2] put_user_page*(): start converting the call sites

2018-12-03 Thread john . hubbard
From: John Hubbard 

Hi,

Summary: I'd like these two patches to go into the next convenient cycle.
I *think* that means 4.21.

Details

At the Linux Plumbers Conference, we talked about this approach [1], and
the primary lingering concern was over performance. Tom Talpey helped me
through a much more accurate run of the fio performance test, and now
it's looking like an under 1% performance cost, to add and remove pages
from the LRU (this is only paid when dealing with get_user_pages) [2]. So
we should be fine to start converting call sites.

This patchset gets the conversion started. Both patches already had a fair
amount of review.

(Tom, I'll add your Tested-by to the actual implementation that moves
pages on and off the LRU. These first two patches don't do that.)

[1] https://linuxplumbersconf.org/event/2/contributions/126/
"RDMA and get_user_pages"

[2] https://lore.kernel.org/r/79d1ee27-9ea0-3d15-3fc4-97c1bd79c...@talpey.com

John Hubbard (2):
  mm: introduce put_user_page*(), placeholder versions
  infiniband/mm: convert put_page() to put_user_page*()

 drivers/infiniband/core/umem.c  |  7 +-
 drivers/infiniband/core/umem_odp.c  |  2 +-
 drivers/infiniband/hw/hfi1/user_pages.c | 11 ++-
 drivers/infiniband/hw/mthca/mthca_memfree.c |  6 +-
 drivers/infiniband/hw/qib/qib_user_pages.c  | 11 ++-
 drivers/infiniband/hw/qib/qib_user_sdma.c   |  6 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c|  7 +-
 include/linux/mm.h  | 20 ++
 mm/swap.c   | 80 +
 9 files changed, 123 insertions(+), 27 deletions(-)

-- 
2.19.2



RE: [PATCH V6 0/9] clk: add imx7ulp clk support

2018-12-03 Thread Aisheng DONG
> -----Original Message-----
> From: Stephen Boyd [mailto:sb...@kernel.org]
> Sent: Tuesday, December 4, 2018 3:32 AM
> To: Aisheng DONG ; linux-...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> mturque...@baylibre.com; shawn...@kernel.org; Anson Huang
> ; Jacky Bai ; dl-linux-imx
> ; Aisheng DONG 
> Subject: Re: [PATCH V6 0/9] clk: add imx7ulp clk support
> 
> Quoting A.s. Dong (2018-11-14 05:01:31)
> > This patch series intends to add imx7ulp clk support.
> >
> > i.MX7ULP Clock functions are under joint control of the System Clock
> > Generation (SCG) modules, Peripheral Clock Control (PCC) modules, and
> > Core Mode Controller (CMC)1 blocks
> >
> > The clocking scheme provides clear separation between M4 domain and A7
> > domain. Except for a few clock sources shared between two domains,
> > such as the System Oscillator clock, the Slow IRC (SIRC), and the
> > Fast IRC clock (FIRCLK), clock sources and clock management are
> > separated and contained within each domain.
> >
> > M4 clock management consists of SCG0, PCC0, PCC1, and CMC0 modules.
> > A7 clock management consists of SCG1, PCC2, PCC3, and CMC1 modules.
> >
> > Note: this series only adds A7 clock domain support as M4 clock domain
> > will be handled by M4 separately.
> >
> 
> I got:
> 
> drivers/clk/imx/clk-pllv4.c:152:15: warning: symbol 'imx_clk_pllv4' was not
> declared. Should it be static?
> drivers/clk/imx/clk-pfdv2.c:166:15: warning: symbol 'imx_clk_pfdv2' was not
> declared. Should it be static?
> drivers/clk/imx/clk-divider-gate.c:174:15: warning: symbol
> 'imx_clk_divider_gate' was not declared. Should it be static?
> drivers/clk/imx/clk-composite-7ulp.c:22:15: warning: symbol
> 'imx7ulp_clk_composite' was not declared. Should it be static?
> 
> which I can fix easily by throwing in clk.h into each file.

Thanks, I will double check it when I am back in the office.
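
For anyone following along, a minimal sketch of what sparse is complaining
about, using a made-up function name rather than the real imx helpers: a
non-static function that is defined without a prior declaration in scope.
Including the header that carries the prototype (clk.h in Stephen's
suggestion) puts the declaration in scope and should silence the warning.

```c
#include <assert.h>

/*
 * In a real driver this prototype would live in a shared header.
 * It is inlined here so the sketch is a single file; the name
 * demo_clk_rate_khz() is hypothetical.
 */
unsigned long demo_clk_rate_khz(unsigned long parent_khz, unsigned int div);

/*
 * With the declaration above in scope, sparse has no reason to report
 * "symbol 'demo_clk_rate_khz' was not declared. Should it be static?".
 */
unsigned long demo_clk_rate_khz(unsigned long parent_khz, unsigned int div)
{
	assert(div != 0);
	return parent_khz / div;
}
```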

Regards
Dong Aisheng



Re: [PATCH] net: stmmac: convert to DEFINE_SHOW_ATTRIBUTE

2018-12-03 Thread David Miller
From: Yangtao Li 
Date: Mon,  3 Dec 2018 09:22:09 -0500

> Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
> 
> Signed-off-by: Yangtao Li 

Applied.


Re: [PATCH v9 2/4] seccomp: switch system call argument type to void *

2018-12-03 Thread Tycho Andersen
On Mon, Dec 03, 2018 at 07:17:26PM -0700, Tycho Andersen wrote:
> On Tue, Dec 04, 2018 at 10:07:38AM +0800, kbuild test robot wrote:
> > Hi Tycho,
> > 
> > I love your patch! Yet something to improve:
> > 
> > [auto build test ERROR on linus/master]
> > [also build test ERROR on v4.20-rc5 next-20181203]
> > [if your patch is applied to the wrong git tree, please drop us a note to 
> > help improve the system]
> > 
> > url:
> > https://github.com/0day-ci/linux/commits/Tycho-Andersen/seccomp-hoist-struct-seccomp_data-recalculation-higher/20181204-013450
> > config: i386-randconfig-x005-201848 (attached as .config)
> > compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
> > reproduce:
> > # save the attached .config to linux build tree
> > make ARCH=i386 
> > 
> > All errors (new ones prefixed by >>):
> > 
> >In file included from kernel/seccomp.c:28:0:
> > >> include/linux/syscalls.h:239:18: error: conflicting types for 
> > >> 'sys_seccomp'
> >  asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> >  ^
> >include/linux/syscalls.h:225:2: note: in expansion of macro 
> > '__SYSCALL_DEFINEx'
> >  __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
> >  ^
> >include/linux/syscalls.h:216:36: note: in expansion of macro 
> > 'SYSCALL_DEFINEx'
> > #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, 
> > __VA_ARGS__)
> >^~~
> >kernel/seccomp.c:946:1: note: in expansion of macro 'SYSCALL_DEFINE3'
> > SYSCALL_DEFINE3(seccomp, unsigned int, op, unsigned int, flags,
> > ^~~
> >In file included from kernel/seccomp.c:28:0:
> >include/linux/syscalls.h:881:17: note: previous declaration of 
> > 'sys_seccomp' was here
> > asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
> > ^~~
> 
> Huh, I have no idea why I don't see this, but even with the attached
> config it still doesn't cause a problem for me. Anyway, I'll fix it up
> and do some more investigating...

Oh, because it's "make ARCH=i386". Whoosh :)

Anyway, it's fixed for v10.

Tycho


Query regarding Spectre fixes for qemu-kvm...4.4 LTS Kernel.

2018-12-03 Thread Arackal, Paulose Kuriakose (STSD)
Hi,

I have a few queries regarding qemu-kvm support for the Spectre-related fixes 
in the 4.4.* LTS kernel.

I see that in upstream kernels, svm_vcpu_run() calls x86_spec_ctrl_set_guest() 
and x86_spec_ctrl_restore_host(). These call into x86_virt_spec_ctrl(), which 
sets the IBRS/IBPB/SSBD bits accordingly for the guest context.

Related commit IDs below:
commit 5cf687548705412da47c9cec342fd952d71ed3d5
commit ccbcd2674472a978b48c91c1fbfb66c0ff959f24

It looks like this change has not been fully ported to 4.4 LTS yet. The 
x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() interfaces are 
available, but svm_vcpu_run() does not appear to call them. 
So qemu-kvm running on 4.4 kernels may not have SPEC_CTRL set properly in the 
guest context.

Is there a plan to fully backport the above changes to the 4.4 LTS kernel?

Thanks,
Paulose. 



Re: [PATCH v5 8/8] soc: qcom: rpmhpd: Mark mx as a parent for cx

2018-12-03 Thread Viresh Kumar
On 04-12-18, 10:51, Rajendra Nayak wrote:
> Specify the active + sleep and active-only MX power domains as
> the parents of the corresponding CX power domains. This will ensure that
> performance state requests on CX automatically generate equivalent requests
> on MX power domains.
> 
> This is used to enforce a requirement that exists for various
> hardware blocks on SDM845 that MX performance state >= CX performance
> state for a given operating frequency.
> 
> Signed-off-by: Rajendra Nayak 
> ---
> This patch is dependent on the series from
> Viresh [1] which adds support to propagate performance states across the
> power domain hierarchy which is still being reviewed.
> 
> [1] https://lkml.org/lkml/2018/11/26/333
> 
>  drivers/soc/qcom/rpmhpd.c | 11 +++
>  1 file changed, 11 insertions(+)

Acked-by: Viresh Kumar 
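
For readers new to the parent/child arrangement, a hypothetical one-level
model of the behaviour described above: a performance state requested on
the child (CX) is forwarded to the parent (MX), so the parent never ends up
below the child. The demo_* names and the struct are invented for
illustration, they are not the genpd data structures, and the sketch
assumes both domains use the same state numbering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a power domain; not the genpd data structure. */
struct demo_domain {
	struct demo_domain *parent;
	unsigned int own_state;		/* state requested on this domain */
	unsigned int child_state;	/* state requested through a child */
};

/* A domain runs at the higher of its own and its child's request. */
static unsigned int demo_effective_state(const struct demo_domain *pd)
{
	return pd->own_state > pd->child_state ? pd->own_state : pd->child_state;
}

/* Record a request and forward the resulting state to the parent, if any. */
static void demo_set_performance_state(struct demo_domain *pd, unsigned int state)
{
	assert(pd != NULL);
	pd->own_state = state;
	if (pd->parent)
		pd->parent->child_state = demo_effective_state(pd);
}
```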

-- 
viresh


Re: [PATCH v2 1/5] devfreq: refactor set_target frequency function

2018-12-03 Thread Chanwoo Choi
Hi,

On 2018년 12월 04일 13:39, Chanwoo Choi wrote:
> Hi Lukasz,
> 
> On 2018년 12월 03일 23:31, Lukasz Luba wrote:
>> The refactoring is needed for the new client in devfreq: suspend.
>> To avoid code duplication, move it to the new local function
>> devfreq_set_target.
>>
>> The patch is based on earlier work by Tobias Jakobi.
> 
> As I already commented, please remove it. You already mentioned it in the 
> cover letter.
> If you want to record Tobias's contribution history, it would be better
> to add a 'Signed-off-by' or a similar tag.

If you fix it, feel free to add my tag:
Reviewed-by: Chanwoo Choi 

> 
>>
>> Suggested-by: Tobias Jakobi 
>> Suggested-by: Chanwoo Choi 
>> Signed-off-by: Lukasz Luba 
>> ---
>>  drivers/devfreq/devfreq.c | 62 
>> +++
>>  1 file changed, 36 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
>> index 1414130..a9fd61b 100644
>> --- a/drivers/devfreq/devfreq.c
>> +++ b/drivers/devfreq/devfreq.c
>> @@ -285,6 +285,40 @@ static int devfreq_notify_transition(struct devfreq 
>> *devfreq,
>>  return 0;
>>  }
>>  
>> +static int devfreq_set_target(struct devfreq *devfreq, unsigned long 
>> new_freq,
>> +  u32 flags)
>> +{
>> +struct devfreq_freqs freqs;
>> +unsigned long cur_freq;
>> +int err = 0;
>> +
>> +if (devfreq->profile->get_cur_freq)
>> +devfreq->profile->get_cur_freq(devfreq->dev.parent, &cur_freq);
>> +else
>> +cur_freq = devfreq->previous_freq;
>> +
>> +freqs.old = cur_freq;
>> +freqs.new = new_freq;
>> +devfreq_notify_transition(devfreq, &freqs, DEVFREQ_PRECHANGE);
>> +
>> +err = devfreq->profile->target(devfreq->dev.parent, &new_freq, flags);
>> +if (err) {
>> +freqs.new = cur_freq;
>> +devfreq_notify_transition(devfreq, &freqs, DEVFREQ_POSTCHANGE);
>> +return err;
>> +}
>> +
>> +freqs.new = new_freq;
>> +devfreq_notify_transition(devfreq, &freqs, DEVFREQ_POSTCHANGE);
>> +
>> +if (devfreq_update_status(devfreq, new_freq))
>> +dev_err(&devfreq->dev,
>> +"Couldn't update frequency transition information.\n");
>> +
>> +devfreq->previous_freq = new_freq;
>> +return err;
>> +}
>> +
>>  /* Load monitoring helper functions for governors use */
>>  
>>  /**
>> @@ -296,8 +330,7 @@ static int devfreq_notify_transition(struct devfreq 
>> *devfreq,
>>   */
>>  int update_devfreq(struct devfreq *devfreq)
>>  {
>> -struct devfreq_freqs freqs;
>> -unsigned long freq, cur_freq, min_freq, max_freq;
>> +unsigned long freq, min_freq, max_freq;
>>  int err = 0;
>>  u32 flags = 0;
>>  
>> @@ -333,31 +366,8 @@ int update_devfreq(struct devfreq *devfreq)
>>  flags |= DEVFREQ_FLAG_LEAST_UPPER_BOUND; /* Use LUB */
>>  }
>>  
>> -if (devfreq->profile->get_cur_freq)
>> -devfreq->profile->get_cur_freq(devfreq->dev.parent, &cur_freq);
>> -else
>> -cur_freq = devfreq->previous_freq;
>> -
>> -freqs.old = cur_freq;
>> -freqs.new = freq;
>> -devfreq_notify_transition(devfreq, &freqs, DEVFREQ_PRECHANGE);
>> +return devfreq_set_target(devfreq, freq, flags);
>>  
>> -err = devfreq->profile->target(devfreq->dev.parent, &freq, flags);
>> -if (err) {
>> -freqs.new = cur_freq;
>> -devfreq_notify_transition(devfreq, &freqs, DEVFREQ_POSTCHANGE);
>> -return err;
>> -}
>> -
>> -freqs.new = freq;
>> -devfreq_notify_transition(devfreq, &freqs, DEVFREQ_POSTCHANGE);
>> -
>> -if (devfreq_update_status(devfreq, freq))
>> -dev_err(&devfreq->dev,
>> -"Couldn't update frequency transition information.\n");
>> -
>> -devfreq->previous_freq = freq;
>> -return err;
>>  }
>>  EXPORT_SYMBOL(update_devfreq);
>>  
>>
> 
> 
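
For reference, a cut-down userspace sketch of the control flow that the
new helper factors out: notify PRECHANGE, call the driver's target
callback, then notify POSTCHANGE with either the new frequency or, on
failure, the old one. All demo_* names are invented for illustration. The
get_cur_freq() path, the flags argument and devfreq_update_status() are
left out, so this is a model of the pattern rather than the patch itself.

```c
#include <assert.h>
#include <stddef.h>

enum demo_event { DEMO_PRECHANGE, DEMO_POSTCHANGE };

struct demo_freqs {
	unsigned long old;
	unsigned long new;
};

/* Hypothetical, cut-down stand-in for struct devfreq. */
struct demo_devfreq {
	unsigned long previous_freq;
	/* Driver callback; returns 0 on success, negative on failure. */
	int (*target)(unsigned long *freq);
	/* Last notification seen, recorded so a caller can inspect it. */
	enum demo_event last_event;
	struct demo_freqs last_freqs;
};

static void demo_notify(struct demo_devfreq *df,
			const struct demo_freqs *freqs, enum demo_event ev)
{
	df->last_event = ev;
	df->last_freqs = *freqs;
}

static int demo_set_target(struct demo_devfreq *df, unsigned long new_freq)
{
	struct demo_freqs freqs;
	unsigned long cur_freq;
	int err;

	assert(df != NULL && df->target != NULL);
	cur_freq = df->previous_freq;

	freqs.old = cur_freq;
	freqs.new = new_freq;
	demo_notify(df, &freqs, DEMO_PRECHANGE);

	err = df->target(&new_freq);
	if (err) {
		/* Failure: report that the frequency did not change. */
		freqs.new = cur_freq;
		demo_notify(df, &freqs, DEMO_POSTCHANGE);
		return err;
	}

	freqs.new = new_freq;
	demo_notify(df, &freqs, DEMO_POSTCHANGE);
	df->previous_freq = new_freq;
	return 0;
}

/* Two trivial driver callbacks for exercising both paths. */
static int demo_target_ok(unsigned long *freq) { (void)freq; return 0; }
static int demo_target_fail(unsigned long *freq) { (void)freq; return -1; }
```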


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


[PATCH v3] panic: Avoid the extra noise dmesg

2018-12-03 Thread Feng Tang
When a kernel panic happens, the kernel first prints the panic call stack,
then an ending message like:

[   35.743249] ---[ end Kernel panic - not syncing: Fatal exception
[   35.749975] [ cut here ]

The above messages are very useful for debugging.

But if the system is configured not to reboot on panic, say the "panic_timeout"
parameter equals 0, it will likely print out many noisy messages, such as a
WARN() call stack for each and every CPU except the panicking one, like
below:

WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 
set_task_cpu+0x183/0x190
Call Trace:

try_to_wake_up
default_wake_function
autoremove_wake_function
__wake_up_common
__wake_up_common_lock
__wake_up
wake_up_klogd_work_func
irq_work_run_list
irq_work_tick
update_process_times
tick_sched_timer
__hrtimer_run_queues
hrtimer_interrupt
smp_apic_timer_interrupt
apic_timer_interrupt

For people working in console mode, the screen will first show the panic
call stack, which is immediately overwritten by these noisy extra messages.
This makes debugging much more difficult, as the original context gets lost
on screen.

Also, these noisy messages confuse some users: I have seen many bug
reporters post the noisy messages to bugzilla instead of the real panic
call stack and context.

Removing the "local_irq_enable" will avoid the noisy messages.

The justification for the removal is: when the code reaches this point, the
user has chosen not to reboot and not to do any special handling via the
panic notifier method, so there is not much point in re-enabling interrupts.

Signed-off-by: Feng Tang 
Cc: Thomas Gleixner 
Cc: Kees Cook 
Cc: Borislav Petkov 
Cc: sta...@kernel.org
---
 kernel/panic.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index f6d549a..a616e55 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -295,7 +295,6 @@ void panic(const char *fmt, ...)
}
 #endif
pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
-   local_irq_enable();
for (i = 0; ; i += PANIC_TIMER_STEP) {
touch_softlockup_watchdog();
if (i >= i_next) {
-- 
2.7.4



linux-next: manual merge of the akpm tree with the pm tree

2018-12-03 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm tree got a conflict in:

  fs/exec.c

between commit:

  67fe1224adc5 ("Revert "exec: make de_thread() freezable"")

from the pm tree and patch:

  "fs/: remove caller signal_pending branch predictions"

from the akpm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc fs/exec.c
index ea7d439cf79e,044e296f2381..
--- a/fs/exec.c
+++ b/fs/exec.c
@@@ -1086,8 -1087,8 +1086,8 @@@ static int de_thread(struct task_struc
while (sig->notify_count) {
__set_current_state(TASK_KILLABLE);
spin_unlock_irq(lock);
 -  freezable_schedule();
 +  schedule();
-   if (unlikely(__fatal_signal_pending(tsk)))
+   if (__fatal_signal_pending(tsk))
goto killed;
spin_lock_irq(lock);
}
@@@ -1114,8 -1115,8 +1114,8 @@@
__set_current_state(TASK_KILLABLE);
write_unlock_irq(_lock);
cgroup_threadgroup_change_end(tsk);
 -  freezable_schedule();
 +  schedule();
-   if (unlikely(__fatal_signal_pending(tsk)))
+   if (__fatal_signal_pending(tsk))
goto killed;
}
  




Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 11:53 AM David Rientjes  wrote:
>
> On Tue, 4 Dec 2018, Pingfan Liu wrote:
>
> > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > index 76f8db0..8324953 100644
> > --- a/include/linux/gfp.h
> > +++ b/include/linux/gfp.h
> > @@ -453,6 +453,8 @@ static inline int gfp_zonelist(gfp_t flags)
> >   */
> >  static inline struct zonelist *node_zonelist(int nid, gfp_t flags)
> >  {
> > + if (unlikely(!node_online(nid)))
> > + nid = first_online_node;
> >   return NODE_DATA(nid)->node_zonelists + gfp_zonelist(flags);
> >  }
> >
>
> So we're passing the node id from dev_to_node() to kmalloc which
> interprets that as the preferred node and then does node_zonelist() to
> find the zonelist at allocation time.
>
> What happens if we fix this in alloc_dr()?  Does anything else cause
> problems?
>
I think it is better to fix it in mm, since that also protects against any
similar new bug in the future, while fixing it in alloc_dr() only works for
the present case.

> And rather than using first_online_node, would next_online_node() work?
>
What is the gain? Is it for memory pressure on node0?

Thanks,
Pingfan

> I'm thinking about this:
>
> diff --git a/drivers/base/devres.c b/drivers/base/devres.c
> --- a/drivers/base/devres.c
> +++ b/drivers/base/devres.c
> @@ -100,6 +100,8 @@ static __always_inline struct devres * 
> alloc_dr(dr_release_t release,
> _size)))
> return NULL;
>
> +   if (unlikely(!node_online(nid)))
> +   nid = next_online_node(nid);
> dr = kmalloc_node_track_caller(tot_size, gfp, nid);
> if (unlikely(!dr))
> return NULL;


Re: [PATCH 3/9] tools/lib/traceevent: Install trace-seq.h API header file

2018-12-03 Thread Namhyung Kim
On Fri, Nov 30, 2018 at 10:44:06AM -0500, Steven Rostedt wrote:
> From: Tzvetomir Stoyanov 
> 
> This patch installs trace-seq.h header file on "make install".
> 
> Signed-off-by: Tzvetomir Stoyanov 
> Signed-off-by: Steven Rostedt (VMware) 
> ---
>  tools/lib/traceevent/Makefile | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
> index adb16f845ab3..67fe5d7ef190 100644
> --- a/tools/lib/traceevent/Makefile
> +++ b/tools/lib/traceevent/Makefile
> @@ -285,7 +285,7 @@ define do_install_pkgconfig_file
>   fi
>  endef
>  
> -install_lib: all_cmd install_plugins install_pkgconfig
> +install_lib: all_cmd install_plugins install_headers install_pkgconfig
>   $(call QUIET_INSTALL, $(LIB_TARGET)) \
>   $(call do_install_mkdir,$(libdir_SQ)); \
>   cp -fpR $(LIB_INSTALL) $(DESTDIR)$(libdir_SQ)
> @@ -302,6 +302,7 @@ install_headers:
>   $(call QUIET_INSTALL, headers) \
>   $(call 
> do_install,event-parse.h,$(prefix)/include/traceevent,644); \
>   $(call 
> do_install,event-utils.h,$(prefix)/include/traceevent,644); \
> + $(call 
> do_install,trace-seq.h,$(prefix)/include/traceevent,644); \
>   $(call do_install,kbuffer.h,$(prefix)/include/traceevent,644)

Do you still want to have the 'traceevent' directory prefix?  I
sometimes find it a bit annoying to type. ;-)

Or you could rename it to something like 'tep' or 'libtep' - and hopefully
have only a single header file to include..

Thanks,
Namhyung


>  
>  install: install_lib
> -- 
> 2.19.1
> 
> 


Re: [PATCH v11 1/2] dt-bindings: cpufreq: Introduce QCOM CPUFREQ Firmware bindings

2018-12-03 Thread Stephen Boyd
Quoting Rob Herring (2018-12-03 15:09:07)
> On Sun, Dec 02, 2018 at 09:25:02AM +0530, Taniya Das wrote:
> > Add QCOM cpufreq firmware device bindings for Qualcomm Technology Inc's
> > SoCs. This is required for managing the cpu frequency transitions which are
> > controlled by the hardware engine.
> > 
> > Signed-off-by: Taniya Das 
> > ---
> >  .../bindings/cpufreq/cpufreq-qcom-hw.txt   | 172 
> > +
> >  1 file changed, 172 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.txt 
> > b/Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.txt
> > new file mode 100644
> > index 000..2b82965
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.txt
> > @@ -0,0 +1,172 @@
> > +Qualcomm Technologies, Inc. CPUFREQ Bindings
> > +
> > +CPUFREQ HW is a hardware engine used by some Qualcomm Technologies, Inc. 
> > (QTI)
> > +SoCs to manage frequency in hardware. It is capable of controlling 
> > frequency
> > +for multiple clusters.
> > +
> > +Properties:
> > +- compatible
> > + Usage:  required
> > + Value type: 
> > + Definition: must be "qcom,cpufreq-hw".
> > +
> > +- clocks
> > + Usage:  required
> > + Value type:  From common clock binding.
> > + Definition: clock handle for XO clock and GPLL0 clock.
> > +
> > +- clock-names
> > + Usage:  required
> > + Value type:  From common clock binding.
> > + Definition: must be "xo", "alternate".
> > +
> > +- reg
> > + Usage:  required
> > + Value type: 
> > + Definition: Addresses and sizes for the memory of the HW bases in
> > + each frequency domain.
> > +- reg-names
> > + Usage:  Optional
> > + Value type: 
> > + Definition: Frequency domain name i.e.
> > + "freq-domain0", "freq-domain1".
> > +
> > +- freq-domain-cells:
> 
> #freq-domain-cells
> 
> Otherwise,
> 
> Reviewed-by: Rob Herring 

Or should it be #qcom,freq-domain-cells? That would match the same stem
of the property used in the cpu node.




Re: [PATCH] power: reset: gpio-poweroff: add ability to specific active and inactive delays

2018-12-03 Thread Moritz Fischer
On Mon, Dec 3, 2018 at 3:49 PM Rob Herring  wrote:
>
> On Sun, 11 Nov 2018 22:45:38 +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner 
> >
> > Similar to gpio-reset allow to specify active and inactive delays
> > while keeping the 100ms defaults that were used previously all the time.
> >
> > The dt-properties are named the same as in gpio-reset but get an "-ms"
> > suffix as properties should contain such a suffix specifying its unit.
> >
> > Signed-off-by: Heiko Stuebner 
> > ---
> >  .../devicetree/bindings/power/reset/gpio-poweroff.txt  |  2 ++
> >  drivers/power/reset/gpio-poweroff.c| 10 --
> >  2 files changed, 10 insertions(+), 2 deletions(-)
> >
>
> Reviewed-by: Rob Herring 
Reviewed-by: Moritz Fischer 


Re: [PATCH linux-next v2 6/6] ASoC: rsnd: add avb clocks

2018-12-03 Thread Kuninori Morimoto

Hi Jiada

> There are AVB Counter Clocks in ADG, each clock has 12bits integral
> and 8 bits fractional dividers which operates with S0D1ϕ clock.
> 
> This patch registers 8 AVB Counter Clocks when clock-cells of
> rcar_sound node is 2,
> 
> Signed-off-by: Jiada Wang 
> ---
>  sound/soc/sh/rcar/adg.c  | 306 +--
>  sound/soc/sh/rcar/gen.c  |   9 ++
>  sound/soc/sh/rcar/rsnd.h |   9 ++
>  3 files changed, 315 insertions(+), 9 deletions(-)

Please update DT binding txt, too

Best regards
---
Kuninori Morimoto


[PATCH v5 03/13] iov_iter: pass void csum pointer to csum_and_copy_to_iter

2018-12-03 Thread Sagi Grimberg
From: Sagi Grimberg 

The single caller to csum_and_copy_to_iter is skb_copy_and_csum_datagram
and we are trying to unite its logic with skb_copy_datagram_iter by passing
a callback to the copy function that we want to apply. Thus, we need
to make the checksum pointer private to the function.

Acked-by: David S. Miller 
Signed-off-by: Sagi Grimberg 
---
 include/linux/uio.h | 2 +-
 lib/iov_iter.c  | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 55ce99ddb912..41d1f8d3313d 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -266,7 +266,7 @@ static inline void iov_iter_reexpand(struct iov_iter *i, 
size_t count)
 {
i->count = count;
 }
-size_t csum_and_copy_to_iter(const void *addr, size_t bytes, __wsum *csum, 
struct iov_iter *i);
+size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump, 
struct iov_iter *i);
 size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct 
iov_iter *i);
 bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum, 
struct iov_iter *i);
 
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 7ebccb5c1637..db93531ca3e3 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1432,10 +1432,11 @@ bool csum_and_copy_from_iter_full(void *addr, size_t 
bytes, __wsum *csum,
 }
 EXPORT_SYMBOL(csum_and_copy_from_iter_full);
 
-size_t csum_and_copy_to_iter(const void *addr, size_t bytes, __wsum *csum,
+size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 struct iov_iter *i)
 {
const char *from = addr;
+   __wsum *csum = csump;
__wsum sum, next;
size_t off = 0;
sum = *csum;
-- 
2.17.1



[PATCH v5 02/13] datagram: open-code copy_page_to_iter

2018-12-03 Thread Sagi Grimberg
From: Sagi Grimberg 

This will be useful to consolidate skb_copy_and_hash_datagram_iter and
skb_copy_and_csum_datagram to a single code path.

Acked-by: David S. Miller 
Signed-off-by: Sagi Grimberg 
---
 net/core/datagram.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index 57f3a6fcfc1e..abe642181b64 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -445,11 +445,14 @@ int skb_copy_datagram_iter(const struct sk_buff *skb, int 
offset,
 
end = start + skb_frag_size(frag);
if ((copy = end - offset) > 0) {
+   struct page *page = skb_frag_page(frag);
+   u8 *vaddr = kmap(page);
+
if (copy > len)
copy = len;
-   n = copy_page_to_iter(skb_frag_page(frag),
- frag->page_offset + offset -
- start, copy, to);
+   n = copy_to_iter(vaddr + frag->page_offset +
+offset - start, copy, to);
+   kunmap(page);
offset += n;
if (n != copy)
goto short_copy;
-- 
2.17.1



Re: [PATCH v11 2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

2018-12-03 Thread Viresh Kumar
Hi Taniya,

Sorry that I haven't reviewed this much over the last few iterations, as I was
letting others get it into better shape. Thanks for your efforts.

On 02-12-18, 09:25, Taniya Das wrote:
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c

> +struct cpufreq_qcom {
> + struct cpufreq_frequency_table *table;
> + void __iomem *perf_state_reg;
> + cpumask_t related_cpus;
> +};
> +
> +static struct cpufreq_qcom *qcom_freq_domain_map[NR_CPUS];

Now that the code is much more simplified, I am not sure if you need this
per-cpu structure at all. The only place where you are using it is in
qcom_cpufreq_hw_cpu_init() and probe(). Why not merge qcom_cpu_resources_init()
completely into qcom_cpufreq_hw_cpu_init() and get rid of this structure
entirely ?

-- 
viresh


Re: [PATCH] x86/boot: clear rsdp address in boot_params for broken loaders

2018-12-03 Thread Juergen Gross
On 04/12/2018 00:07, h...@zytor.com wrote:
> On December 3, 2018 2:38:11 AM PST, Juergen Gross  wrote:
>> In case a broken boot loader doesn't clear its struct boot_params clear
>> rsdp_addr in sanitize_boot_params().
>>
>> This fixes commit e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP
>> address from boot params if available") e.g. for the case of a boot via
>> systemd-boot.
>>
>> Fixes: e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP address from boot
>> params if available")
>> Reported-by: Gunnar Krueger 
>> Tested-by: Gunnar Krueger 
>> Signed-off-by: Juergen Gross 
>> ---
>> arch/x86/include/asm/bootparam_utils.h | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/x86/include/asm/bootparam_utils.h
>> b/arch/x86/include/asm/bootparam_utils.h
>> index a07ffd23e4dd..f6f6ef436599 100644
>> --- a/arch/x86/include/asm/bootparam_utils.h
>> +++ b/arch/x86/include/asm/bootparam_utils.h
>> @@ -36,6 +36,7 @@ static void sanitize_boot_params(struct boot_params
>> *boot_params)
>>   */
>>  if (boot_params->sentinel) {
>>  /* fields in boot_params are left uninitialized, clear them */
>> +boot_params->acpi_rsdp_addr = 0;
>>  memset(_params->ext_ramdisk_image, 0,
>> (char *)_params->efi_info -
>>  (char *)_params->ext_ramdisk_image);
> 
> Isn't this already covered by the memset()? If not, we should extend the 
> memset() to maximal coverage.

I'd like to send a followup patch doing that. And I'd like to not only
test sentinel for being non-zero, but all padding fields as well. This
should be 4.21 material, though.


Juergen


[PATCH] spi: lpspi: Add i.MX8 boards support for lpspi

2018-12-03 Thread Clark Wang
Add both ipg and per clock for lpspi to support i.MX8QM/QXP boards.

Signed-off-by: Clark Wang 
---
 drivers/spi/spi-fsl-lpspi.c | 52 +
 1 file changed, 41 insertions(+), 11 deletions(-)

diff --git a/drivers/spi/spi-fsl-lpspi.c b/drivers/spi/spi-fsl-lpspi.c
index 08dcc3c22e88..5802f188051b 100644
--- a/drivers/spi/spi-fsl-lpspi.c
+++ b/drivers/spi/spi-fsl-lpspi.c
@@ -80,7 +80,8 @@ struct lpspi_config {
 struct fsl_lpspi_data {
struct device *dev;
void __iomem *base;
-   struct clk *clk;
+   struct clk *clk_ipg;
+   struct clk *clk_per;
bool is_slave;
 
void *rx_buf;
@@ -147,8 +148,19 @@ static int lpspi_prepare_xfer_hardware(struct 
spi_controller *controller)
 {
struct fsl_lpspi_data *fsl_lpspi =
spi_controller_get_devdata(controller);
+   int ret;
+
+   ret = clk_prepare_enable(fsl_lpspi->clk_ipg);
+   if (ret)
+   return ret;
+
+   ret = clk_prepare_enable(fsl_lpspi->clk_per);
+   if (ret) {
+   clk_disable_unprepare(fsl_lpspi->clk_ipg);
+   return ret;
+   }
 
-   return clk_prepare_enable(fsl_lpspi->clk);
+   return 0;
 }
 
 static int lpspi_unprepare_xfer_hardware(struct spi_controller *controller)
@@ -156,7 +168,8 @@ static int lpspi_unprepare_xfer_hardware(struct 
spi_controller *controller)
struct fsl_lpspi_data *fsl_lpspi =
spi_controller_get_devdata(controller);
 
-   clk_disable_unprepare(fsl_lpspi->clk);
+   clk_disable_unprepare(fsl_lpspi->clk_ipg);
+   clk_disable_unprepare(fsl_lpspi->clk_per);
 
return 0;
 }
@@ -249,7 +262,7 @@ static int fsl_lpspi_set_bitrate(struct fsl_lpspi_data 
*fsl_lpspi)
unsigned int perclk_rate, scldiv;
u8 prescale;
 
-   perclk_rate = clk_get_rate(fsl_lpspi->clk);
+   perclk_rate = clk_get_rate(fsl_lpspi->clk_per);
for (prescale = 0; prescale < 8; prescale++) {
scldiv = perclk_rate /
 (clkdivs[prescale] * config.speed_hz) - 2;
@@ -522,15 +535,30 @@ static int fsl_lpspi_probe(struct platform_device *pdev)
goto out_controller_put;
}
 
-   fsl_lpspi->clk = devm_clk_get(>dev, "ipg");
-   if (IS_ERR(fsl_lpspi->clk)) {
-   ret = PTR_ERR(fsl_lpspi->clk);
+   fsl_lpspi->clk_per = devm_clk_get(>dev, "per");
+   if (IS_ERR(fsl_lpspi->clk_per)) {
+   ret = PTR_ERR(fsl_lpspi->clk_per);
+   goto out_controller_put;
+   }
+
+   fsl_lpspi->clk_ipg = devm_clk_get(>dev, "ipg");
+   if (IS_ERR(fsl_lpspi->clk_ipg)) {
+   ret = PTR_ERR(fsl_lpspi->clk_ipg);
+   goto out_controller_put;
+   }
+
+   ret = clk_prepare_enable(fsl_lpspi->clk_ipg);
+   if (ret) {
+   dev_err(>dev,
+   "can't enable lpspi ipg clock, ret=%d\n", ret);
goto out_controller_put;
}
 
-   ret = clk_prepare_enable(fsl_lpspi->clk);
+   ret = clk_prepare_enable(fsl_lpspi->clk_per);
if (ret) {
-   dev_err(>dev, "can't enable lpspi clock, ret=%d\n", ret);
+   dev_err(>dev,
+   "can't enable lpspi per clock, ret=%d\n", ret);
+   clk_disable_unprepare(fsl_lpspi->clk_ipg);
goto out_controller_put;
}
 
@@ -538,7 +566,8 @@ static int fsl_lpspi_probe(struct platform_device *pdev)
fsl_lpspi->txfifosize = 1 << (temp & 0x0f);
fsl_lpspi->rxfifosize = 1 << ((temp >> 8) & 0x0f);
 
-   clk_disable_unprepare(fsl_lpspi->clk);
+   clk_disable_unprepare(fsl_lpspi->clk_per);
+   clk_disable_unprepare(fsl_lpspi->clk_ipg);
 
ret = devm_spi_register_controller(>dev, controller);
if (ret < 0) {
@@ -560,7 +589,8 @@ static int fsl_lpspi_remove(struct platform_device *pdev)
struct fsl_lpspi_data *fsl_lpspi =
spi_controller_get_devdata(controller);
 
-   clk_disable_unprepare(fsl_lpspi->clk);
+   clk_disable_unprepare(fsl_lpspi->clk_per);
+   clk_disable_unprepare(fsl_lpspi->clk_ipg);
 
return 0;
 }
-- 
2.17.1



Re: [PATCH 5/5] i2c: mediatek: Add i2c compatible for MediaTek MT8183

2018-12-03 Thread Sean Wang
On Mon, Dec 3, 2018 at 5:34 AM  wrote:
>
> From: qii wang 
>
> Add i2c compatible for MT8183. Compare to 2712 i2c controller, MT8183 has
> different registers, offsets, clock, and multi-user function.
>
> Signed-off-by: qii wang 
> ---
>  drivers/i2c/busses/i2c-mt65xx.c |  136 
> +--
>  1 file changed, 130 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c
> index 428ac99..6b979ab 100644
> --- a/drivers/i2c/busses/i2c-mt65xx.c
> +++ b/drivers/i2c/busses/i2c-mt65xx.c
> @@ -35,17 +35,23 @@
>  #include 
>
>  #define I2C_RS_TRANSFER(1 << 4)
> +#define I2C_ARB_LOST   (1 << 3)

It seems nothing in the patch refers to this macro, so it would be
better to remove it.

>  #define I2C_HS_NACKERR (1 << 2)
>  #define I2C_ACKERR (1 << 1)
>  #define I2C_TRANSAC_COMP   (1 << 0)
>  #define I2C_TRANSAC_START  (1 << 0)
> +#define I2C_RESUME_ARBIT   (1 << 1)
>  #define I2C_RS_MUL_CNFG(1 << 15)
>  #define I2C_RS_MUL_TRIG(1 << 14)
> +#define I2C_HS_TIME_EN (1 << 7)
>  #define I2C_DCM_DISABLE0x
>  #define I2C_IO_CONFIG_OPEN_DRAIN   0x0003
>  #define I2C_IO_CONFIG_PUSH_PULL0x
>  #define I2C_SOFT_RST   0x0001
>  #define I2C_FIFO_ADDR_CLR  0x0001
> +#define I2C_FIFO_ADDR_CLRH 0x0002
> +#define I2C_FIFO_ADDR_CLR_MCH  0x0004
> +#define I2C_HFIFO_DATA 0x8208
>  #define I2C_DELAY_LEN  0x0002
>  #define I2C_ST_START_CON   0x8001
>  #define I2C_FS_START_CON   0x1800
> @@ -76,6 +82,8 @@
>  #define I2C_CONTROL_DIR_CHANGE  (0x1 << 4)
>  #define I2C_CONTROL_ACKERR_DET_EN   (0x1 << 5)
>  #define I2C_CONTROL_TRANSFER_LEN_CHANGE (0x1 << 6)
> +#define I2C_CONTROL_DMAACK_EN   (0x1 << 8)
> +#define I2C_CONTROL_ASYNC_MODE  (0x1 << 9)
>  #define I2C_CONTROL_WRAPPER (0x1 << 0)
>
>  #define I2C_DRV_NAME   "i2c-mt65xx"
> @@ -130,6 +138,15 @@ enum I2C_REGS_OFFSET {
> OFFSET_DEBUGCTRL,
> OFFSET_TRANSFER_LEN_AUX,
> OFFSET_CLOCK_DIV,
> +   /* MT8183 only regs */
> +   OFFSET_LTIMING,
> +   OFFSET_DATA_TIMING,
> +   OFFSET_MCU_INTR,
> +   OFFSET_HW_TIMEOUT,
> +   OFFSET_HFIFO_DATA,
> +   OFFSET_HFIFO_STAT,
> +   OFFSET_MULTI_DMA,
> +   OFFSET_ROLLBACK,
>  };
>
>  static const u16 mt_i2c_regs_v1[] = {
> @@ -159,6 +176,39 @@ enum I2C_REGS_OFFSET {
> [OFFSET_CLOCK_DIV] = 0x70,
>  };
>
> +static const u16 mt_i2c_regs_v2[] = {
> +   [OFFSET_DATA_PORT] = 0x0,
> +   [OFFSET_SLAVE_ADDR] = 0x4,
> +   [OFFSET_INTR_MASK] = 0x8,
> +   [OFFSET_INTR_STAT] = 0xc,
> +   [OFFSET_CONTROL] = 0x10,
> +   [OFFSET_TRANSFER_LEN] = 0x14,
> +   [OFFSET_TRANSAC_LEN] = 0x18,
> +   [OFFSET_DELAY_LEN] = 0x1c,
> +   [OFFSET_TIMING] = 0x20,
> +   [OFFSET_START] = 0x24,
> +   [OFFSET_EXT_CONF] = 0x28,
> +   [OFFSET_LTIMING] = 0x2c,
> +   [OFFSET_HS] = 0x30,
> +   [OFFSET_IO_CONFIG] = 0x34,
> +   [OFFSET_FIFO_ADDR_CLR] = 0x38,
> +   [OFFSET_DATA_TIMING] = 0x3c,
> +   [OFFSET_MCU_INTR] = 0x40,
> +   [OFFSET_TRANSFER_LEN_AUX] = 0x44,
> +   [OFFSET_CLOCK_DIV] = 0x48,
> +   [OFFSET_HW_TIMEOUT] = 0x4c,
> +   [OFFSET_SOFTRESET] = 0x50,
> +   [OFFSET_HFIFO_DATA] = 0x70,
> +   [OFFSET_DEBUGSTAT] = 0xe0,
> +   [OFFSET_DEBUGCTRL] = 0xe8,
> +   [OFFSET_FIFO_STAT] = 0xf4,
> +   [OFFSET_FIFO_THRESH] = 0xf8,
> +   [OFFSET_HFIFO_STAT] = 0xfc,
> +   [OFFSET_DCM_EN] = 0xf88,
> +   [OFFSET_MULTI_DMA] = 0xf8c,
> +   [OFFSET_ROLLBACK] = 0xf98,
> +};
> +
>  struct mtk_i2c_compatible {
> const struct i2c_adapter_quirks *quirks;
> const u16 *regs;
> @@ -168,6 +218,7 @@ struct mtk_i2c_compatible {
> unsigned char aux_len_reg: 1;
> unsigned char support_33bits: 1;
> unsigned char timing_adjust: 1;
> +   unsigned char dma_sync: 1;
>  };
>
>  struct mtk_i2c {
> @@ -181,8 +232,11 @@ struct mtk_i2c {
> struct clk *clk_main;   /* main clock for i2c bus */
> struct clk *clk_dma;/* DMA clock for i2c via DMA */
> struct clk *clk_pmic;   /* PMIC clock for i2c from PMIC */
> +   struct clk *clk_arb;/* Arbitrator clock for i2c */
> bool have_pmic; /* can use i2c pins from PMIC */
> bool use_push_pull; /* IO config push-pull mode */
> +   bool share_i3c; /* share i3c IP*/
> +   u32 ch_offset;  /* i2c multi-user channel offset */
>
> u16 irq_stat;   /* interrupt status */
> unsigned int clk_src_div;
> @@ -190,6 +244,7 @@ struct mtk_i2c {
> enum 

Re: [PATCH 3/3] arm64: ftrace: add cond_resched() to func ftrace_make_(call|nop)

2018-12-03 Thread Steven Rostedt
On Mon, 3 Dec 2018 22:51:52 +0100
Arnd Bergmann  wrote:

> On Mon, Dec 3, 2018 at 8:22 PM Will Deacon  wrote:
> >
> > Hi Anders,
> >
> > On Fri, Nov 30, 2018 at 04:09:56PM +0100, Anders Roxell wrote:  
> > > Both of those functions end up calling ftrace_modify_code(), which is
> > > expensive because it changes the page tables and flush caches.
> > > Microseconds add up because this is called in a loop for each dyn_ftrace
> > > record, and this triggers the softlockup watchdog unless we let it sleep
> > > occasionally.
> > > Rework so that we call cond_resched() before going into the
> > > ftrace_modify_code() function.
> > >
> > > Co-developed-by: Arnd Bergmann 
> > > Signed-off-by: Arnd Bergmann 
> > > Signed-off-by: Anders Roxell 
> > > ---
> > >  arch/arm64/kernel/ftrace.c | 10 ++
> > >  1 file changed, 10 insertions(+)  
> >
> > It sounds like you're running into issues with the existing code, but I'd
> > like to understand a bit more about exactly what you're seeing. Which part
> > of the ftrace patching is proving to be expensive?
> >
> > The page table manipulation only happens once per module when using PLTs,
> > and the cache maintenance is just a single line per patch site without an
> > IPI.
> >
> > Is it the loop in ftrace_replace_code() that is causing the hassle?  
> 
> Yes: with an allmodconfig kernel, the ftrace selftest calls 
> ftrace_replace_code
> to look >4 through ftrace_make_call/ftrace_make_nop, and these
> end up calling
> 
> static int __kprobes __aarch64_insn_write(void *addr, __le32 insn)
> {
> void *waddr = addr;
> unsigned long flags = 0;
> int ret;
> 
> raw_spin_lock_irqsave(_lock, flags);
> waddr = patch_map(addr, FIX_TEXT_POKE0);
> 
> ret = probe_kernel_write(waddr, , AARCH64_INSN_SIZE);
> 
> patch_unmap(FIX_TEXT_POKE0);
> raw_spin_unlock_irqrestore(_lock, flags);
> 
> return ret;
> }
> int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
> {
> u32 *tp = addr;
> int ret;
> 
> /* A64 instructions must be word aligned */
> if ((uintptr_t)tp & 0x3)
> return -EINVAL;
> 
> ret = aarch64_insn_write(tp, insn);
> if (ret == 0)
> __flush_icache_range((uintptr_t)tp,
>  (uintptr_t)tp + AARCH64_INSN_SIZE);
> 
> return ret;
> }
> 
> which seems to be where the main cost is. This is with inside of
> qemu, and with lots of debugging options (in particular
> kcov and ubsan) enabled, that make each function call
> more expensive.

I was thinking more about this. Would something like this work?

-- Steve

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 8ef9fc226037..42e89397778b 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2393,11 +2393,14 @@ void __weak ftrace_replace_code(int enable)
 {
struct dyn_ftrace *rec;
struct ftrace_page *pg;
+   bool schedulable;
int failed;
 
if (unlikely(ftrace_disabled))
return;
 
+   schedulable = !irqs_disabled() & !preempt_count();
+
do_for_each_ftrace_rec(pg, rec) {
 
if (rec->flags & FTRACE_FL_DISABLED)
@@ -2409,6 +2412,8 @@ void __weak ftrace_replace_code(int enable)
/* Stop processing */
return;
}
+   if (schedulable)
+   cond_resched();
} while_for_each_ftrace_rec();
 }
 



Re: [PATCH] x86/boot: clear rsdp address in boot_params for broken loaders

2018-12-03 Thread H. Peter Anvin
On 12/3/18 9:32 PM, Juergen Gross wrote:
> 
> I'd like to send a followup patch doing that. And I'd like to not only
> test sentinel for being non-zero, but all padding fields as well. This
> should be 4.21 material, though.
> 

No, you can't do that.  That breaks backwards compatibility.

-hpa



[PATCH] spi: lpspi: Add cs-gpio support

2018-12-03 Thread Clark Wang
Add cs-gpio feature for LPSPI. Use fsl_lpspi_prepare_message() and
fsl_lpspi_unprepare_message() to enable and control the cs line.
These two functions are only called at the beginning and the end
of a message transfer.

The mode without cs-gpio is still supported. Which one is used depends on
whether the cs-gpio attribute has been configured in the dts file.

Signed-off-by: Clark Wang 
---
 drivers/spi/spi-fsl-lpspi.c | 79 -
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-fsl-lpspi.c b/drivers/spi/spi-fsl-lpspi.c
index a7d01b79827b..c6fe3f94de19 100644
--- a/drivers/spi/spi-fsl-lpspi.c
+++ b/drivers/spi/spi-fsl-lpspi.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -16,7 +17,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -28,6 +31,10 @@
 
 #define FSL_LPSPI_RPM_TIMEOUT 50 /* 50ms */
 
+#define LPSPI_CS_ACTIVE1
+#define LPSPI_CS_INACTIVE  0
+#define LPSPI_CS_DELAY 100
+
 /* i.MX7ULP LPSPI registers */
 #define IMX7ULP_VERID  0x0
 #define IMX7ULP_PARAM  0x4
@@ -104,6 +111,8 @@ struct fsl_lpspi_data {
struct completion xfer_done;
 
bool slave_aborted;
+
+   int chipselect[0];
 };
 
 static const struct of_device_id fsl_lpspi_dt_ids[] = {
@@ -176,6 +185,48 @@ static int lpspi_unprepare_xfer_hardware(struct 
spi_controller *controller)
return 0;
 }
 
+static void fsl_lpspi_chipselect(struct spi_device *spi, bool enable)
+{
+   struct fsl_lpspi_data *fsl_lpspi =
+   spi_controller_get_devdata(spi->controller);
+   int gpio = fsl_lpspi->chipselect[spi->chip_select];
+
+   enable = (!!(spi->mode & SPI_CS_HIGH) == enable);
+
+   if (!gpio_is_valid(gpio))
+   return;
+
+   gpio_set_value_cansleep(gpio, enable);
+}
+
+static int fsl_lpspi_prepare_message(struct spi_controller *controller,
+   struct spi_message *msg)
+{
+   struct fsl_lpspi_data *fsl_lpspi =
+   spi_controller_get_devdata(controller);
+   struct spi_device *spi = msg->spi;
+   int gpio = fsl_lpspi->chipselect[spi->chip_select];
+
+   if (gpio_is_valid(gpio)) {
+   gpio_direction_output(gpio,
+   fsl_lpspi->config.mode & SPI_CS_HIGH ? 0 : 1);
+   }
+
+   fsl_lpspi_chipselect(spi, LPSPI_CS_ACTIVE);
+
+   return 0;
+}
+
+static int fsl_lpspi_unprepare_message(struct spi_controller *controller,
+   struct spi_message *msg)
+{
+   struct spi_device *spi = msg->spi;
+
+   fsl_lpspi_chipselect(spi, LPSPI_CS_INACTIVE);
+
+   return 0;
+}
+
 static void fsl_lpspi_write_tx_fifo(struct fsl_lpspi_data *fsl_lpspi)
 {
u8 txfifo_cnt;
@@ -512,10 +563,13 @@ static int fsl_lpspi_init_rpm(struct fsl_lpspi_data 
*fsl_lpspi)
 
 static int fsl_lpspi_probe(struct platform_device *pdev)
 {
+   struct device_node *np = pdev->dev.of_node;
struct fsl_lpspi_data *fsl_lpspi;
struct spi_controller *controller;
+   struct spi_imx_master *lpspi_platform_info =
+   dev_get_platdata(>dev);
struct resource *res;
-   int ret, irq;
+   int i, ret, irq;
u32 temp;
 
if (of_property_read_bool((>dev)->of_node, "spi-slave"))
@@ -539,6 +593,29 @@ static int fsl_lpspi_probe(struct platform_device *pdev)
fsl_lpspi->is_slave = of_property_read_bool((>dev)->of_node,
"spi-slave");
 
+   if (!fsl_lpspi->is_slave) {
+   for (i = 0; i < controller->num_chipselect; i++) {
+   int cs_gpio = of_get_named_gpio(np, "cs-gpios", i);
+
+   if (!gpio_is_valid(cs_gpio) && lpspi_platform_info)
+   cs_gpio = lpspi_platform_info->chipselect[i];
+
+   fsl_lpspi->chipselect[i] = cs_gpio;
+   if (!gpio_is_valid(cs_gpio))
+   continue;
+
+   ret = devm_gpio_request(>dev,
+   fsl_lpspi->chipselect[i], DRIVER_NAME);
+   if (ret) {
+   dev_err(>dev, "can't get cs gpios\n");
+   goto out_controller_put;
+   }
+   }
+
+   controller->prepare_message = fsl_lpspi_prepare_message;
+   controller->unprepare_message = fsl_lpspi_unprepare_message;
+   }
+
controller->transfer_one_message = fsl_lpspi_transfer_one_msg;
controller->prepare_transfer_hardware = lpspi_prepare_xfer_hardware;
controller->unprepare_transfer_hardware = lpspi_unprepare_xfer_hardware;
-- 
2.17.1



general protection fault in kvm_arch_vcpu_ioctl_run

2018-12-03 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:4b78317679c4 Merge branch 'x86-pti-for-linus' of git://git..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15e979f540
kernel config:  https://syzkaller.appspot.com/x/.config?x=4602730af4f872ef
dashboard link: https://syzkaller.appspot.com/bug?extid=39810e6c400efadfef71
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+39810e6c400efadfe...@syzkaller.appspotmail.com

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault:  [#1] PREEMPT SMP KASAN
CPU: 0 PID: 14932 Comm: syz-executor0 Not tainted 4.20.0-rc4+ #138
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:169 [inline]
RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:7449 [inline]
RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:7602 [inline]
RIP: 0010:vcpu_run arch/x86/kvm/x86.c:7874 [inline]
RIP: 0010:kvm_arch_vcpu_ioctl_run+0x5296/0x7320 arch/x86/kvm/x86.c:8074
Code: 03 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 b4 1e 00 00 49 8b  
9f e0 03 00 00 48 8d bb 88 00 00 00 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00  
0f 85 8a 1e 00 00 48 8b 9b 88 00 00 00 48 8d bb d8

RSP: 0018:88818b0bf530 EFLAGS: 00010206
RAX: 0011 RBX:  RCX: c9001302b000
RDX: 00cf RSI: 81103a68 RDI: 0088
RBP: 88818b0bf8d0 R08: 8881bfe8e0c0 R09: 0008
R10: 0028 R11: 810feb0f R12: dc00
R13:  R14: c90007ddfdb8 R15: 888188a18400
FS:  7ff919977700() GS:8881dae0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fc94713c518 CR3: 0001cdea CR4: 001426f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 kvm_vcpu_ioctl+0x5c8/0x1150 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2596
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:509 [inline]
 do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
 __do_sys_ioctl fs/ioctl.c:720 [inline]
 __se_sys_ioctl fs/ioctl.c:718 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00

RSP: 002b:7ff919976c78 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 0003 RCX: 00457569
RDX:  RSI: ae80 RDI: 0005
RBP: 0072bf00 R08:  R09: 
R10:  R11: 0246 R12: 7ff9199776d4
R13: 004c034e R14: 004d0d60 R15: 
Modules linked in:
kobject: 'loop5' (4f26f0d5): kobject_uevent_env
kobject: 'loop5' (4f26f0d5): fill_kobj_path: path  
= '/devices/virtual/block/loop5'

kobject: 'loop1' (10db8550): kobject_uevent_env
kobject: 'loop1' (10db8550): fill_kobj_path: path  
= '/devices/virtual/block/loop1'

kobject: 'loop3' (941a4e7a): kobject_uevent_env
kobject: 'loop3' (941a4e7a): fill_kobj_path: path  
= '/devices/virtual/block/loop3'

---[ end trace d7fab4e7c1a70214 ]---
RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:169 [inline]
RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:7449 [inline]
RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:7602 [inline]
RIP: 0010:vcpu_run arch/x86/kvm/x86.c:7874 [inline]
RIP: 0010:kvm_arch_vcpu_ioctl_run+0x5296/0x7320 arch/x86/kvm/x86.c:8074
Code: 03 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 b4 1e 00 00 49 8b  
9f e0 03 00 00 48 8d bb 88 00 00 00 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00  
0f 85 8a 1e 00 00 48 8b 9b 88 00 00 00 48 8d bb d8

kobject: 'loop4' (4a89aba1): kobject_uevent_env
kobject: 'loop4' (4a89aba1): fill_kobj_path: path  
= '/devices/virtual/block/loop4'

kobject: 'loop2' (704d7e59): kobject_uevent_env
kobject: 'loop2' (704d7e59): fill_kobj_path: path  
= '/devices/virtual/block/loop2'

kobject: 'loop4' (4a89aba1): kobject_uevent_env
kobject: 'loop4' (4a89aba1): fill_kobj_path: path  
= '/devices/virtual/block/loop4'

kobject: 'loop2' (704d7e59): kobject_uevent_env
RSP: 0018:88818b0bf530 EFLAGS: 00010206
RAX: 0011 RBX:  RCX: c9001302b000
kobject: 'loop2' (704d7e59): fill_kobj_path: path  
= '/devices/virtual/block/loop2'

RDX: 00cf RSI: 81103a68 RDI: 

Re: Strange hang with gcc 8 of kprobe multiple_kprobes test

2018-12-03 Thread Masami Hiramatsu
Hi Steve,

On Mon, 3 Dec 2018 21:18:07 -0500
Steven Rostedt  wrote:

> Hi Masami,
> 
> I started testing some of my new code and the system got into a
> strange state. Debugging further, I found the cause came from the
> kprobe tests. It became stranger to me that I could reproduce it with
> older kernels. I went back as far as 4.16 and it triggered. I thought
> this very strange because I ran this test on all those kernels in the
> past.
> 
> After a bit of hair pulling, I figured out what changed. I upgraded to
> gcc 8.1 (and I reproduce it with 8.2 as well). I convert back to gcc 7
> and the tests pass without issue.

OK, let me see.

> The issue that I notice when the system gets into this strange state is
> that I can't log into the box. Nor can I reboot. Basically it's
> anything to do with systemd just doesn't work (insert your jokes here
> now, and then let's move on).
> 
> I was able to narrow down what the exact function was that caused the
> issues and it is: update_vsyscall()
> 
> gcc 7 looks like this:
> 
> 81004bf0 :
> 81004bf0:   e8 0b cc 9f 00          callq  81a01800 <__fentry__>
> 81004bf1: R_X86_64_PC32 __fentry__-0x4
> 81004bf5:   48 8b 07                mov    (%rdi),%rax
> 81004bf8:   8b 15 96 5f 34 01       mov    0x1345f96(%rip),%edx   # 8234ab94
> 81004bfa: R_X86_64_PC32 vclocks_used-0x4
> 81004bfe:   83 05 7b 84 6f 01 01    addl   $0x1,0x16f847b(%rip)   # 826fd080
> 81004c00: R_X86_64_PC32 vsyscall_gtod_data-0x5
> 81004c05:   8b 48 24                mov    0x24(%rax),%ecx
> 81004c08:   b8 01 00 00 00          mov    $0x1,%eax
> 81004c0d:   d3 e0                   shl    %cl,%eax
> 
> And gcc 8 looks like this:
> 
> 81004c90 :
> 81004c90:   e8 6b cb 9f 00          callq  81a01800 <__fentry__>
> 81004c91: R_X86_64_PC32 __fentry__-0x4
> 81004c95:   48 8b 07                mov    (%rdi),%rax
> 81004c98:   83 05 e1 93 6f 01 01    addl   $0x1,0x16f93e1(%rip)   # 826fe080

Hm, this is a RIP-relative instruction; it should be modified by kprobes.

> 81004c9a: R_X86_64_PC32 vsyscall_gtod_data-0x5
> 81004c9f:   8b 50 24                mov    0x24(%rax),%edx
> 81004ca2:   8b 05 ec 5e 34 01       mov    0x1345eec(%rip),%eax   # 8234ab94
> 81004ca4: R_X86_64_PC32 vclocks_used-0x4
> 
> The test adds a kprobe (optimized) at update_vsyscall+5. And will
> insert a jump on the two instructions after fentry. The difference
> between v7 and v8 is that v7 is touching vclocks_used and v8 is
> touching vsyscall_gtod_data.
> 
> Is there some black magic going on with the vsyscall area with
> vsyscall_gtod_data that is causing havoc when a kprobe is added there?

I think kprobes might be missing something when preprocessing the RIP-relative
instruction. Could you disable jump optimization as below and test what happens
on update_vsyscall+5 AND update_vsyscall+8? (RIP-relative preprocessing must
happen even if jump optimization is disabled.)

# echo 0 > /proc/sys/debug/kprobes-optimization
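
To illustrate why a RIP-relative instruction needs preprocessing when it is
executed from a copy, here is a minimal user-space sketch. It is not the
kprobes code: the helper names are hypothetical, and the addresses in the
usage note assume that the truncated addresses in the listing above carry an
ffffffff prefix. The displacement is relative to the address of the next
instruction, so a copy at a different address needs its disp32 recomputed to
keep pointing at the same target.

```c
#include <stdint.h>

/* Address a RIP-relative operand refers to: next-instruction address + disp32. */
static uint64_t rip_rel_target(uint64_t insn_addr, unsigned int insn_len,
			       int32_t disp)
{
	return insn_addr + insn_len + (int64_t)disp;
}

/*
 * Recompute disp32 for a copy of the instruction placed at copy_addr so that
 * it still refers to the original target. Returns 0 on success, or -1 when
 * the adjusted displacement no longer fits in 32 bits.
 */
static int rip_rel_fixup(uint64_t orig_addr, uint64_t copy_addr,
			 int32_t disp, int32_t *new_disp)
{
	int64_t d = (int64_t)disp + (int64_t)(orig_addr - copy_addr);

	if (d < INT32_MIN || d > INT32_MAX)
		return -1;
	*new_disp = (int32_t)d;
	return 0;
}
```

With the 7-byte addl from the gcc 8 listing (address ...81004c98, disp
0x16f93e1), rip_rel_target() yields ...826fe080, which matches the objdump
comment. If the adjusted displacement were wrong or skipped, the copied addl
would increment some unrelated memory location.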


> I can dig a little more into this, but I'm currently at my HQ office
> with a lot of other objectives that I must get done, and I can't work
> on this much more this week.

OK, let me try to reproduce it in my environment.

> 
> I included my config (for my virt machine, which I was also able to
> trigger it with).

Thanks, but I think it should not depend on the kconfig.

> 
> The test that triggers this bug is:
> 
>  tools/testing/selftests/ftrace/test.d/kprobe/multiple_kprobes.tc
> 
> It runs the test fine, but other things just start to act up after I
> run it.

Yeah, thank you for digging into it. It is now much easier for me.

> 
> I notice that when I get into the state, journald and the dbus_daemon
> are constantly running. Perhaps the userspace time keeping went bad?

Yeah, I think so. Maybe the addl instruction gets broken.

Thank you,

-- 
Masami Hiramatsu 


Re: [PATCH] ubi: fastmap: Check each mapping only once

2018-12-03 Thread Martin Kepplinger

On 02.12.18 16:02, Richard Weinberger wrote:

Sasha,

Am Sonntag, 2. Dezember 2018, 15:35:43 CET schrieb Sasha Levin:

On Sun, Dec 02, 2018 at 11:50:33AM +, Sudip Mukherjee wrote:

Now queued up for 4.14.y, thanks.


can you *please* slow down a little?


True. It will really help if you can have some sort of fixed schedule
for stable release, like maybe stablerc is ready on Thursday or Friday
and release the stable on Monday. Having a weekend in stablerc will be
helpful for people like me who only get the time in weekends for
upstream or stable kernel.


Any sort of schedule will never work for everyone (for example, if it's
part of your paid job - you don't necessarily want to review stuff over
the weekend).


A schedule is not needed, but please give maintainers at least a chance
to react to stable inclusion requests.
In this case Martin asked for inclusion on Monday and the patch was applied
two days later.


True, especially when the maintainer is asked a question as part of the 
patch.


I've already had the feeling that we'd need the other patch too, but in 
this case at least I should have searched for Fixes tags.


Greg, how about reminding people of Fixes tags in 
Documentation/process/stable-kernel-rules.rst ?


  martin




[PATCH] pstore/ram: Avoid NULL deref in ftrace merging failure path

2018-12-03 Thread Kees Cook
Given corruption in the ftrace records, it might be possible to allocate
tmp_prz without assigning prz to it, but still marking it as needing to
be freed, which would cause at least a NULL dereference.

smatch warnings:
fs/pstore/ram.c:340 ramoops_pstore_read() error: we previously assumed 'prz' 
could be null (see line 255)

https://lists.01.org/pipermail/kbuild-all/2018-December/055528.html

Reported-by: Dan Carpenter 
Fixes: 2fbea82bbb89 ("pstore: Merge per-CPU ftrace records into one")
Cc: "Joel Fernandes (Google)" 
Signed-off-by: Kees Cook 
---
 fs/pstore/ram.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index e6d9560ea455..96f7d32cd184 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -291,6 +291,7 @@ static ssize_t ramoops_pstore_read(struct pstore_record *record)
  GFP_KERNEL);
if (!tmp_prz)
return -ENOMEM;
+   prz = tmp_prz;
free_prz = true;
 
while (cxt->ftrace_read_cnt < cxt->max_ftrace_cnt) {
@@ -309,7 +310,6 @@ static ssize_t ramoops_pstore_read(struct pstore_record *record)
goto out;
}
record->id = 0;
-   prz = tmp_prz;
}
}
 
-- 
2.17.1


-- 
Kees Cook
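
The ownership problem fixed by the patch above can be sketched in user space
as follows. This is a simplified model under stated assumptions, not the
pstore code: struct zone, read_merged() and the frees counter are hypothetical
names, and malloc/free stand in for the kernel allocators. The point is that
the pointer used by the cleanup path is assigned as soon as the allocation
succeeds, so an early "goto out" never dereferences NULL.

```c
#include <stdbool.h>
#include <stdlib.h>

struct zone {
	char *buffer;
};

/* Returns 0 on success, -1 on failure; *frees counts cleanup runs. */
static int read_merged(bool fail_merge, int *frees)
{
	struct zone *prz = NULL;
	struct zone *tmp_prz;
	bool free_prz = false;
	int ret = 0;

	tmp_prz = calloc(1, sizeof(*tmp_prz));
	if (!tmp_prz)
		return -1;
	prz = tmp_prz;		/* assign before any path can jump to out */
	free_prz = true;

	if (fail_merge) {	/* models a failing merge step */
		ret = -1;
		goto out;
	}

	prz->buffer = malloc(16);
	if (!prz->buffer)
		ret = -1;

out:
	if (free_prz) {
		free(prz->buffer);	/* safe: prz is never NULL here */
		free(prz);
		(*frees)++;
	}
	return ret;
}
```

If the "prz = tmp_prz" assignment were moved after the failing step, the
cleanup block would run with prz still NULL, which is the dereference that
smatch warned about.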


Re: [PATCH v9 2/4] seccomp: switch system call argument type to void *

2018-12-03 Thread kbuild test robot
Hi Tycho,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20-rc5 next-20181203]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Tycho-Andersen/seccomp-hoist-struct-seccomp_data-recalculation-higher/20181204-013450
config: i386-randconfig-x005-201848 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   In file included from kernel/seccomp.c:28:0:
>> include/linux/syscalls.h:239:18: error: conflicting types for 'sys_seccomp'
 asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
 ^
   include/linux/syscalls.h:225:2: note: in expansion of macro '__SYSCALL_DEFINEx'
 __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
 ^
   include/linux/syscalls.h:216:36: note: in expansion of macro 'SYSCALL_DEFINEx'
#define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
   ^~~
   kernel/seccomp.c:946:1: note: in expansion of macro 'SYSCALL_DEFINE3'
SYSCALL_DEFINE3(seccomp, unsigned int, op, unsigned int, flags,
^~~
   In file included from kernel/seccomp.c:28:0:
   include/linux/syscalls.h:881:17: note: previous declaration of 'sys_seccomp' was here
asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
^~~

vim +/sys_seccomp +239 include/linux/syscalls.h

1bd21c6c2 Dominik Brodowski   2018-04-05  228
e145242ea Dominik Brodowski   2018-04-09  229  /*
e145242ea Dominik Brodowski   2018-04-09  230   * The asmlinkage stub is aliased to a function named __se_sys_*() which
e145242ea Dominik Brodowski   2018-04-09  231   * sign-extends 32-bit ints to longs whenever needed. The actual work is
e145242ea Dominik Brodowski   2018-04-09  232   * done within __do_sys_*().
e145242ea Dominik Brodowski   2018-04-09  233   */
1bd21c6c2 Dominik Brodowski   2018-04-05  234  #ifndef __SYSCALL_DEFINEx
bed1ffca0 Frederic Weisbecker 2009-03-13  235  #define __SYSCALL_DEFINEx(x, name, ...) \
bee200317 Arnd Bergmann       2018-06-19  236   __diag_push(); \
bee200317 Arnd Bergmann       2018-06-19  237   __diag_ignore(GCC, 8, "-Wattribute-alias", \
bee200317 Arnd Bergmann       2018-06-19  238     "Type aliasing is used to sanitize syscall arguments"); \
83460ec8d Andi Kleen          2013-11-12 @239   asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
e145242ea Dominik Brodowski   2018-04-09  240     __attribute__((alias(__stringify(__se_sys##name)))); \
c9a211951 Howard McLauchlan   2018-03-21  241   ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
e145242ea Dominik Brodowski   2018-04-09  242   static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
e145242ea Dominik Brodowski   2018-04-09  243   asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
e145242ea Dominik Brodowski   2018-04-09  244   asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
1a94bc347 Heiko Carstens      2009-01-14  245   { \
e145242ea Dominik Brodowski   2018-04-09  246   long ret = __do_sys##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
07fe6e00f Al Viro             2013-01-21  247   __MAP(x,__SC_TEST,__VA_ARGS__); \
2cf096668 Al Viro             2013-01-21  248   __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
2cf096668 Al Viro             2013-01-21  249   return ret; \
1a94bc347 Heiko Carstens      2009-01-14  250   } \
bee200317 Arnd Bergmann       2018-06-19  251   __diag_pop(); \
e145242ea Dominik Brodowski   2018-04-09  252   static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
1bd21c6c2 Dominik Brodowski   2018-04-05  253  #endif /* __SYSCALL_DEFINEx */
1a94bc347 Heiko Carstens      2009-01-14  254

:: The code at line 239 was first introduced by commit
:: 83460ec8dcac14142e7860a01fa59c267ac4657c syscalls.h: use gcc alias 
instead of assembler aliases for syscalls

:: TO: Andi Kleen 
:: CC: Linus Torvalds 
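
The error above is the kind that appears when the sys_* function generated by
the SYSCALL_DEFINE macro no longer matches the prototype in the header. A
likely reading (an inference, since the report does not say so) is that the
definition changed its third parameter type while the prototype at
include/linux/syscalls.h:881 kept the old one. The sketch below uses a
deliberately simplified, hypothetical macro (DEMO_SYSCALL_DEFINE3) and a
made-up syscall name to show the relationship; it is not the kernel macro.

```c
/* Prototype, playing the role of the declaration in a shared header. */
long sys_demo(unsigned int op, unsigned int flags, void *uargs);

/* Simplified stand-in: pastes the name and pairs each type with its argument. */
#define DEMO_SYSCALL_DEFINE3(name, t1, a1, t2, a2, t3, a3) \
	long sys_##name(t1 a1, t2 a2, t3 a3)

/*
 * The parameter types here must match the prototype above. Writing
 * "const char *" for the third parameter while the prototype says "void *"
 * (or the reverse) makes the compiler report conflicting types.
 */
DEMO_SYSCALL_DEFINE3(demo, unsigned int, op, unsigned int, flags,
		     void *, uargs)
{
	(void)flags;
	return uargs ? (long)op : -1;
}
```

Keeping the prototype and the macro invocation in sync is what makes both
compile together; changing only one side reproduces the diagnostic.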

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation




Re: [PATCH v4 0/7] zram idle page writeback

2018-12-03 Thread Sergey Senozhatsky
On (12/03/18 11:40), Minchan Kim wrote:
> 
> Minchan Kim (7):
>   zram: fix lockdep warning of free block handling
>   zram: fix double free backing device
>   zram: refactoring flags and writeback stuff
>   zram: introduce ZRAM_IDLE flag
>   zram: support idle/huge page writeback
>   zram: add bd_stat statistics
>   zram: writeback throttle

Looks good to me.

Reviewed-by: Sergey Senozhatsky 

-ss


Re: [PATCH v7 2/4] clk: meson: add DT documentation for emmc clock controller

2018-12-03 Thread Jianxin Pan
Hi Stephen,

On 2018/12/4 6:45, Stephen Boyd wrote:
> Quoting Jianxin Pan (2018-11-15 04:18:30)
>> diff --git a/include/dt-bindings/clock/amlogic,mmc-clkc.h 
>> b/include/dt-bindings/clock/amlogic,mmc-clkc.h
>> new file mode 100644
>> index 000..162b949
>> --- /dev/null
>> +++ b/include/dt-bindings/clock/amlogic,mmc-clkc.h
>> @@ -0,0 +1,17 @@
>> +/* SPDX-License-Identifier: (GPL-2.0+ OR MIT) */
>> +/*
>> + * Meson MMC sub clock tree IDs
>> + *
>> + * Copyright (c) 2018 Amlogic, Inc. All rights reserved.
>> + * Author: Yixun Lan 
>> + */
>> +
>> +#ifndef __MMC_CLKC_H
>> +#define __MMC_CLKC_H
>> +
>> +#define CLKID_MMC_DIV  1
> 
> Why does the define numbering start with 1 instead of 0?
>
Clock ID 0 is used by CLKID_MMC_MUX.
CLKID_MMC_MUX is an internal clock that is defined in
drivers/clk/meson/mmc-clkc.c, and it is the parent of CLKID_MMC_DIV.
 
>> +#define CLKID_MMC_PHASE_CORE   2
>> +#define CLKID_MMC_PHASE_TX 3
>> +#define CLKID_MMC_PHASE_RX 4
>> +
> 
> .
> 



Re: [PATCH v5 2/2] phy: qualcomm: Add Synopsys High-Speed USB PHY driver

2018-12-03 Thread Shawn Guo
Hi Kishon,

On Tue, Dec 04, 2018 at 10:38:19AM +0530, Kishon Vijay Abraham I wrote:
> Hi,
> 
> On 27/11/18 3:37 PM, Shawn Guo wrote:
> > It adds Synopsys 28nm Femto High-Speed USB PHY driver support, which
> > is usually paired with Synopsys DWC3 USB controllers on Qualcomm SoCs.
> 
> Is this Synopsys PHY specific to Qualcomm or could it be used by other vendors
> (with just changing tuning parameters)? If it could be used by other vendors
> then it would make sense to add this PHY driver in synopsys directory.

To my knowledge, this Synopsys PHY is specific to Qualcomm SoCs.
@Sriharsha, correct me if I'm wrong.

Shawn


[PATCH] spi: lpspi: Fix CLK pin becomes low before one transfer

2018-12-03 Thread Clark Wang
Remove the reset operation in fsl_lpspi_config(). This RST may cause both the
CLK and CS pins to go from high to low level in cs-gpio mode.
Add a fsl_lpspi_reset() function, called after one message transfer, to clear
all flags in use.

Signed-off-by: Clark Wang 
Reviewed-by: Fugang Duan 
---
 drivers/spi/spi-fsl-lpspi.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-fsl-lpspi.c b/drivers/spi/spi-fsl-lpspi.c
index f32a2e0d7ae1..a7d01b79827b 100644
--- a/drivers/spi/spi-fsl-lpspi.c
+++ b/drivers/spi/spi-fsl-lpspi.c
@@ -279,10 +279,6 @@ static int fsl_lpspi_config(struct fsl_lpspi_data *fsl_lpspi)
u32 temp;
int ret;
 
-   temp = CR_RST;
-   writel(temp, fsl_lpspi->base + IMX7ULP_CR);
-   writel(0, fsl_lpspi->base + IMX7ULP_CR);
-
if (!fsl_lpspi->is_slave) {
ret = fsl_lpspi_set_bitrate(fsl_lpspi);
if (ret)
@@ -373,6 +369,24 @@ static int fsl_lpspi_wait_for_completion(struct spi_controller *controller)
return 0;
 }
 
+static int fsl_lpspi_reset(struct fsl_lpspi_data *fsl_lpspi)
+{
+   u32 temp;
+
+   /* Disable all interrupt */
+   fsl_lpspi_intctrl(fsl_lpspi, 0);
+
+   /* W1C for all flags in SR */
+   temp = 0x3F << 8;
+   writel(temp, fsl_lpspi->base + IMX7ULP_SR);
+
+   /* Clear FIFO and disable module */
+   temp = CR_RRF | CR_RTF;
+   writel(temp, fsl_lpspi->base + IMX7ULP_CR);
+
+   return 0;
+}
+
 static int fsl_lpspi_transfer_one(struct spi_controller *controller,
  struct spi_device *spi,
  struct spi_transfer *t)
@@ -394,6 +408,8 @@ static int fsl_lpspi_transfer_one(struct spi_controller *controller,
if (ret)
return ret;
 
+   fsl_lpspi_reset(fsl_lpspi);
+
return 0;
 }
 
-- 
2.17.1
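
The "W1C for all flags in SR" step in fsl_lpspi_reset() relies on
write-1-to-clear semantics: writing a 1 to a flag bit clears it, writing a 0
leaves it alone. The following is a small user-space model of that behaviour,
offered as a sketch only. struct fake_sr and fake_sr_write() are hypothetical,
the flag positions are taken from the defines in the lpspi patches, and the
assumption that bits outside 8..13 ignore the write is mine.

```c
#include <stdint.h>

#define BIT(n)		(1u << (n))
#define SR_TCF		BIT(10)
#define SR_FCF		BIT(9)
#define SR_W1C_FLAGS	(0x3Fu << 8)	/* bits 8..13, the mask used in the patch */

/* Hypothetical model of a status register with W1C flag bits. */
struct fake_sr {
	uint32_t val;
};

/* A write clears exactly the W1C flags whose bit is set in v. */
static void fake_sr_write(struct fake_sr *sr, uint32_t v)
{
	sr->val &= ~(v & SR_W1C_FLAGS);
}
```

Writing SR_FCF alone clears only that flag, while writing 0x3F << 8 clears
every flag in the W1C range in a single register access, which is what the
reset helper does.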



[PATCH v19 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant

2018-12-03 Thread Vivek Gautam
qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
clock and power requirements.
On msm8996, multiple cores, viz. mdss, video, etc. use this
smmu. On sdm845, this smmu is used with gpu.
Add bindings for the same.

Signed-off-by: Vivek Gautam 
Reviewed-by: Rob Herring 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
---

Changes since v18:
 None.

 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b6b11642b3a9..ba18d89d4732 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -120,6 +120,7 @@ enum arm_smmu_implementation {
GENERIC_SMMU,
ARM_MMU500,
CAVIUM_SMMUV2,
+   QCOM_SMMUV2,
 };
 
 struct arm_smmu_s2cr {
@@ -2030,6 +2031,7 @@ ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
 ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
+ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);
 
 static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 },
@@ -2038,6 +2040,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
+   { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v19 3/5] iommu/arm-smmu: Add the device_link between masters and smmu

2018-12-03 Thread Vivek Gautam
From: Sricharan R 

Finally add the device link between the master device and
smmu, so that the smmu gets runtime enabled/disabled only when the
master needs it. This is done from add_device callback which gets
called once when the master is added to the smmu.

Signed-off-by: Sricharan R 
Signed-off-by: Vivek Gautam 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
---

Changes since v18:
 None.

 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 1917d214c4d9..b6b11642b3a9 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1500,6 +1500,9 @@ static int arm_smmu_add_device(struct device *dev)
 
iommu_device_link(&smmu->iommu, dev);
 
+   device_link_add(dev, smmu->dev,
+   DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
+
return 0;
 
 out_cfg_free:
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH] spi: lpspi: Improve the stability of lpspi data transmission

2018-12-03 Thread Clark Wang
Use SR_TDF to decide whether more data needs to be sent, and SR_FCF to detect
the end of a transmission, replacing the wait after the transmission ends.
That wait serves no purpose, because the module sets the FCF
flag at the real end.

The changes to the interrupt flags and the ISR function reduce the number of
ISR invocations. Using the FCF flag improves the stability of
data transmission. Together, these two points generally improve the data
transfer speed of lpspi; in particular, in slave mode
it can support a higher transfer speed from the host.

After these changes, fsl_lpspi_txfifo_empty() is no longer needed,
so remove it.

Signed-off-by: Clark Wang 
---
 drivers/spi/spi-fsl-lpspi.c | 61 -
 1 file changed, 20 insertions(+), 41 deletions(-)

diff --git a/drivers/spi/spi-fsl-lpspi.c b/drivers/spi/spi-fsl-lpspi.c
index 3e935db5ff02..f32a2e0d7ae1 100644
--- a/drivers/spi/spi-fsl-lpspi.c
+++ b/drivers/spi/spi-fsl-lpspi.c
@@ -53,9 +53,11 @@
 #define CR_RST BIT(1)
 #define CR_MEN BIT(0)
 #define SR_TCF BIT(10)
+#define SR_FCF BIT(9)
 #define SR_RDF BIT(1)
 #define SR_TDF BIT(0)
 #define IER_TCIE   BIT(10)
+#define IER_FCIE   BIT(9)
 #define IER_RDIE   BIT(1)
 #define IER_TDIE   BIT(0)
 #define CFGR1_PCSCFG   BIT(27)
@@ -174,28 +176,10 @@ static int lpspi_unprepare_xfer_hardware(struct spi_controller *controller)
return 0;
 }
 
-static int fsl_lpspi_txfifo_empty(struct fsl_lpspi_data *fsl_lpspi)
-{
-   u32 txcnt;
-   unsigned long orig_jiffies = jiffies;
-
-   do {
-   txcnt = readl(fsl_lpspi->base + IMX7ULP_FSR) & 0xff;
-
-   if (time_after(jiffies, orig_jiffies + msecs_to_jiffies(500))) {
-   dev_dbg(fsl_lpspi->dev, "txfifo empty timeout\n");
-   return -ETIMEDOUT;
-   }
-   cond_resched();
-
-   } while (txcnt);
-
-   return 0;
-}
-
 static void fsl_lpspi_write_tx_fifo(struct fsl_lpspi_data *fsl_lpspi)
 {
u8 txfifo_cnt;
+   u32 temp;
 
txfifo_cnt = readl(fsl_lpspi->base + IMX7ULP_FSR) & 0xff;
 
@@ -206,9 +190,15 @@ static void fsl_lpspi_write_tx_fifo(struct fsl_lpspi_data *fsl_lpspi)
txfifo_cnt++;
}
 
-   if (!fsl_lpspi->remain && (txfifo_cnt < fsl_lpspi->txfifosize))
-   writel(0, fsl_lpspi->base + IMX7ULP_TDR);
-   else
+   if (txfifo_cnt < fsl_lpspi->txfifosize) {
+   if (!fsl_lpspi->is_slave) {
+   temp = readl(fsl_lpspi->base + IMX7ULP_TCR);
+   temp &= ~TCR_CONTC;
+   writel(temp, fsl_lpspi->base + IMX7ULP_TCR);
+   }
+
+   fsl_lpspi_intctrl(fsl_lpspi, IER_FCIE);
+   } else
fsl_lpspi_intctrl(fsl_lpspi, IER_TDIE);
 }
 
@@ -404,12 +394,6 @@ static int fsl_lpspi_transfer_one(struct spi_controller *controller,
if (ret)
return ret;
 
-   ret = fsl_lpspi_txfifo_empty(fsl_lpspi);
-   if (ret)
-   return ret;
-
-   fsl_lpspi_read_rx_fifo(fsl_lpspi);
-
return 0;
 }
 
@@ -421,7 +405,6 @@ static int fsl_lpspi_transfer_one_msg(struct spi_controller *controller,
struct spi_device *spi = msg->spi;
struct spi_transfer *xfer;
bool is_first_xfer = true;
-   u32 temp;
int ret = 0;
 
msg->status = 0;
@@ -441,13 +424,6 @@ static int fsl_lpspi_transfer_one_msg(struct spi_controller *controller,
}
 
 complete:
-   if (!fsl_lpspi->is_slave) {
-   /* de-assert SS, then finalize current message */
-   temp = readl(fsl_lpspi->base + IMX7ULP_TCR);
-   temp &= ~TCR_CONTC;
-   writel(temp, fsl_lpspi->base + IMX7ULP_TCR);
-   }
-
msg->status = ret;
spi_finalize_current_message(controller);
 
@@ -456,20 +432,23 @@ static int fsl_lpspi_transfer_one_msg(struct spi_controller *controller,
 
 static irqreturn_t fsl_lpspi_isr(int irq, void *dev_id)
 {
+   u32 temp_SR, temp_IER;
struct fsl_lpspi_data *fsl_lpspi = dev_id;
-   u32 temp;
 
+   temp_IER = readl(fsl_lpspi->base + IMX7ULP_IER);
fsl_lpspi_intctrl(fsl_lpspi, 0);
-   temp = readl(fsl_lpspi->base + IMX7ULP_SR);
+   temp_SR = readl(fsl_lpspi->base + IMX7ULP_SR);
 
fsl_lpspi_read_rx_fifo(fsl_lpspi);
 
-   if (temp & SR_TDF) {
+   if ((temp_SR & SR_TDF) && (temp_IER & IER_TDIE)) {
fsl_lpspi_write_tx_fifo(fsl_lpspi);
+   return IRQ_HANDLED;
+   }
 
-   if (!fsl_lpspi->remain)
+   if (temp_SR & SR_FCF && (temp_IER & IER_FCIE)) {
+   writel(SR_FCF, fsl_lpspi->base + IMX7ULP_SR);
complete(&fsl_lpspi->xfer_done);
-
return IRQ_HANDLED;
}
 
-- 
2.17.1
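
The ISR change above acts on a status flag only when the matching interrupt
was enabled at the time the handler was entered. As a rough illustration, the
decision can be modelled as a pure function. This is a sketch under stated
assumptions: enum isr_action and isr_decide() are hypothetical names, and only
the bit positions come from the patch.

```c
#include <stdint.h>

#define BIT(n)		(1u << (n))
#define SR_FCF		BIT(9)
#define SR_TDF		BIT(0)
#define IER_FCIE	BIT(9)
#define IER_TDIE	BIT(0)

enum isr_action {
	ISR_NONE,	/* nothing to do for this interrupt */
	ISR_REFILL_TX,	/* TX FIFO wants data and TDIE was enabled */
	ISR_COMPLETE,	/* frame complete and FCIE was enabled */
};

/* sr and ier are snapshots of the status and interrupt enable registers. */
static enum isr_action isr_decide(uint32_t sr, uint32_t ier)
{
	if ((sr & SR_TDF) && (ier & IER_TDIE))
		return ISR_REFILL_TX;
	if ((sr & SR_FCF) && (ier & IER_FCIE))
		return ISR_COMPLETE;
	return ISR_NONE;
}
```

Checking the enable snapshot matters because SR_TDF can be set whenever the
FIFO has room, even after the driver has switched from TDIE to FCIE; without
the mask the handler would keep taking the refill path.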



Re: [PATCH v9 2/4] seccomp: switch system call argument type to void *

2018-12-03 Thread Paul Moore
On Mon, Dec 3, 2018 at 12:01 AM Serge E. Hallyn  wrote:
> On Sun, Dec 02, 2018 at 08:28:25PM -0700, Tycho Andersen wrote:
> > The const qualifier causes problems for any code that wants to write to the
> > third argument of the seccomp syscall, as we will do in a future patch in
> > this series.
> >
> > The third argument to the seccomp syscall is documented as void *, so
> > rather than just dropping the const, let's switch everything to use void *
> > as well.
> >
> > I believe this is safe because of 1. the documentation above, 2. there's no
> > real type information exported about syscalls anywhere besides the man
> > pages.
> >
> > Signed-off-by: Tycho Andersen 
> > CC: Kees Cook 
> > CC: Andy Lutomirski 
> > CC: Oleg Nesterov 
> > CC: Eric W. Biederman 
> > CC: "Serge E. Hallyn" 
>
> Acked-by: Serge Hallyn 
>
> Though I'm not entirely convinced there will be no ill effects of changing
> the argument type.  I'll feel comfortable when Michael and Paul say it's
> fine :)

Well, looking at the seccomp(2) manpage on my system (dated
2018-02-02) the third argument is already shown as a "void *args":

 SYNOPSIS
  #include 
  #include 
  #include 
  #include 
  #include 

  int seccomp(unsigned int operation, unsigned int flags, void *args);

... so I think we're safe :)

From a libseccomp perspective, we always call seccomp(2) via
syscall(2), so it is unlikely we would ever run into problems, not to
mention that we are just talking about the pointer type used in the
kernel; from a syscall ABI perspective it is still a pointer value and
that is the important part.
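
For reference, calling seccomp(2) through syscall(2) looks roughly like the
sketch below. It is not libseccomp code; demo_seccomp() and
demo_action_avail() are hypothetical wrappers, and the fallback define for
SECCOMP_GET_ACTION_AVAIL is an assumption for older headers. User space only
ever passes a pointer-sized value as the third argument, so the pointer type
chosen inside the kernel does not change the ABI.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/seccomp.h>

#ifndef SECCOMP_GET_ACTION_AVAIL
#define SECCOMP_GET_ACTION_AVAIL 2	/* assumed value for older headers */
#endif

/* Thin wrapper: the third argument travels as a plain pointer value. */
static long demo_seccomp(unsigned int op, unsigned int flags, void *args)
{
	return syscall(__NR_seccomp, op, flags, args);
}

/* Ask the kernel whether a given filter return action is supported. */
static long demo_action_avail(uint32_t action)
{
	return demo_seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &action);
}
```

The wrapper returns 0 when the action is available and -1 with errno set
otherwise (for example on kernels or sandboxes that reject the call).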

> > CC: Christian Brauner 
> > CC: Tyler Hicks 
> > CC: Akihiro Suda 
> > ---
> >  include/linux/seccomp.h | 2 +-
> >  kernel/seccomp.c| 8 
> >  2 files changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
> > index e5320f6c8654..b5103c019cf4 100644
> > --- a/include/linux/seccomp.h
> > +++ b/include/linux/seccomp.h
> > @@ -43,7 +43,7 @@ extern void secure_computing_strict(int this_syscall);
> >  #endif
> >
> >  extern long prctl_get_seccomp(void);
> > -extern long prctl_set_seccomp(unsigned long, char __user *);
> > +extern long prctl_set_seccomp(unsigned long, void __user *);
> >
> >  static inline int seccomp_mode(struct seccomp *s)
> >  {
> > diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> > index 96afc32e041d..393e029f778a 100644
> > --- a/kernel/seccomp.c
> > +++ b/kernel/seccomp.c
> > @@ -924,7 +924,7 @@ static long seccomp_get_action_avail(const char __user 
> > *uaction)
> >
> >  /* Common entry point for both prctl and syscall. */
> >  static long do_seccomp(unsigned int op, unsigned int flags,
> > -const char __user *uargs)
> > +void __user *uargs)
> >  {
> >   switch (op) {
> >   case SECCOMP_SET_MODE_STRICT:
> > @@ -944,7 +944,7 @@ static long do_seccomp(unsigned int op, unsigned int 
> > flags,
> >  }
> >
> >  SYSCALL_DEFINE3(seccomp, unsigned int, op, unsigned int, flags,
> > -  const char __user *, uargs)
> > +  void __user *, uargs)
> >  {
> >   return do_seccomp(op, flags, uargs);
> >  }
> > @@ -956,10 +956,10 @@ SYSCALL_DEFINE3(seccomp, unsigned int, op, unsigned 
> > int, flags,
> >   *
> >   * Returns 0 on success or -EINVAL on failure.
> >   */
> > -long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
> > +long prctl_set_seccomp(unsigned long seccomp_mode, void __user *filter)
> >  {
> >   unsigned int op;
> > - char __user *uargs;
> > + void __user *uargs;
> >
> >   switch (seccomp_mode) {
> >   case SECCOMP_MODE_STRICT:
> > --
> > 2.19.1



-- 
paul moore
www.paul-moore.com


Re: [PATCH 2/2] dt-bindings: rtc: Move trivial RTCs to rtc.txt

2018-12-03 Thread Alexandre Belloni
On 03/12/2018 17:45:58-0600, Rob Herring wrote:
> On Sun, Nov 11, 2018 at 08:31:14PM +0100, Alexandre Belloni wrote:
> > Move trivial RTCs to the rtc generic binding documentation as they all also
> > support at least 'start-year'.
> > 
> > Signed-off-by: Alexandre Belloni 
> > ---
> >  Documentation/devicetree/bindings/rtc/rtc.txt | 34 +++
> >  .../devicetree/bindings/trivial-devices.txt   | 24 -
> >  2 files changed, 34 insertions(+), 24 deletions(-)
> 
> Okay if I take these 2 to avoid conflicts with trivial-devices.txt?
> 

Sure, I was actually planning for them to go through your tree for this
reason.

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: linux-next: build failure after merge of the rdma tree

2018-12-03 Thread Jason Gunthorpe
On Tue, Dec 04, 2018 at 11:47:31AM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the rdma tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> ERROR: "mlx5_get_send_wqe" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!
> 
> Caused by commit
> 
>   34f4c9554d8b ("IB/mlx5: Use fragmented QP's buffer for in-kernel users")
> 
> mlx5_get_send_wqe() is still used in drivers/infiniband/hw/mlx5/cq.c
> and declared in drivers/infiniband/hw/mlx5/mlx5_ib.h ...
> 
> I have used the version of the rdma tree from next-20181203 for today.

Huh. So apparently every compiler that tested this patch (0-day, mine,
the submitters) optimized this call away because is_atomic_response()
always returns 0: meaning mlx5_get_atomic_laddr is never callable and
can be deleted entirely, including the call to mlx5_get_send_wqe.

Not sure what compiler setup will hit this, but it is clearly wrong
code..

Guy/Leon, please send a fixup.. Maybe just delete all this
handle_atomics stuff?

Thanks,
Jason


Re: [PATCH linux-next v2 6/6] ASoC: rsnd: add avb clocks

2018-12-03 Thread Kuninori Morimoto

Hi Jiada

> There are AVB Counter Clocks in ADG, each clock has 12bits integral
> and 8 bits fractional dividers which operates with S0D1ϕ clock.
> 
> This patch registers 8 AVB Counter Clocks when clock-cells of
> rcar_sound node is 2,
> 
> Signed-off-by: Jiada Wang 
> ---
(snip)
> +struct clk_avb {
> + struct clk_hw hw;
> + unsigned int idx;
> + struct rsnd_mod *mod;
> + /* lock reg access */
> + spinlock_t *lock;
> +};
> +
> +#define to_clk_avb(_hw) container_of(_hw, struct clk_avb, hw)

I like "hw_to_avb()"
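
For readers unfamiliar with the pattern, a self-contained sketch of such a
conversion macro follows. The struct layouts are simplified stand-ins
(clk_hw_demo, clk_avb_demo) and demo_container_of() is a plain offsetof-based
version, not the kernel's container_of().

```c
#include <stddef.h>

struct clk_hw_demo {
	int dummy;
};

struct clk_avb_demo {
	unsigned int idx;
	struct clk_hw_demo hw;	/* embedded member handed to the clk core */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define hw_to_avb(_hw) demo_container_of(_hw, struct clk_avb_demo, hw)

static unsigned int avb_index_from_hw(struct clk_hw_demo *hw)
{
	return hw_to_avb(hw)->idx;
}
```

The clk callbacks only receive the embedded hw pointer, so every op starts
with this conversion; the macro name is purely a readability choice.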

> +static struct clk *rsnd_adg_clk_src_twocell_get(struct of_phandle_args 
> *clkspec,
> + void *data)
> +{
> + unsigned int clkidx = clkspec->args[1];
> + struct rsnd_adg *adg = data;
> + const char *type;
> + struct clk *clk;
> +
> + switch (clkspec->args[0]) {
> + case ADG_FIX:
> + type = "fixed";
> + if (clkidx >= CLKOUTMAX) {
> + pr_err("Invalid %s clock index %u\n", type,
> +clkidx);
> + return ERR_PTR(-EINVAL);
> + }
> + clk = adg->clkout[clkidx];
> + break;
> + case ADG_AVB:
> + type = "avb";
> + if (clkidx >= AVB_CLK_NUM) {
> + pr_err("Invalid %s clock index %u\n", type,
> +clkidx);
> + return ERR_PTR(-EINVAL);
> + }
> + clk = adg->clkavb[clkidx];
> + break;
> + default:
> + pr_err("Invalid ADG clock type %u\n", clkspec->args[0]);
> + return ERR_PTR(-EINVAL);
> + }
> +
> + return clk;
> +}

In this function
1) I don't think you need to use "char *type".
2) If you use "clkidx = clkspec->args[1]", giving "clkspec->args[0]"
   a named variable in the same way would be more readable.
3) Please use dev_err() instead of pr_err().
   I think data can be priv, and you can use rsnd_priv_to_adg() and
   rsnd_priv_to_dev().

> +static void clk_avb_div_write(struct rsnd_mod *mod, u32 data, int idx)
(snip)
> +static u32 clk_avb_div_read(struct rsnd_mod *mod, int idx)

To reduce confusion and make the code more readable,
I think these functions can be

clk_avb_div_write(struct rsnd_adg *adg, u32 data);
clk_avb_div_read(struct rsnd_adg *adg);

> +static int clk_avb_set_rate(struct clk_hw *hw, unsigned long rate,
> + unsigned long parent_rate)
> +{
> + struct clk_avb *avb = to_clk_avb(hw);
> + unsigned int div = clk_avb_calc_div(rate, parent_rate);
> + u32 val;
> +
> + val = clk_avb_div_read(avb->mod, avb->idx) & ~AVB_DIV_MASK;
> + clk_avb_div_write(avb->mod, val | div, avb->idx);
> +
> + return 0;
> +}

Why do we need to care about the ~AVB_DIV_MASK area?
I think those bits are reserved and read as 0.

> +static const struct clk_ops clk_avb_ops = {
> + .enable = clk_avb_enable,
> + .disable = clk_avb_disable,
> + .is_enabled = clk_avb_is_enabled,
> + .recalc_rate = clk_avb_recalc_rate,
> + .round_rate = clk_avb_round_rate,
> + .set_rate = clk_avb_set_rate,
> +};

This is not a big deal, but I like tab aligned ops

> +static struct clk *clk_register_avb(struct device *dev, struct rsnd_mod *mod,
> + unsigned int id, spinlock_t *lock)
> +{
> + struct clk_init_data init;
> + struct clk_avb *avb;
> + struct clk *clk;
> + char name[AVB_CLK_NAME_SIZE];
> + const char *parent_name = ADG_CLK_NAME;
> +
> + avb = devm_kzalloc(dev, sizeof(*avb), GFP_KERNEL);
> + if (!avb)
> + return ERR_PTR(-ENOMEM);
> +
> + snprintf(name, AVB_CLK_NAME_SIZE, "%s%u", AVB_CLK_NAME, id);
> +
> + avb->idx = id;
> + avb->lock = lock;
> + avb->mod = mod;
> +
> + /* Register the clock. */
> + init.name = name;
> + init.ops = _avb_ops;
> + init.flags = CLK_IS_BASIC;
> + init.parent_names = _name;
> + init.num_parents = 1;
> +
> + avb->hw.init = 
> +
> + /* init DIV to a valid state */
> + clk_avb_div_write(avb->mod, avb->idx, AVB_MAX_DIV);

Please check the parameter order. I think what you want is:

- clk_avb_div_write(avb->mod, avb->idx, AVB_MAX_DIV);
+ clk_avb_div_write(avb->mod, AVB_MAX_DIV, avb->idx);

Best regards
---
Kuninori Morimoto


[PATCH v5 10/13] nvme-tcp: Add protocol header

2018-12-03 Thread Sagi Grimberg
From: Sagi Grimberg 

Signed-off-by: Sagi Grimberg 
---
 include/linux/nvme-tcp.h | 189 +++
 include/linux/nvme.h |   1 +
 2 files changed, 190 insertions(+)
 create mode 100644 include/linux/nvme-tcp.h

diff --git a/include/linux/nvme-tcp.h b/include/linux/nvme-tcp.h
new file mode 100644
index ..03d87c0550a9
--- /dev/null
+++ b/include/linux/nvme-tcp.h
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NVMe over Fabrics TCP protocol header.
+ * Copyright (c) 2018 Lightbits Labs. All rights reserved.
+ */
+
+#ifndef _LINUX_NVME_TCP_H
+#define _LINUX_NVME_TCP_H
+
+#include 
+
+#define NVME_TCP_DISC_PORT 8009
+#define NVME_TCP_ADMIN_CCSZ SZ_8K
+#define NVME_TCP_DIGEST_LENGTH 4
+
+enum nvme_tcp_pfv {
+   NVME_TCP_PFV_1_0 = 0x0,
+};
+
+enum nvme_tcp_fatal_error_status {
+   NVME_TCP_FES_INVALID_PDU_HDR     = 0x01,
+   NVME_TCP_FES_PDU_SEQ_ERR         = 0x02,
+   NVME_TCP_FES_HDR_DIGEST_ERR      = 0x03,
+   NVME_TCP_FES_DATA_OUT_OF_RANGE   = 0x04,
+   NVME_TCP_FES_R2T_LIMIT_EXCEEDED  = 0x05,
+   NVME_TCP_FES_DATA_LIMIT_EXCEEDED = 0x05,
+   NVME_TCP_FES_UNSUPPORTED_PARAM   = 0x06,
+};
+
+enum nvme_tcp_digest_option {
+   NVME_TCP_HDR_DIGEST_ENABLE  = (1 << 0),
+   NVME_TCP_DATA_DIGEST_ENABLE = (1 << 1),
+};
+
+enum nvme_tcp_pdu_type {
+   nvme_tcp_icreq      = 0x0,
+   nvme_tcp_icresp     = 0x1,
+   nvme_tcp_h2c_term   = 0x2,
+   nvme_tcp_c2h_term   = 0x3,
+   nvme_tcp_cmd        = 0x4,
+   nvme_tcp_rsp        = 0x5,
+   nvme_tcp_h2c_data   = 0x6,
+   nvme_tcp_c2h_data   = 0x7,
+   nvme_tcp_r2t        = 0x9,
+};
+
+enum nvme_tcp_pdu_flags {
+   NVME_TCP_F_HDGST        = (1 << 0),
+   NVME_TCP_F_DDGST        = (1 << 1),
+   NVME_TCP_F_DATA_LAST    = (1 << 2),
+   NVME_TCP_F_DATA_SUCCESS = (1 << 3),
+};
+
+/**
+ * struct nvme_tcp_hdr - nvme tcp pdu common header
+ *
+ * @type:  pdu type
+ * @flags: pdu specific flags
+ * @hlen:  pdu header length
+ * @pdo:   pdu data offset
+ * @plen:  pdu wire byte length
+ */
+struct nvme_tcp_hdr {
+   __u8    type;
+   __u8    flags;
+   __u8    hlen;
+   __u8    pdo;
+   __le32  plen;
+};
+
+/**
+ * struct nvme_tcp_icreq_pdu - nvme tcp initialize connection request pdu
+ *
+ * @hdr:   pdu generic header
+ * @pfv:   pdu version format
+ * @hpda:  host pdu data alignment (dwords, 0's based)
+ * @digest: digest types enabled
+ * @maxr2t: maximum r2ts per request supported
+ */
+struct nvme_tcp_icreq_pdu {
+   struct nvme_tcp_hdr hdr;
+   __le16  pfv;
+   __u8    hpda;
+   __u8    digest;
+   __le32  maxr2t;
+   __u8    rsvd2[112];
+};
+
+/**
+ * struct nvme_tcp_icresp_pdu - nvme tcp initialize connection response pdu
+ *
+ * @hdr:   pdu common header
+ * @pfv:   pdu version format
+ * @cpda:  controller pdu data alignment (dwords, 0's based)
+ * @digest: digest types enabled
+ * @maxdata: maximum data capsules per r2t supported
+ */
+struct nvme_tcp_icresp_pdu {
+   struct nvme_tcp_hdr hdr;
+   __le16  pfv;
+   __u8    cpda;
+   __u8    digest;
+   __le32  maxdata;
+   __u8    rsvd[112];
+};
+
+/**
+ * struct nvme_tcp_term_pdu - nvme tcp terminate connection pdu
+ *
+ * @hdr:   pdu common header
+ * @fes:   fatal error status
+ * @fei:   fatal error information
+ */
+struct nvme_tcp_term_pdu {
+   struct nvme_tcp_hdr hdr;
+   __le16  fes;
+   __le32  fei;
+   __u8    rsvd[8];
+};
+
+/**
+ * struct nvme_tcp_cmd_pdu - nvme tcp command capsule pdu
+ *
+ * @hdr:   pdu common header
+ * @cmd:   nvme command
+ */
+struct nvme_tcp_cmd_pdu {
+   struct nvme_tcp_hdr hdr;
+   struct nvme_command cmd;
+};
+
+/**
+ * struct nvme_tcp_rsp_pdu - nvme tcp response capsule pdu
+ *
+ * @hdr:   pdu common header
+ * @hdr:   nvme-tcp generic header
+ * @cqe:   nvme completion queue entry
+ */
+struct nvme_tcp_rsp_pdu {
+   struct nvme_tcp_hdr hdr;
+   struct nvme_completion  cqe;
+};
+
+/**
+ * struct nvme_tcp_r2t_pdu - nvme tcp ready-to-transfer pdu
+ *
+ * @hdr:   pdu common header
+ * @command_id: nvme command identifier which this relates to
+ * @ttag:  transfer tag (controller generated)
+ * @r2t_offset: offset from the start of the command data
+ * @r2t_length: length the host is allowed to send
+ */
+struct nvme_tcp_r2t_pdu {
+   struct nvme_tcp_hdr hdr;
+   __u16 
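
The message is cut off at this point in the archive. As a rough check of the layouts that are fully visible above, the following userspace sketch mirrors the common header and the icreq pdu with fixed-width types and asserts their sizes. The ex_ names are hypothetical stand-ins rather than the kernel definitions, and the sizes assume a common ABI with natural alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's __u8/__le16/__le32 types. */
struct ex_nvme_tcp_hdr {
	uint8_t  type;
	uint8_t  flags;
	uint8_t  hlen;
	uint8_t  pdo;
	uint32_t plen;		/* little-endian on the wire */
};

struct ex_nvme_tcp_icreq_pdu {
	struct ex_nvme_tcp_hdr hdr;
	uint16_t pfv;
	uint8_t  hpda;
	uint8_t  digest;
	uint32_t maxr2t;
	uint8_t  rsvd2[112];
};

/* 4 x 1 byte + 4 bytes = 8 bytes of common header. */
_Static_assert(sizeof(struct ex_nvme_tcp_hdr) == 8,
	       "common header should be 8 bytes");
/* 8 (hdr) + 2 + 1 + 1 = 12, so maxr2t needs no padding. */
_Static_assert(offsetof(struct ex_nvme_tcp_icreq_pdu, maxr2t) == 12,
	       "maxr2t should start at offset 12");
/* 12 + 4 + 112 = 128 bytes in total. */
_Static_assert(sizeof(struct ex_nvme_tcp_icreq_pdu) == 128,
	       "icreq pdu should be 128 bytes");
```

The icresp pdu has the same field widths, so the same arithmetic applies to it.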

Re: [PATCH] printk: Add caller information to printk() output.

2018-12-03 Thread Sergey Senozhatsky
On (12/02/18 20:23), Tetsuo Handa wrote:
> 
> Some examples for console output:
> 
>   [0.919699]@T1 x86: Booting SMP configuration:
>   [4.152681]@T271 Fusion MPT base driver 3.04.20
>   [5.070470]@C0 random: fast init done
>   [6.587900]@C3 random: crng init done

This is hard to read. Let's have a fixed width space for from_id.
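
One possible shape for a fixed-width field, as a userspace sketch only: the helper name, the bracket style, and the width of 5 are assumptions for illustration, not what the patch or this review settled on.

```c
#include <stdio.h>

/*
 * Hypothetical helper: 'T' marks task context (id is a pid),
 * 'C' marks interrupt context (id is a CPU number). The id is
 * right-aligned in a field of at least 5 characters so that
 * the message text lines up.
 */
static int ex_format_from(char *buf, size_t len, int in_task, unsigned int id)
{
	return snprintf(buf, len, "[%c%5u]", in_task ? 'T' : 'C', id);
}
```

With this format, pid 1 and pid 271 both occupy eight characters, so the message text starts in the same column.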

> +#ifdef CONFIG_PRINTK_FROM
> + if (in_task())
> + msg->from_id = current->pid;

Let's use task_pid_nr().

> -static size_t print_time(u64 ts, char *buf)
> -{
> - unsigned long rem_nsec = do_div(ts, 10);
> -
> - if (!buf)
> - return snprintf(NULL, 0, "[%5lu.00] ", (unsigned long)ts);
> -
> - return sprintf(buf, "[%5lu.%06lu] ",
> -(unsigned long)ts, rem_nsec / 1000);
> -}

OK, this patch depends on printk_time patch.

> +config PRINTK_FROM
> + bool "Show caller information on printks"
> + depends on PRINTK

Wasn't it supposed to also depend on DEBUG_AID_FOR_SYZBOT?

-ss


Re: [PATCH 3/3] mm/memcg: Avoid reclaiming below hard protection

2018-12-03 Thread Xunlei Pang
On 2018/12/3 PM 7:57, Michal Hocko wrote:
> On Mon 03-12-18 16:01:19, Xunlei Pang wrote:
>> When memcgs get reclaimed after its usage exceeds min, some
>> usages below the min may also be reclaimed in the current
>> implementation, the amount is considerably large during kswapd
>> reclaim according to my ftrace results.
> 
> And here again. Describe the setup and the behavior please?
> 

step 1
mkdir -p /sys/fs/cgroup/memory/online
cd /sys/fs/cgroup/memory/online
echo 512M > memory.max
echo 40960 > memory.min
echo $$ > tasks
dd if=/dev/sda of=/dev/null


while true; do sleep 1; cat memory.current ; cat memory.min; done


step 2
Create global memory pressure by allocating anonymous and cached
pages to constantly trigger kswapd: dd if=/dev/sdb of=/dev/null

step 3
Then observe the "online" group: usage only a few hundred KiB over
memory.min can cause tens of MiB to be reclaimed by kswapd.

Here is one of the test results I got:
cat memory.current; cat memory.min; echo;
409485312   // current
40960   // min

385052672   // current was over-reclaimed by about 23 MB
40960   // min

The corresponding ftrace output I captured:
kswapd_0-281   [000]    304.706632: shrink_node_memcg:
min_excess=24, nr_reclaimed=6013, sc->nr_to_reclaim=147, exceeds 5989 pages
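
The numbers above can be cross-checked with a few lines of C. This is only a sketch of the arithmetic; the helper names are hypothetical and a 4 KiB page size is assumed.

```c
#include <stdint.h>

#define EX_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Drop in memory.current between the two samples, in bytes. */
static uint64_t ex_usage_drop(uint64_t before, uint64_t after)
{
	return before - after;
}

/* Pages reclaimed beyond the excess over memory.min. */
static uint64_t ex_over_reclaim(uint64_t nr_reclaimed, uint64_t min_excess)
{
	return nr_reclaimed - min_excess;
}

static uint64_t ex_pages_to_bytes(uint64_t pages)
{
	return pages * EX_PAGE_SIZE;
}
```

The two memory.current samples differ by 24432640 bytes (5965 pages, roughly 23.3 MiB), while the trace reports 6013 - 24 = 5989 pages (24530944 bytes, roughly 23.4 MiB) reclaimed beyond the excess. The two figures come from different instants, so they agree only approximately.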


Re: [PATCH V3 1/3] mmc: sdhci: add support for using external DMA devices

2018-12-03 Thread Chunyan Zhang
Hi Faiz,

On Mon, 3 Dec 2018 at 21:55, Faiz Abbas  wrote:
>
> Hi,
>
> On 03/12/18 5:45 PM, Faiz Abbas wrote:
> > Hi,
> >
> >> +static void sdhci_external_dma_prepare_data(struct sdhci_host *host,
> >> +struct mmc_command *cmd)
> >> +{
> >
> > Please add a condition for data == NULL here. This was already pointed
> > out by Adrian in v2.

The check for data is added in sdhci_external_dma_setup().

> >
> > My test with an am335x-evm failed with these patches. Looks like the
> > very first SDIO commands failing.

I guess you didn't add 'dmas' in the device tree, as patch 3 shows.

> >
> > https://pastebin.ubuntu.com/p/Y2RDjSKpgd/
> >
> > Currently am335x-evm is using omap_hsmmc driver. I added the following
> > patch to make it work with sdhci_omap.
> >
> > https://pastebin.ubuntu.com/p/VTGrCbJxY3/
> >
> > Will look deeper into this. Please ping if you need any more information.
> >
>
> So I disabled DMA in the driver altogether and still got the same
> messages on am335x-evm in PIO mode. Looks like something more is
> required for it to be supported.
>
> I instead shifted to a dra71-evm which supports both ADMA and external
> DMA. Here is the log:
>
> https://pastebin.ubuntu.com/p/mcJmgcjQsp/
>
> The interface fundamentally works but it complains with the following error:
>

Yes, it fell back to ADMA/PIO since sdhci couldn't find the 'dmas'
property in the devicetree.

> [3.111693] Failed to request TX DMA channel.

OK, I will add a check in sdhci-omap.c before switching to external
DMA, which should avoid these error logs.

Thanks for the review and test!

Chunyan


[RFC PATCH] Add IPA clock support for clk-rpmh

2018-12-03 Thread David Dai
This patch extends the existing clk-rpmh driver to support a different
type of RPMh resource known as Bus Clock Manager(BCM) in order to scale
performance for the Qualcomm IP Accelerator(IPA) core clock.

David Dai (1):
  clk: qcom: clk-rpmh: Add IPA clock support

 drivers/clk/qcom/clk-rpmh.c   | 142 ++
 include/dt-bindings/clock/qcom,rpmh.h |   1 +
 2 files changed, 143 insertions(+)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[RFC PATCH] clk: qcom: clk-rpmh: Add IPA clock support

2018-12-03 Thread David Dai
Add IPA clock support by extending the current clk rpmh driver to support
clocks that are managed by a different type of RPMh resource known as
Bus Clock Manager(BCM).

Signed-off-by: David Dai 
---
 drivers/clk/qcom/clk-rpmh.c   | 142 ++
 include/dt-bindings/clock/qcom,rpmh.h |   1 +
 2 files changed, 143 insertions(+)

diff --git a/drivers/clk/qcom/clk-rpmh.c b/drivers/clk/qcom/clk-rpmh.c
index 9f4fc77..42e2cd2 100644
--- a/drivers/clk/qcom/clk-rpmh.c
+++ b/drivers/clk/qcom/clk-rpmh.c
@@ -18,6 +18,32 @@
 #define CLK_RPMH_ARC_EN_OFFSET 0
 #define CLK_RPMH_VRM_EN_OFFSET 4
 
+#define BCM_TCS_CMD_COMMIT_MASK 0x4000
+#define BCM_TCS_CMD_VALID_SHIFT 29
+#define BCM_TCS_CMD_VOTE_MASK  0x3fff
+#define BCM_TCS_CMD_VOTE_SHIFT 0
+
+#define BCM_TCS_CMD(valid, vote) \
+   (BCM_TCS_CMD_COMMIT_MASK |\
+   ((valid) << BCM_TCS_CMD_VALID_SHIFT) |\
+   ((cpu_to_le32(vote) &\
+   BCM_TCS_CMD_VOTE_MASK) << BCM_TCS_CMD_VOTE_SHIFT))
+
+/**
+ * struct bcm_db - Auxiliary data pertaining to each Bus Clock Manager(BCM)
+ * @unit: divisor used to convert Hz value to an RPMh msg
+ * @width: multiplier used to convert Hz value to an RPMh msg
+ * @vcd: virtual clock domain that this bcm belongs to
+ * @reserved: reserved to pad the struct
+ */
+
+struct bcm_db {
+   u32 unit;
+   u16 width;
+   u8 vcd;
+   u8 reserved;
+};
+
 /**
  * struct clk_rpmh - individual rpmh clock data structure
  * @hw: handle between common and hardware-specific interfaces
@@ -29,6 +55,7 @@
  * @aggr_state: rpmh clock aggregated state
  * @last_sent_aggr_state: rpmh clock last aggr state sent to RPMh
  * @valid_state_mask:  mask to determine the state of the rpmh clock
+ * @aux_data:  data specific to the bcm rpmh resource
  * @dev:   device to which it is attached
  * @peer:  pointer to the clock rpmh sibling
  */
@@ -42,6 +69,7 @@ struct clk_rpmh {
u32 aggr_state;
u32 last_sent_aggr_state;
u32 valid_state_mask;
+   struct bcm_db aux_data;
struct device *dev;
struct clk_rpmh *peer;
 };
@@ -98,6 +126,17 @@ struct clk_rpmh_desc {
__DEFINE_CLK_RPMH(_platform, _name, _name_active, _res_name,\
  CLK_RPMH_VRM_EN_OFFSET, 1, _div)
 
+#define DEFINE_CLK_RPMH_BCM(_platform, _name, _res_name)   \
+   static struct clk_rpmh _platform##_##_name = {  \
+   .res_name = _res_name,  \
+   .valid_state_mask = BIT(RPMH_ACTIVE_ONLY_STATE),\
+   .div = 1,   \
+   .hw.init = &(struct clk_init_data){ \
+   .ops = _rpmh_bcm_ops,   \
+   .name = #_name, \
+   },  \
+   }
+
 static inline struct clk_rpmh *to_clk_rpmh(struct clk_hw *_hw)
 {
return container_of(_hw, struct clk_rpmh, hw);
@@ -210,6 +249,91 @@ static unsigned long clk_rpmh_recalc_rate(struct clk_hw 
*hw,
.recalc_rate= clk_rpmh_recalc_rate,
 };
 
+static int clk_rpmh_bcm_send_cmd(struct clk_rpmh *c, bool enable)
+{
+   struct tcs_cmd cmd = { 0 };
+   u32 cmd_state;
+   int ret;
+
+   cmd_state = enable ? (c->aggr_state ? c->aggr_state : 1) : 0;
+
+   if (c->last_sent_aggr_state == cmd_state)
+   return 0;
+
+   cmd.addr = c->res_addr;
+   cmd.data = BCM_TCS_CMD(enable, cmd_state);
+
+   ret = rpmh_write_async(c->dev, RPMH_ACTIVE_ONLY_STATE, &cmd, 1);
+   if (ret) {
+   dev_err(c->dev, "set active state of %s failed: (%d)\n",
+   c->res_name, ret);
+   return ret;
+   }
+
+   c->last_sent_aggr_state = cmd_state;
+
+   return 0;
+}
+
+static int clk_rpmh_bcm_prepare(struct clk_hw *hw)
+{
+   struct clk_rpmh *c = to_clk_rpmh(hw);
+   int ret = 0;
+
+   mutex_lock(_clk_lock);
+   ret = clk_rpmh_bcm_send_cmd(c, true);
+   mutex_unlock(_clk_lock);
+
+   return ret;
+};
+
+static void clk_rpmh_bcm_unprepare(struct clk_hw *hw)
+{
+   struct clk_rpmh *c = to_clk_rpmh(hw);
+
+   mutex_lock(_clk_lock);
+   clk_rpmh_bcm_send_cmd(c, false);
+   mutex_unlock(_clk_lock);
+};
+
+static int clk_rpmh_bcm_set_rate(struct clk_hw *hw, unsigned long rate,
+unsigned long parent_rate)
+{
+   struct clk_rpmh *c = to_clk_rpmh(hw);
+
+   c->aggr_state = rate / (c->aux_data.unit * 1000);
+
+   if (clk_hw_is_prepared(hw)) {
+   mutex_lock(_clk_lock);
+   clk_rpmh_bcm_send_cmd(c, true);
+   mutex_unlock(_clk_lock);
+   }
+
+   return 0;
+};
+
+static long clk_rpmh_round_rate(struct 
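
The message is cut off here in the archive. The vote computation that is visible above (clk_rpmh_bcm_set_rate() and clk_rpmh_bcm_send_cmd()) can be sketched in userspace as follows. The ex_ helpers are hypothetical, and the unit value used in the note below is an assumed example, not a value taken from the patch.

```c
#include <stdint.h>

/* Mirrors "rate / (c->aux_data.unit * 1000)" from the patch. */
static uint32_t ex_rate_to_vote(unsigned long rate_hz, uint32_t unit)
{
	return (uint32_t)(rate_hz / ((unsigned long)unit * 1000UL));
}

/*
 * Mirrors "enable ? (aggr_state ? aggr_state : 1) : 0":
 * an enabled clock always sends a non-zero vote, even if no rate
 * has been set yet; a disabled clock sends 0.
 */
static uint32_t ex_cmd_state(int enable, uint32_t aggr_state)
{
	if (!enable)
		return 0;
	return aggr_state ? aggr_state : 1;
}
```

For example, with an assumed unit of 1000, a 75 MHz request maps to a vote of 75. Note that the integer division rounds down, so any rate below unit * 1000 Hz yields an aggregate state of 0 and falls back to the minimum vote of 1 when the clock is enabled.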

Re: [PATCH] ext4: clean up indentation issues, remove extraneous tabs

2018-12-03 Thread Theodore Y. Ts'o
On Tue, Nov 27, 2018 at 10:28:36AM +0100, Jan Kara wrote:
> On Fri 23-11-18 16:30:43, Colin King wrote:
> > From: Colin Ian King 
> > 
> > There are several lines that are indented too far, clean these
> > up by removing the tabs.
> > 
> > Signed-off-by: Colin Ian King 
> 
> The patch looks good. You can add:
> 
> Reviewed-by: Jan Kara 

Thanks, applied.

- Ted


Re: [PATCH v2 2/3] phy: sr-usb: Add stingray usb phy driver

2018-12-03 Thread Kishon Vijay Abraham I
Hi,

On 03/12/18 2:06 PM, Srinath Mannam wrote:
> This driver supports all versions of stingray SS and HS
> USB phys.
> In version 1 is combo phy contain both SS and HS phys
> in a common IO space.
> In version 2 a single HS phy.
> These phys support both xHCI host driver and
> BDC Broadcom device controller driver.
> 
> Signed-off-by: Srinath Mannam 
> Reviewed-by: Florian Fainelli 
> Reviewed-by: Scott Branden 
> ---
>  drivers/phy/broadcom/Kconfig  |  11 +
>  drivers/phy/broadcom/Makefile |   1 +
>  drivers/phy/broadcom/phy-bcm-sr-usb.c | 373 
> ++
>  3 files changed, 385 insertions(+)
>  create mode 100644 drivers/phy/broadcom/phy-bcm-sr-usb.c
> 
> diff --git a/drivers/phy/broadcom/Kconfig b/drivers/phy/broadcom/Kconfig
> index 8786a96..c1e4dd5 100644
> --- a/drivers/phy/broadcom/Kconfig
> +++ b/drivers/phy/broadcom/Kconfig
> @@ -10,6 +10,17 @@ config PHY_CYGNUS_PCIE
> Enable this to support the Broadcom Cygnus PCIe PHY.
> If unsure, say N.
>  
> +config PHY_BCM_SR_USB
> + tristate "Broadcom Stingray USB PHY driver"
> + depends on OF && (ARCH_BCM_IPROC || COMPILE_TEST)
> + select GENERIC_PHY
> + default ARCH_BCM_IPROC
> + help
> +   Enable this to support the Broadcom Stingray USB PHY
> +   driver. It supports all versions of Superspeed and
> +   Highspeed PHYs.
> +   If unsure, say N.
> +
>  config BCM_KONA_USB2_PHY
>   tristate "Broadcom Kona USB2 PHY Driver"
>   depends on HAS_IOMEM
> diff --git a/drivers/phy/broadcom/Makefile b/drivers/phy/broadcom/Makefile
> index 0f60184..f453c7d 100644
> --- a/drivers/phy/broadcom/Makefile
> +++ b/drivers/phy/broadcom/Makefile
> @@ -11,3 +11,4 @@ obj-$(CONFIG_PHY_BRCM_USB)  += phy-brcm-usb-dvr.o
>  phy-brcm-usb-dvr-objs := phy-brcm-usb.o phy-brcm-usb-init.o
>  
>  obj-$(CONFIG_PHY_BCM_SR_PCIE)+= phy-bcm-sr-pcie.o
> +obj-$(CONFIG_PHY_BCM_SR_USB) += phy-bcm-sr-usb.o
> diff --git a/drivers/phy/broadcom/phy-bcm-sr-usb.c 
> b/drivers/phy/broadcom/phy-bcm-sr-usb.c
> new file mode 100644
> index 000..52484b3
> --- /dev/null
> +++ b/drivers/phy/broadcom/phy-bcm-sr-usb.c
> @@ -0,0 +1,373 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2016-2018 Broadcom
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +enum bcm_usb_phy_version {
> + BCM_USB_PHY_V1,
> + BCM_USB_PHY_V2,
> +};
> +
> +enum bcm_usb_phy_reg {
> + PLL_NDIV_FRAC,
> + PLL_NDIV_INT,
> + PLL_CTRL,
> + PHY_CTRL,
> + PHY_PLL_CTRL,
> +};
> +
> +/* USB PHY registers */
> +
> +static const u8 bcm_usb_u3phy_v1[] = {
> + [PLL_CTRL]  = 0x18,
> + [PHY_CTRL]  = 0x14,
> +};
> +
> +static const u8 bcm_usb_u2phy_v1[] = {
> + [PLL_NDIV_FRAC] = 0x04,
> + [PLL_NDIV_INT]  = 0x08,
> + [PLL_CTRL]  = 0x0c,
> + [PHY_CTRL]  = 0x10,
> +};
> +
> +#define HSPLL_NDIV_INT_VAL   0x13
> +#define HSPLL_NDIV_FRAC_VAL  0x1005
> +
> +static const u8 bcm_usb_u2phy_v2[] = {
> + [PLL_NDIV_FRAC] = 0x0,
> + [PLL_NDIV_INT]  = 0x4,
> + [PLL_CTRL]  = 0x8,
> + [PHY_CTRL]  = 0xc,
> +};
> +
> +enum pll_ctrl_bits {
> + PLL_RESETB,
> + SSPLL_SUSPEND_EN,
> + PLL_SEQ_START,
> + PLL_LOCK,
> + PLL_PDIV,
> +};
> +
> +static const u8 u3pll_ctrl[] = {
> + [PLL_RESETB]= 0,
> + [SSPLL_SUSPEND_EN]  = 1,
> + [PLL_SEQ_START] = 2,
> + [PLL_LOCK]  = 3,
> +};
> +
> +#define HSPLL_PDIV_MASK  0xF
> +#define HSPLL_PDIV_VAL   0x1
> +
> +static const u8 u2pll_ctrl[] = {
> + [PLL_PDIV]  = 1,
> + [PLL_RESETB]= 5,
> + [PLL_LOCK]  = 6,
> +};
> +
> +enum bcm_usb_phy_ctrl_bits {
> + CORERDY,
> + AFE_LDO_PWRDWNB,
> + AFE_PLL_PWRDWNB,
> + AFE_BG_PWRDWNB,
> + PHY_ISO,
> + PHY_RESETB,
> + PHY_PCTL,
> +};
> +
> +#define PHY_PCTL_MASK0x
> +/*
> + * 0x0806 of PCTL_VAL has below bits set
> + * BIT-8 : refclk divider 1
> + * BIT-3:2: device mode; mode is not effect
> + * BIT-1: soft reset active low
> + */
> +#define HSPHY_PCTL_VAL   0x0806
> +#define SSPHY_PCTL_VAL   0x0006
> +
> +static const u8 u3phy_ctrl[] = {
> + [PHY_RESETB]= 1,
> + [PHY_PCTL]  = 2,
> +};
> +
> +static const u8 u2phy_ctrl[] = {
> + [CORERDY]   = 0,
> + [AFE_LDO_PWRDWNB]   = 1,
> + [AFE_PLL_PWRDWNB]   = 2,
> + [AFE_BG_PWRDWNB]= 3,
> + [PHY_ISO]   = 4,
> + [PHY_RESETB]= 5,
> + [PHY_PCTL]  = 6,
> +};
> +
> +struct bcm_usb_phy_cfg {
> + uint32_t type;
> + uint32_t ver;
> + void __iomem *regs;
> + struct phy *phy;
> + const u8 *offset;
> +};
> +
> +#define PLL_LOCK_RETRY_COUNT 1000
> +
> +enum bcm_usb_phy_type {
> + USB_HS_PHY,
> + USB_SS_PHY,
> +};
> +
> +static inline void bcm_usb_reg32_clrbits(void __iomem *addr, 

Re: [PATCH] Uprobes: Fix kernel oops with delayed_uprobe_remove()

2018-12-03 Thread Steven Rostedt
On Mon, 3 Dec 2018 11:52:41 +0530
Ravi Bangoria  wrote:

> Hi Steve,
> 
> Please pull this patch.
> 

Please send a v2 version of the patch with the updated change log. And
should it have a Fixes and be tagged for stable?

-- Steve

> Thanks.
> 
> On 11/15/18 6:13 PM, Oleg Nesterov wrote:
> > On 11/15, Ravi Bangoria wrote:  
> >>
> >> There could be a race between task exit and probe unregister:
> >>
> >>   exit_mm()
> >>   mmput()
> >>   __mmput() uprobe_unregister()
> >>   uprobe_clear_state()  put_uprobe()
> >>   delayed_uprobe_remove()   delayed_uprobe_remove()
> >>
> >> put_uprobe() is calling delayed_uprobe_remove() without taking
> >> delayed_uprobe_lock and thus the race sometimes results in a
> >> kernel crash. Fix this by taking delayed_uprobe_lock before
> >> calling delayed_uprobe_remove() from put_uprobe().
> >>
> >> Detailed crash log can be found at:
> >>   https://lkml.org/lkml/2018/11/1/1244  
> > 
> > Thanks, looks good,
> > 
> > Oleg.
> >   



Re: [PATCH v2 2/5] devfreq: add support for suspend/resume of a devfreq device

2018-12-03 Thread Chanwoo Choi
Hi Lukasz,

I am adding a comment about 'suspend_count'.

On 2018년 12월 04일 14:43, Chanwoo Choi wrote:
> Hi,
> 
> On 2018년 12월 04일 14:36, Chanwoo Choi wrote:
>> Hi Lukasz,
>>
>> Looks good to me. But, I add the some comments.
>> If you will fix it, feel free to add my tag:
>> Reviewed-by: Chanwoo choi 
> 
> Sorry. Fix typo 'choi' to 'Choi' as following.
> Reviewed-by: Chanwoo Choi 
> 
>>
>> On 2018년 12월 03일 23:31, Lukasz Luba wrote:
>>> The patch prepares devfreq device for handling suspend/resume
>>> functionality.  The new fields will store needed information during this
>>
>> nitpick. Remove unneeded space. There are two spaces between '.' and 'The 
>> new'. 
>>
>>> process.  Devfreq framework handles opp-suspend DT entry and there is no
>>
>> ditto.
>>
>>> need of modyfications in the drivers code.  It uses atomic variables to
>>
>> ditto.
>>
>>> make sure no race condition affects the process.
>>>
>>> The patch is based on earlier work by Tobias Jakobi.
>>
>> Please remove it from each patch description.
>>
>>>
>>> Suggested-by: Tobias Jakobi 
>>> Suggested-by: Chanwoo Choi 
>>> Signed-off-by: Lukasz Luba 
>>> ---
>>>  drivers/devfreq/devfreq.c | 51 
>>> +++
>>>  include/linux/devfreq.h   |  7 +++
>>>  2 files changed, 50 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
>>> index a9fd61b..36bed24 100644
>>> --- a/drivers/devfreq/devfreq.c
>>> +++ b/drivers/devfreq/devfreq.c
>>> @@ -316,6 +316,10 @@ static int devfreq_set_target(struct devfreq *devfreq, 
>>> unsigned long new_freq,
>>> "Couldn't update frequency transition information.\n");
>>>  
>>> devfreq->previous_freq = new_freq;
>>> +
>>> +   if (devfreq->suspend_freq)
>>> +   devfreq->resume_freq = cur_freq;
>>> +
>>> return err;
>>>  }
>>>  
>>> @@ -667,6 +671,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
>>> }
>>> devfreq->max_freq = devfreq->scaling_max_freq;
>>>  
>>> +   devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
>>> +   atomic_set(>suspend_count, 0);
>>> +
>>> dev_set_name(>dev, "devfreq%d",
>>> atomic_inc_return(_no));
>>> err = device_register(>dev);
>>> @@ -867,14 +874,28 @@ EXPORT_SYMBOL(devm_devfreq_remove_device);
>>>   */
>>>  int devfreq_suspend_device(struct devfreq *devfreq)
>>>  {
>>> +   int ret;
>>> +
>>> if (!devfreq)
>>> return -EINVAL;
>>>  
>>> -   if (!devfreq->governor)
>>> -   return 0;
>>> +   if (devfreq->governor) {
>>> +   ret = devfreq->governor->event_handler(devfreq,
>>> +   DEVFREQ_GOV_SUSPEND, NULL);
>>> +   if (ret)
>>> +   return ret;
>>> +   }
>>> +
>>> +   if (devfreq->suspend_freq) {
>>> +   if (atomic_inc_return(>suspend_count) > 1)
>>> +   return 0;
>>> +
>>> +   ret = devfreq_set_target(devfreq, devfreq->suspend_freq, 0);
>>> +   if (ret)
>>> +   return ret;
>>> +   }

In this patch, if a user calls devfreq_suspend_device() twice,
'devfreq->governor->event_handler(devfreq, DEVFREQ_GOV_SUSPEND, NULL)'
is called twice, but devfreq_set_target() is called only once.
I know that this causes no functional problem.

However, I think it would be better to use 'suspend_count' as the
reference count of devfreq_suspend/resume_device() as a whole. If
'suspend_count' is used to check whether this devfreq is already
suspended, we can skip the redundant calls when it is suspended twice.

The clock and regulator frameworks use the same 'reference count'
method to avoid redundant calls.
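
A minimal userspace sketch of that reference-count pattern, using C11 atomics in place of the kernel's atomic_t. The struct and function names are hypothetical, and the hw_* fields merely stand in for the governor event and frequency change; unbalanced resume calls are not handled.

```c
#include <stdatomic.h>

struct ex_dev {
	atomic_int suspend_count;
	int hw_suspended;	/* stands in for governor + frequency state */
	int hw_calls;		/* counts how often the "hardware" was touched */
};

static int ex_suspend(struct ex_dev *d)
{
	/* atomic_fetch_add() returns the old value: only the
	 * 0 -> 1 transition performs the actual suspend work. */
	if (atomic_fetch_add(&d->suspend_count, 1) > 0)
		return 0;

	d->hw_suspended = 1;
	d->hw_calls++;
	return 0;
}

static int ex_resume(struct ex_dev *d)
{
	/* Only the 1 -> 0 transition performs the actual resume work. */
	if (atomic_fetch_sub(&d->suspend_count, 1) > 1)
		return 0;

	d->hw_suspended = 0;
	d->hw_calls++;
	return 0;
}
```

With the counter checked first, a second suspend call returns before any governor or frequency work is done, and only the last resume call undoes it.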


>>>  
>>> -   return devfreq->governor->event_handler(devfreq,
>>> -   DEVFREQ_GOV_SUSPEND, NULL);
>>> +   return 0;
>>>  }
>>>  EXPORT_SYMBOL(devfreq_suspend_device);
>>>  
>>> @@ -888,14 +909,28 @@ EXPORT_SYMBOL(devfreq_suspend_device);
>>>   */
>>>  int devfreq_resume_device(struct devfreq *devfreq)
>>>  {
>>> +   int ret;
>>> +
>>> if (!devfreq)
>>> return -EINVAL;
>>>  
>>> -   if (!devfreq->governor)
>>> -   return 0;
>>> +   if (devfreq->resume_freq) {
>>> +   if (atomic_dec_return(>suspend_count) >= 1)
>>> +   return 0;

ditto.

>>>  
>>> -   return devfreq->governor->event_handler(devfreq,
>>> -   DEVFREQ_GOV_RESUME, NULL);
>>> +   ret = devfreq_set_target(devfreq, devfreq->resume_freq, 0);
>>> +   if (ret)
>>> +   return ret;
>>> +   }
>>> +
>>> +   if (devfreq->governor) {
>>> +   ret = devfreq->governor->event_handler(devfreq,
>>> +   DEVFREQ_GOV_RESUME, NULL);
>>> +   if (ret)
>>> +   return ret;
>>> +   }
>>> +
>>> +   return 0;
>>>  }
>>>  EXPORT_SYMBOL(devfreq_resume_device);
>>>  
>>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>>> index e4963b0..d985199 100644
>>> --- 

linux-next: Tree for Dec 4

2018-12-03 Thread Stephen Rothwell
Hi all,

Changes since 20181203:

The rdma tree gained a build failure so I used the version from
next-20181203.

The bpf-next tree gained conflicts against the bpf tree.

The char-misc tree gained a conflict against the char-misc.current tree.

The akpm tree lost its build failure but gained a conflict against the
pm tree.

Non-merge commits (relative to Linus' tree): 5958
 6050 files changed, 291311 insertions(+), 163581 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 286 trees (counting Linus' and 67 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (0072a0c14d5b Merge tag 'media/v4.20-4' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media)
Merging fixes/master (d8c137546ef8 powerpc: tag implicit fall throughs)
Merging kbuild-current/fixes (ccda4af0f4b9 Linux 4.20-rc2)
Merging arc-current/for-curr (10d443431dc2 ARC: io.h: Implement 
reads{x}()/writes{x}())
Merging arm-current/fixes (e46daee53bb5 ARM: 8806/1: kprobes: Fix false 
positive with FORTIFY_SOURCE)
Merging arm64-fixes/for-next/fixes (ea2412dc21cc ACPI/IORT: Fix 
iort_get_platform_device_domain() uninitialized pointer value)
Merging m68k-current/for-linus (58c116fb7dc6 m68k/sun3: Remove is_medusa and 
m68k_pgtable_cachemode)
Merging powerpc-fixes/fixes (bf3d6afbb234 powerpc: Look for "stdout-path" when 
setting up legacy consoles)
Merging sparc/master (f3f950dba37b Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (35b827b6d061 tun: forbid iface creation with rtnl ops)
Merging bpf/master (dcb40590e69e bpf: refactor bpf_test_run() to separate own 
failures and test program result)
Merging ipsec/master (4a135e538962 xfrm_user: fix freeing of xfrm states on 
acquire)
Merging netfilter/master (d78a5ebd8b18 Merge branch '1GbE' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue)
Merging ipvs/master (feb9f55c33e5 netfilter: nft_dynset: allow dynamic updates 
of non-anonymous set)
Merging wireless-drivers/master (2e6e902d1850 Linux 4.20-rc4)
Merging mac80211/master (113f3aaa81bd cfg80211: Prevent regulatory restore 
during STA disconnect in concurrent interfaces)
Merging rdma-fixes/for-rc (7bca603a69c0 RDMA/mlx5: Initialize return variable 
in case pagefault was skipped)
Merging sound-current/for-linus (5f8cf7125826 ALSA: usb-audio: Fix UAF 
decrement if card has no live interfaces in card.c)
Merging sound-asoc-fixes/for-linus (280ea4299e05 Merge branch 'asoc-4.20' into 
asoc-linus)
Merging regmap-fixes/for-linus (9ff01193a20d Linux 4.20-rc3)
Merging regulator-fixes/for-linus (fea4962497d8 Merge branch 'regulator-4.20' 
into regulator-linus)
Merging spi-fixes/for-linus (9ea83d4c2b9a Merge branch 'spi-4.20' into 
spi-linus)
Merging pci-current/for-linus (c74eadf881ad Merge remote-tracking branch 
'lorenzo/pci/controller-fixes' into for-linus)
Merging driver-core.current/driver-core-linus (2595646791c3 Linux 4.20-rc5)
Merging tty.current/tty-linus (2a48602615e0 tty: do not set TTY_IO_ERROR flag 
if console port)
Merging usb.current/usb-linus (2595646791c3 Linux 4.20-rc5)
Merging usb-gadget-fixes/fixes (069caf5950df USB: omap_udc: fix rejection of 
out transfers when DMA is used)
Merging usb-serial

Re: [PATCH AUTOSEL 4.14 25/35] iomap: sub-block dio needs to zeroout beyond EOF

2018-12-03 Thread Greg KH
On Mon, Dec 03, 2018 at 11:22:46PM +0159, Thomas Backlund wrote:
> Den 2018-12-03 kl. 11:22, skrev Sasha Levin:
> 
> > 
> > This is a case where theory collides with the real world. Yes, our QA is
> > lacking, but we don't have the option of not doing the current process.
> > If we stop backporting until a future data where our QA problem is
> > solved we'll end up with what we had before: users stuck on ancient
> > kernels without a way to upgrade.
> > 
> 
> Sorry, but you seem to be living in a different "real world"...
> 
> People stay on "ancient kernels" that "just works" instead of updating
> to a newer one that "hopefully/maybe/... works"

That's not good as those "ancient kernels" really just are "kernels with
lots of known security bugs".

It's your systems, I can't tell you what to do, but I will tell you that
running older, unfixed kernels, is a known liability.

Good luck!

greg k-h


Re: [patch 1/2 for-4.20] mm, thp: restore node-local hugepage allocations

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 15:50:24, David Rientjes wrote:
> This is a full revert of ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for
> MADV_HUGEPAGE mappings") and a partial revert of 89c83fb539f9 ("mm, thp:
> consolidate THP gfp handling into alloc_hugepage_direct_gfpmask").
> 
> By not setting __GFP_THISNODE, applications can allocate remote hugepages
> when the local node is fragmented or low on memory when either the thp
> defrag setting is "always" or the vma has been madvised with
> MADV_HUGEPAGE.
> 
> Remote access to hugepages often has much higher latency than local pages
> of the native page size.  On Haswell, ac5b2c18911f was shown to have a
> 13.9% access regression after this commit for binaries that remap their
> text segment to be backed by transparent hugepages.
> 
> The intent of ac5b2c18911f is to address an issue where a local node is
> low on memory or fragmented such that a hugepage cannot be allocated.  In
> every scenario where this was described as a fix, there is abundant and
> unfragmented remote memory available to allocate from, even with a greater
> access latency.
> 
> If remote memory is also low or fragmented, not setting __GFP_THISNODE was
> also measured on Haswell to have a 40% regression in allocation latency.
> 
> Restore __GFP_THISNODE for thp allocations.
> 
> Fixes: ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
> Fixes: 89c83fb539f9 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask")

At minimum, do not remove the cleanup part which consolidates the gfp
handling into a single place. There is no real reason to have the
__GFP_THISNODE ugliness outside of alloc_hugepage_direct_gfpmask.

I still hate the __GFP_THISNODE part as mentioned before. It is an ugly
hack but I can learn to live with it if this is indeed the only option
for the short term workaround until we find a proper solution.

> Signed-off-by: David Rientjes 
> ---
>  include/linux/mempolicy.h |  2 --
>  mm/huge_memory.c  | 42 +++
>  mm/mempolicy.c|  7 ---
>  3 files changed, 20 insertions(+), 31 deletions(-)
> 
> diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
> --- a/include/linux/mempolicy.h
> +++ b/include/linux/mempolicy.h
> @@ -139,8 +139,6 @@ struct mempolicy *mpol_shared_policy_lookup(struct 
> shared_policy *sp,
>  struct mempolicy *get_task_policy(struct task_struct *p);
>  struct mempolicy *__get_vma_policy(struct vm_area_struct *vma,
>   unsigned long addr);
> -struct mempolicy *get_vma_policy(struct vm_area_struct *vma,
> - unsigned long addr);
>  bool vma_policy_mof(struct vm_area_struct *vma);
>  
>  extern void numa_default_policy(void);
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -632,37 +632,27 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct 
> vm_fault *vmf,
>  static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct 
> *vma, unsigned long addr)
>  {
>   const bool vma_madvised = !!(vma->vm_flags & VM_HUGEPAGE);
> - gfp_t this_node = 0;
> -
> -#ifdef CONFIG_NUMA
> - struct mempolicy *pol;
> - /*
> -  * __GFP_THISNODE is used only when __GFP_DIRECT_RECLAIM is not
> -  * specified, to express a general desire to stay on the current
> -  * node for optimistic allocation attempts. If the defrag mode
> -  * and/or madvise hint requires the direct reclaim then we prefer
> -  * to fallback to other node rather than node reclaim because that
> -  * can lead to excessive reclaim even though there is free memory
> -  * on other nodes. We expect that NUMA preferences are specified
> -  * by memory policies.
> -  */
> - pol = get_vma_policy(vma, addr);
> - if (pol->mode != MPOL_BIND)
> - this_node = __GFP_THISNODE;
> - mpol_cond_put(pol);
> -#endif
> + const gfp_t gfp_mask = GFP_TRANSHUGE_LIGHT | __GFP_THISNODE;
>  
> + /* Always do synchronous compaction */
>   if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags))
> - return GFP_TRANSHUGE | (vma_madvised ? 0 : __GFP_NORETRY);
> + return GFP_TRANSHUGE | __GFP_THISNODE |
> +(vma_madvised ? 0 : __GFP_NORETRY);
> +
> + /* Kick kcompactd and fail quickly */
>   if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags))
> - return GFP_TRANSHUGE_LIGHT | __GFP_KSWAPD_RECLAIM | this_node;
> + return gfp_mask | __GFP_KSWAPD_RECLAIM;
> +
> + /* Synchronous compaction if madvised, otherwise kick kcompactd */
>   if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags))
> - return GFP_TRANSHUGE_LIGHT | (vma_madvised ? 
> __GFP_DIRECT_RECLAIM :
> -  
> __GFP_KSWAPD_RECLAIM | this_node);
> + 

Re: [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind()

2018-12-03 Thread Aneesh Kumar K.V

On 12/4/18 5:04 AM, jgli...@redhat.com wrote:

From: Jérôme Glisse 

Heterogeneous memory systems are becoming more and more the norm. In
those systems there is not only the main system memory for each node,
but also device memory and/or a memory hierarchy to consider. Device
memory can come from a device like a GPU, FPGA, ... or from a
memory-only device (persistent memory, or a high density memory device).

A memory hierarchy is when you not only have the main memory but also
other types of memory like HBM (High Bandwidth Memory, often stacked
on the CPU die or GPU die), persistent memory or high density memory (i.e.
something slower than a regular DDR DIMM but much bigger).

On top of this diversity of memories you also have to account for the
system bus topology, i.e. how all CPUs and devices are connected to each
other. Userspace does not care about the exact physical topology but
cares about the topology from a behavior point of view, i.e. what are all
the paths between an initiator (anything that can initiate memory access,
like a CPU, GPU, FPGA, network controller ...) and a target memory, and
what are all the properties of each of those paths (bandwidth, latency,
granularity, ...).

This means that it is no longer sufficient to consider a flat view
for each node in a system; for maximum performance we need to
account for all of this new memory and also for the system topology.
This is why this proposal is unlike the HMAT proposal [1], which
tries to extend the existing NUMA for new types of memory. Here we
are tackling a much more profound change that departs from NUMA.


One of the reasons for a radical change is that the advance of accelerators
like GPUs or FPGAs means that the CPU is no longer the only place where
computation happens. It is becoming more and more common for an
application to use a mix and match of different accelerators to
perform its computation. So we can no longer satisfy ourselves with
a CPU-centric and flat view of a system like NUMA and NUMA distance.


This patchset is a proposal to tackle these problems through three
aspects:
 1 - Expose complex system topology and the various kinds of memory
 to user space so that applications have a standard way and a
 single place to get all the information they care about.
 2 - A new API for user space to bind/provide hints to the kernel on
 which memory to use for a range of virtual addresses (a new
 mbind() syscall).
 3 - Kernel side changes for vm policy to handle these changes.

This patchset is not an end-to-end solution but it provides enough
pieces to be useful against nouveau (the upstream open source driver for
NVidia GPUs). It is intended as a starting point for discussion so
that we can figure out what to do. To avoid having too many topics
to discuss I am not considering memory cgroups for now, but it is
definitely something we will want to integrate with.

The rest of this email is split into 3 sections. The first section
talks about complex system topology: what it is, how it is used today
and how to describe it tomorrow. The second section talks about the
new API to bind/provide hints to the kernel for a range of virtual
addresses. The third section talks about a new mechanism to track
binds/hints provided by user space or device drivers inside the kernel.


1) Complex system topology and representing them


Inside a node you can have a complex topology of memory. For instance
you can have multiple HBM memories in a node, each HBM memory tied to a
set of CPUs (all of which are in the same node). This means that you
have a hierarchy of memory for CPUs: the local fast HBM, which is
expected to be relatively small compared to main memory, and then the
main memory. New memory technology might also deepen this hierarchy
with another level of yet slower memory that is gigantic in size (some
persistent memory technology might fall into that category). Another
example is device memory, and devices themselves can have a hierarchy,
like HBM on top of the device core and main device memory.

On top of that you can have multiple paths to access each memory and
each path can have different properties (latency, bandwidth, ...).
Also there is not always symmetry, i.e. some memory might only be
accessible by some devices or CPUs, not by everyone.

So a flat hierarchy for each node is not capable of representing this
kind of complexity. To simplify discussion and because we do not want
to single out CPU from device, from here on out we will use initiator
to refer to either CPU or device. An initiator is any kind of CPU or
device that can access memory (ie initiate memory access).
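
To make the initiator/target view concrete, here is a minimal userspace
sketch, not the data structures proposed by the patchset. struct hms_link,
hms_best_target() and hms_can_access() are hypothetical names invented for
this illustration; each link describes one path from an initiator to a
target memory together with its bandwidth.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one initiator -> target memory path. */
struct hms_link {
	int initiator;        /* index of a CPU complex, GPU, ... */
	int target;           /* index of a memory (HBM, main memory, ...) */
	unsigned int bw_gbs;  /* bandwidth of this path in GB/s */
};

/*
 * Return the target memory reachable from 'initiator' over the
 * highest-bandwidth path, or -1 if the initiator reaches no memory.
 */
int hms_best_target(const struct hms_link *links, size_t nlinks, int initiator)
{
	int best = -1;
	unsigned int best_bw = 0;

	assert(links != NULL || nlinks == 0);
	for (size_t i = 0; i < nlinks; i++) {
		if (links[i].initiator != initiator)
			continue;
		if (best < 0 || links[i].bw_gbs > best_bw) {
			best = links[i].target;
			best_bw = links[i].bw_gbs;
		}
	}
	return best;
}

/* Return 1 if 'initiator' has at least one path to 'target', else 0. */
int hms_can_access(const struct hms_link *links, size_t nlinks,
		   int initiator, int target)
{
	assert(links != NULL || nlinks == 0);
	for (size_t i = 0; i < nlinks; i++)
		if (links[i].initiator == initiator && links[i].target == target)
			return 1;
	return 0;
}
```

A list of links rather than a per-node distance matrix keeps asymmetry
representable: a memory that some initiator cannot reach simply has no
link for that initiator.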

At this point an example of such a system might help:
 - 2 nodes and for each node:
 - 1 CPU per node with 2 complex of CPUs cores per CPU
 - one HBM memory for each complex of CPUs cores (200GB/s)
 - CPUs cores complex are linked to each other (100GB/s)
 - main memory is (90GB/s)
 - 4 GPUs each with:
 - HBM memory for 

Re: [RFC/RFT][PATCH v6] cpuidle: New timer events oriented governor for tickless systems

2018-12-03 Thread Rafael J. Wysocki
On Thursday, November 29, 2018 12:20:07 AM CET Doug Smythies wrote:
> On 2018.11.23 02:36 Rafael J. Wysocki wrote:
> 
> v5 -> v6:
>  * Avoid applying poll_time_limit to non-polling idle states by mistake.
>  * Use idle duration measured by the governor for everything (as it likely is
>more accurate than the one measured by the core).
> 
> -- above missing-- (see follow up e-mail from Rafael)
> 
>  * Rename SPIKE to PULSE.
>  * Do not run pattern detection upfront.  Instead, use recent idle duration
>values to refine the state selection after finding a candidate idle state.
>  * Do not use the expected idle duration as an extra latency constraint
>(exit latency is less than the target residency for all of the idle states
>known to me anyway, so this doesn't change anything in practice).
> 
> Hi Rafael,
> 
> I did some minimal testing on teov6, using kernel 4.20-rc3 as my baseline
> reference kernel.
> 
> Test 1: Phoronix dbench test, all options: 1, 6, 12, 48, 128, 256 clients.
> 
> Note: because it uses the disk, the dbench test is somewhat non-repeatable.
> However, if particular attention is paid to not doing anything else with
> the disk between tests, then it seems to be repeatable to within about 6%.
> 
> Anyway no significant difference observed between kernel 4.20-rc3 and the
> same with the teov6 patch.
> 
> Test 2: Pipe test, non cross core. (And idle state 0 test, really)
> I ran 4 pipe tests, 1 for each of my 4 cores, @2 CPUs per core.
> Thus, pretty much only idle state 0 was ever used.
> Processor package power was similar for both kernels.
> teov6 entered/exited idle state 0 about 60,984 times/second/cpu.
> -rc3 entered/exited idle state 0 about 62,806 times/second/cpu.
> There was a difference in percentage time spent in idle state 0,
> with kernel 4.20-rc3 spending 0.2441% in idle state 0 versus
> teov6 at 0.0641%.
> 
> For throughput, teov6 was 1.4% faster.

This may indicate that teov6 is somewhat too aggressive.

> Test 3: was an attempt to sweep through a preference for
> all idle states.
> 
> 40 threads were launched with nothing to do except sleep
> for a variable duration of 1 to 500 uSec, each step was
> run for 1 minute. With 1 minute idle before the test and a few
> minutes idle after, the total test duration was about 505 minutes.
> Recall that when one asks for a short sleep of 1 uSec, they actually
> get about 50 uSec, due to overheads. So I use 40 threads in an attempt
> to get the average time between wakeup events per CPU down somewhat.
> 
> The results are here:
> http://fast.smythies.com/linux-pm/k420/k420-pn-sweep-teo6-2.htm

And, so long as my understanding of the graphs is correct, the results
here indicate that teov6 tends to prefer relatively shallow idle states
which is good for performance (at least with some workloads), but not
necessarily for energy-efficiency.

I will send a v7 of TEO with some changes to make it a bit more
energy-efficient with respect to the v6.

Thanks,
Rafael



Re: [PATCH memory-model 0/3] Updates to the formal memory model

2018-12-03 Thread Paul E. McKenney
On Tue, Dec 04, 2018 at 08:28:03AM +0900, Akira Yokosawa wrote:
> On 2018/12/03 15:04:11 -0800, Paul E. McKenney wrote:
> > Hello, Ingo!
> > 
> > This series contains updates to the Linux kernel's formal memory model
> > in tools/memory-model.  These patches are ready for inclusion into -tip.
> > 
> > 1.  Model smp_mb__after_unlock_lock(), courtesy of Andrea Parri.
> > 
> > 2.  Add scripts to check github litmus tests.
> > 
> > 3.  Make scripts take "-j" abbreviation for "--jobs".
> > 
> > There is another series in preparation to model SRCU, but this series
> > requires hot-off-the presses changes to the herd tool that have not yet
> > been released.  This SRCU series is therefore targeting the merge window
> > after the upcoming one.  People wishing to experiment with the prototype
> > SRCU model may obtain it from my -rcu tree at branch "dev", and use
> > a bleeding-edge herd7 built from https://github.com/herd/herdtools7/,
> > version 7.51+2(dev), which is (commit 10403b24070c) or later.
> 
> On the master branch of herdtools7, SRCU support was added in version
> 7.51+4(dev), which is commit 6ec9da1f4d58, or later.

It has been working for me with version 7.51+2(dev), but perhaps I
have just been getting lucky.  It wouldn't be the first time!  ;-)

Thanx, Paul

> Thanks, Akira
> 
> > 
> > Thanx, Paul
> > 
> > 
> > 
> >  .gitignore |1 
> >  README |2 
> >  linux-kernel.bell  |3 
> >  linux-kernel.cat   |4 -
> >  linux-kernel.def   |1 
> >  scripts/README |   70 ++
> >  scripts/checkalllitmus.sh  |   53 +++--
> >  scripts/checkghlitmus.sh   |   65 
> >  scripts/checklitmus.sh |   74 +++
> >  scripts/checklitmushist.sh |   60 +++
> >  scripts/cmplitmushist.sh   |   87 +++
> >  scripts/initlitmushist.sh  |   68 +
> >  scripts/judgelitmus.sh |   78 +
> >  scripts/newlitmushist.sh   |   61 +++
> >  scripts/parseargs.sh   |  140 
> > -
> >  scripts/runlitmushist.sh   |   87 +++
> >  16 files changed, 757 insertions(+), 97 deletions(-)
> > 
> 



Re: [PATCH] arm64: dts: qcom: msm8998: Fix compatible of scm node

2018-12-03 Thread Stephen Boyd
Quoting Bjorn Andersson (2018-11-29 22:56:55)
> The scm binding and driver was updated to rely on the fallback to the
> default qcom,scm for any modern SoC and as such both are required. Add
> the default compatible to make the scm instance probe.
> 
> Fixes: d850156a226a ("arm64: dts: qcom: msm8998: Add firmware node")
> Signed-off-by: Bjorn Andersson 
> ---

Reviewed-by: Stephen Boyd 



Re: [PATCH v1 2/2] mtd: spi-nor: add NPCM FIU controller driver

2018-12-03 Thread kbuild test robot
Hi Tomer,

I love your patch! Yet something to improve:

[auto build test ERROR on mtd/spi-nor/next]
[also build test ERROR on v4.20-rc5 next-20181203]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Tomer-Maimon/dt-binding-mtd-add-NPCM-FIU-controller/20181203-201804
base:   git://git.infradead.org/linux-mtd.git spi-nor/next
config: i386-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

>> drivers/mtd/spi-nor/npcm-fiu.c:25:10: fatal error: asm/sizes.h: No such file 
>> or directory
#include <asm/sizes.h>
 ^
   compilation terminated.

vim +25 drivers/mtd/spi-nor/npcm-fiu.c

24  
  > 25  #include <asm/sizes.h>
26  #include 
27  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




Re: [PATCH] x86/mpx: pass 'mm' to kernel_managing_mpx_tables() in mpx_notify_unmap()

2018-12-03 Thread Jarkko Sakkinen
On Mon, Dec 03, 2018 at 12:49:44PM -0800, Dave Hansen wrote:
> On 12/3/18 12:43 PM, Jarkko Sakkinen wrote:
> > If mm is not the same as current->mm, mpx_notify_unmap() will yield
> > invalid results and at worst will lead to a crash if it gets called by
> > a kthread.
> 
> It's also worth noting that this does not fix any actual,
> end-user-visible bug today.  It really only prepares the code for the
> case where it is called for a different mm than current->mm.
> 
> > --- a/arch/x86/mm/mpx.c
> > +++ b/arch/x86/mm/mpx.c
> > @@ -882,7 +882,7 @@ static int mpx_unmap_tables(struct mm_struct *mm,
> >   * necessary, and the 'vma' is the first vma in this range (start -> end).
> >   */
> >  void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
> > -   unsigned long start, unsigned long end)
> > + unsigned long start, unsigned long end)
> >  {
> > int ret;
> 
> Please leave superfluous whitespace changes out of these things.
> 
> But, otherwise, this looks fine.
> 
> > Fixes: 1de4fa14ee25 ("x86, mpx: Cleanup unused bound tables")
> 
> FWIW, I'm not sure you should be submitting this separately from your
> SGX series.  The deferred unmapping is really the thing that requires
> the code to be changed.

Thank you for the feedback. I'll include this to the next revision of the
SGX patch set and explain why the change is needed.
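
For illustration only, a small userspace sketch of why the helper should
take the mm explicitly. struct mm, current_mm and both functions below are
hypothetical stand-ins, not the kernel's mm_struct or the real MPX code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space; 0 means "no bounds directory". */
struct mm {
	unsigned long bd_addr;
};

/* Models current->mm. It is NULL when the caller is a kthread. */
static struct mm *current_mm;

/* Implicit variant: only meaningful when the caller acts on its own mm. */
int managing_tables_current(void)
{
	return current_mm != NULL && current_mm->bd_addr != 0;
}

/* Explicit variant: answers the question for the mm that is being unmapped. */
int managing_tables(const struct mm *mm)
{
	assert(mm != NULL);
	return mm->bd_addr != 0;
}
```

With deferred unmapping the caller's context and the target mm can differ,
so only the explicit variant gives the right answer. The sketch returns 0
for a NULL current_mm where real code dereferencing it would crash.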

/Jarkko


Re: [PATCH 2/2] irqchip/meson-gpio: Add support for Meson-G12A SoC

2018-12-03 Thread Xingyu Chen




On 2018/12/3 18:06, Neil Armstrong wrote:

On 03/12/2018 10:28, Xingyu Chen wrote:



On 2018/12/3 17:19, Jerome Brunet wrote:

On Mon, 2018-12-03 at 14:13 +0800, Xingyu Chen wrote:

The Meson-G12A SoC uses the same GPIO interrupt controller IP block as the
other Meson SoCs. A total of 100 pins can be monitored, which is the sum of:

- 223:100 undefined (no interrupt)
- 99:97   3 pins on bank GPIOE
- 96:77   20 pins on bank GPIOX
- 76:61   16 pins on bank GPIOA
- 60:53   8 pins on bank GPIOC
- 52:37   16 pins on bank BOOT
- 36:28   9 pins on bank GPIOH
- 27:12   16 pins on bank GPIOZ
- 11:0    12 pins in the AO domain

Signed-off-by: Xingyu Chen 
Signed-off-by: Jianxin Pan 
---
   drivers/irqchip/irq-meson-gpio.c | 5 +
   1 file changed, 5 insertions(+)

diff --git a/drivers/irqchip/irq-meson-gpio.c b/drivers/irqchip/irq-meson-gpio.c
index 7b531fd075b8..971e8dea069a 100644
--- a/drivers/irqchip/irq-meson-gpio.c
+++ b/drivers/irqchip/irq-meson-gpio.c
@@ -67,12 +67,17 @@ static const struct meson_gpio_irq_params axg_params = {
   .nr_hwirq = 100,
   };
   +static const struct meson_gpio_irq_params g12a_params = {
+    .nr_hwirq = 100,
+};
+


Same comment as on i2c, the g12 seems compatible with the axg.
Is this patchset really necessary?


Although the total number of pins is the same as on the Meson-AXG SoC, the gpio
banks and irq numbers are different. To avoid confusion in use, I think a new
compatible string is needed.


OK for the new compatible, but you can re-use the same struct like for i2c.

Neil


Thanks for your comment, I will fix it in the next version.
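
A minimal userspace sketch of the suggestion above, assuming the only
per-SoC data is the hwirq count: both compatible strings point at one
shared params struct. struct gpio_irq_params, struct match_entry and
lookup_params() are made-up names that only mimic the shape of the driver's
match table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the driver's per-SoC parameters. */
struct gpio_irq_params {
	unsigned int nr_hwirq;
};

/* One struct shared by every SoC that exposes 100 hwirqs. */
static const struct gpio_irq_params axg_params = {
	.nr_hwirq = 100,
};

/* Hypothetical match entry: compatible string plus its parameters. */
struct match_entry {
	const char *compatible;
	const struct gpio_irq_params *data;
};

/* G12A gets its own compatible string but re-uses the AXG parameters. */
static const struct match_entry matches[] = {
	{ "amlogic,meson-axg-gpio-intc",  &axg_params },
	{ "amlogic,meson-g12a-gpio-intc", &axg_params },
	{ NULL, NULL },
};

/* Return the parameters for 'compatible', or NULL if it is unknown. */
const struct gpio_irq_params *lookup_params(const char *compatible)
{
	assert(compatible != NULL);
	for (const struct match_entry *m = matches; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}
```

The device tree can still tell the two SoCs apart by compatible string,
while the driver carries a single copy of the identical data.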


   static const struct of_device_id meson_irq_gpio_matches[] = {
   { .compatible = "amlogic,meson8-gpio-intc", .data = &meson8_params },
   { .compatible = "amlogic,meson8b-gpio-intc", .data = &meson8b_params },
   { .compatible = "amlogic,meson-gxbb-gpio-intc", .data = &gxbb_params },
   { .compatible = "amlogic,meson-gxl-gpio-intc", .data = &gxl_params },
   { .compatible = "amlogic,meson-axg-gpio-intc", .data = &axg_params },
+    { .compatible = "amlogic,meson-g12a-gpio-intc", .data = &g12a_params },
   { }
   };
   






Re: [PATCH 2/3] mm/vmscan: Enable kswapd to reclaim low-protected memory

2018-12-03 Thread Xunlei Pang
On 2018/12/4 AM 1:22, Michal Hocko wrote:
> On Mon 03-12-18 23:20:31, Xunlei Pang wrote:
>> On 2018/12/3 PM 7:56, Michal Hocko wrote:
>>> On Mon 03-12-18 16:01:18, Xunlei Pang wrote:
>>>> There may be cgroup memory overcommitment, it will become
>>>> even more common in the future.
>>>>
>>>> Let's enable kswapd to reclaim low-protected memory in case
>>>> of memory pressure, to mitigate the global direct reclaim
>>>> pressures which could cause jitters to the response time of
>>>> latency-sensitive groups.
>>>
>>> Please be more descriptive about the problem you are trying to handle
>>> here. I haven't actually read the patch but let me emphasise that the
>>> low limit protection is important isolation tool. And allowing kswapd to
>>> reclaim protected memcgs is going to break the semantic as it has been
>>> introduced and designed.
>>
>> We have two types of memcgs: online groups(important business)
>> and offline groups(unimportant business). Online groups are
>> all configured with MAX low protection, while offline groups
>> are not at all protected(with default 0 low).
>>
>> When offline groups are overcommitted, the global memory pressure
>> suffers. This will cause the memory allocations from online groups
>> to constantly go to the slow global direct reclaim in order to reclaim
>> online's page caches, as kswapd is not able to reclaim low-protected
>> memory. low is not a hard limit, so it's reasonable for it to be
>> reclaimed by kswapd if there's no other reclaimable memory.
> 
> I am sorry I still do not follow. What role do offline cgroups play?
> Those are certainly not low mem protected because mem_cgroup_css_offline
> will reset them to 0.
> 

Oh, I meant "offline groups" to be "offline-business groups". The memcgs
referred to here are all in the "online state" from the kernel's perspective.


Re: [PATCH v2 2/5] devfreq: add support for suspend/resume of a devfreq device

2018-12-03 Thread Chanwoo Choi
Hi Lukasz,

Looks good to me, but I have added some comments.
If you fix them, feel free to add my tag:
Reviewed-by: Chanwoo Choi

On 2018-12-03 23:31, Lukasz Luba wrote:
> The patch prepares devfreq device for handling suspend/resume
> functionality.  The new fields will store needed information during this

nitpick. Remove unneeded space. There are two spaces between '.' and 'The new'. 

> process.  Devfreq framework handles opp-suspend DT entry and there is no

ditto.

> need of modifications in the drivers code.  It uses atomic variables to

ditto.

> make sure no race condition affects the process.
> 
> The patch is based on earlier work by Tobias Jakobi.

Please remove it from each patch description.

> 
> Suggested-by: Tobias Jakobi 
> Suggested-by: Chanwoo Choi 
> Signed-off-by: Lukasz Luba 
> ---
>  drivers/devfreq/devfreq.c | 51 
> +++
>  include/linux/devfreq.h   |  7 +++
>  2 files changed, 50 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index a9fd61b..36bed24 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -316,6 +316,10 @@ static int devfreq_set_target(struct devfreq *devfreq, 
> unsigned long new_freq,
>   "Couldn't update frequency transition information.\n");
>  
>   devfreq->previous_freq = new_freq;
> +
> + if (devfreq->suspend_freq)
> + devfreq->resume_freq = cur_freq;
> +
>   return err;
>  }
>  
> @@ -667,6 +671,9 @@ struct devfreq *devfreq_add_device(struct device *dev,
>   }
>   devfreq->max_freq = devfreq->scaling_max_freq;
>  
> + devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
> + atomic_set(&devfreq->suspend_count, 0);
> +
>   dev_set_name(&devfreq->dev, "devfreq%d",
>   atomic_inc_return(&devfreq_no));
>   err = device_register(&devfreq->dev);
> @@ -867,14 +874,28 @@ EXPORT_SYMBOL(devm_devfreq_remove_device);
>   */
>  int devfreq_suspend_device(struct devfreq *devfreq)
>  {
> + int ret;
> +
>   if (!devfreq)
>   return -EINVAL;
>  
> - if (!devfreq->governor)
> - return 0;
> + if (devfreq->governor) {
> + ret = devfreq->governor->event_handler(devfreq,
> + DEVFREQ_GOV_SUSPEND, NULL);
> + if (ret)
> + return ret;
> + }
> +
> + if (devfreq->suspend_freq) {
> + if (atomic_inc_return(&devfreq->suspend_count) > 1)
> + return 0;
> +
> + ret = devfreq_set_target(devfreq, devfreq->suspend_freq, 0);
> + if (ret)
> + return ret;
> + }
>  
> - return devfreq->governor->event_handler(devfreq,
> - DEVFREQ_GOV_SUSPEND, NULL);
> + return 0;
>  }
>  EXPORT_SYMBOL(devfreq_suspend_device);
>  
> @@ -888,14 +909,28 @@ EXPORT_SYMBOL(devfreq_suspend_device);
>   */
>  int devfreq_resume_device(struct devfreq *devfreq)
>  {
> + int ret;
> +
>   if (!devfreq)
>   return -EINVAL;
>  
> - if (!devfreq->governor)
> - return 0;
> + if (devfreq->resume_freq) {
> + if (atomic_dec_return(&devfreq->suspend_count) >= 1)
> + return 0;
>  
> - return devfreq->governor->event_handler(devfreq,
> - DEVFREQ_GOV_RESUME, NULL);
> + ret = devfreq_set_target(devfreq, devfreq->resume_freq, 0);
> + if (ret)
> + return ret;
> + }
> +
> + if (devfreq->governor) {
> + ret = devfreq->governor->event_handler(devfreq,
> + DEVFREQ_GOV_RESUME, NULL);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
>  }
>  EXPORT_SYMBOL(devfreq_resume_device);
>  
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index e4963b0..d985199 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -131,6 +131,9 @@ struct devfreq_dev_profile {
>   * @scaling_min_freq:Limit minimum frequency requested by OPP 
> interface
>   * @scaling_max_freq:Limit maximum frequency requested by OPP 
> interface
>   * @stop_polling: devfreq polling status of a device.
> + * @suspend_freq: frequency of a device set during suspend phase.
> + * @resume_freq:  frequency of a device set in resume phase.
> + * @suspend_count:suspend requests counter for a device.
>   * @total_trans: Number of devfreq transitions
>   * @trans_table: Statistics of devfreq transitions
>   * @time_in_state:   Statistics of devfreq states
> @@ -167,6 +170,10 @@ struct devfreq {
>   unsigned long scaling_max_freq;
>   bool stop_polling;
>  
> + unsigned long suspend_freq;
> + unsigned long resume_freq;
> + atomic_t suspend_count;
> +
>   /* information for device frequency transition */
>   unsigned int 

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread Michal Hocko
On Tue 04-12-18 11:05:57, Pingfan Liu wrote:
> During my test on some AMD machine, with kexec -l nr_cpus=x option, the
> kernel failed to bootup, because some node's data struct can not be allocated,
> e.g, on x86, initialized by init_cpu_to_node()->init_memory_less_node(). But
> device->numa_node info is used as preferred_nid param for
> __alloc_pages_nodemask(), which causes NULL reference
>   ac->zonelist = node_zonelist(preferred_nid, gfp_mask);
> This patch tries to fix the issue by falling back to the first online node,
> when encountering such corner case.

We have seen similar issues already and the bug was usually that the
zonelists were not initialized yet or the node is completely bogus.
Zonelists should be initialized by build_all_zonelists quite early so I
am wondering whether the latter is the case. What is the actual node
number the device is associated with?

Your patch is not correct btw, because we want to fall back to nodes in
distance order rather than to the first online node.
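
A minimal userspace sketch of a distance-ordered fallback, assuming a plain
distance table; it is not the kernel's implementation. MAX_NODES, the
online[] array, the distance table and nearest_online_node() are all
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4 /* hypothetical node count for this sketch */

/*
 * Return the online node closest to 'preferred' according to the
 * distance table, or -1 if no node is online. If 'preferred' itself is
 * online it wins, as long as its self-distance is the row minimum.
 * Ties are resolved in favour of the lowest node id.
 */
int nearest_online_node(int preferred, const bool *online,
			const int (*distance)[MAX_NODES])
{
	int best = -1;
	int best_dist = INT_MAX;

	assert(preferred >= 0 && preferred < MAX_NODES);
	for (int nid = 0; nid < MAX_NODES; nid++) {
		if (!online[nid])
			continue;
		if (distance[preferred][nid] < best_dist) {
			best = nid;
			best_dist = distance[preferred][nid];
		}
	}
	return best;
}
```

With nodes 0 and 2 online and node 3 requested, a table in which node 2 is
closer to node 3 than node 0 is makes this return node 2, whereas a "first
online node" fallback would pick node 0.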
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 2/9] tools/lib/traceevent: Added support for pkg-config

2018-12-03 Thread Namhyung Kim
Hi Steve,

On Fri, Nov 30, 2018 at 10:44:05AM -0500, Steven Rostedt wrote:
> From: Tzvetomir Stoyanov 
> 
> This patch implements integration with pkg-config framework.
> pkg-config can be used by the library users to determine
> required CFLAGS and LDFLAGS in order to use the library
> 
> Signed-off-by: Tzvetomir Stoyanov 
> Signed-off-by: Steven Rostedt (VMware) 
> ---

[SNIP]
> diff --git a/tools/lib/traceevent/libtraceevent.pc.template 
> b/tools/lib/traceevent/libtraceevent.pc.template
> new file mode 100644
> index ..42e4d6cb6b9e
> --- /dev/null
> +++ b/tools/lib/traceevent/libtraceevent.pc.template
> @@ -0,0 +1,10 @@
> +prefix=INSTALL_PREFIX
> +libdir=${prefix}/lib64

Don't we care about 32-bit systems anymore? :)

Thanks,
Namhyung


> +includedir=${prefix}/include/traceevent
> +
> +Name: libtraceevent
> +URL: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> +Description: Linux kernel trace event library
> +Version: LIB_VERSION
> +Cflags: -I${includedir}
> +Libs: -L${libdir} -ltraceevent
> -- 
> 2.19.1
> 
> 


Re: ext4 file system corruption with v4.19.3 / v4.19.4

2018-12-03 Thread Gunter Königsmann
After upgrading my kernel to 4.19 I got a corruption on nearly every
reboot or resume from suspend on my Acer s7-391 [UEFI boot].

Going to my UEFI setup and changing IDE mode from IDE to ATA seems to
have resolved the issue for me.

I don't know, though, if that is a valid data point or if it was a mere
accident (tested only on one computer) or just avoids the Bad Timing by
a few nanoseconds.




Re: [PATCH 1/3] dt-bindings: soc: qcom: Add AOSS QMP binding

2018-12-03 Thread Rob Herring
On Mon, Nov 12, 2018 at 12:05:55AM -0800, Bjorn Andersson wrote:
> Add binding for the QMP based side-channel communication mechanism to
> the AOSS, which is used to control resources not exposed through the
> RPMh interface.
> 
> Signed-off-by: Bjorn Andersson 
> ---
>  .../bindings/soc/qcom/qcom,aoss-qmp.txt   | 63 +++
>  include/dt-bindings/power/qcom-aoss-qmp.h | 15 +
>  2 files changed, 78 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
>  create mode 100644 include/dt-bindings/power/qcom-aoss-qmp.h
> 
> diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt 
> b/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
> new file mode 100644
> index ..e3c8cb4372f2
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
> @@ -0,0 +1,63 @@
> +Qualcomm Always-On Subsystem side channel binding
> +
> +This binding describes the hardware component responsible for side channel
> +requests to the always-on subsystem (AOSS), used for certain power management
> +requests that are not handled by the standard RPMh interface. Each client in the
> +SoC has its own block of message RAM and IRQ for communication with the AOSS.
> +The protocol used to communicate in the message RAM is known as QMP.
> +
> +- compatible:
> + Usage: required
> + Value type: 
> + Definition: must be "qcom,sdm845-aoss-qmp"
> +
> +- reg:
> + Usage: required
> + Value type: 
> + Definition: the base address and size of the message RAM for this
> + client's communication with the AOSS
> +
> +- interrupts:
> + Usage: required
> + Value type: 
> + Definition: should specify the AOSS message IRQ for this client
> +
> +- mboxes:
> + Usage: required
> + Value type: 
> + Definition: reference to the mailbox representing the outgoing doorbell
> + in APCS for this client, as described in mailbox/mailbox.txt
> +
> += AOSS Power-domains
> +The AOSS side channel exposes control over a set of resources, used to control
> +a set of debug related clocks and to affect the low power state of resources
> +related to the secondary subsystems. These resources are described as a set of
> +power-domains in a subnode of the AOSS side channel node.

Why does this need to be a sub-node? Are there other sub-nodes needed?

> +
> +- compatible:
> + Usage: required
> + Value type: 
> + Definition: must be "qcom,sdm845-aoss-qmp-pd"
> +
> +- #power-domain-cells:
> + Usage: required
> + Value type: 
> + Definition: must be 1


[PATCH] perf, tools: Support srccode output

2018-12-03 Thread Andi Kleen
From: Andi Kleen 

When looking at PT or brstackinsn traces with perf script
it can be very useful to see the source code. This adds a simple
facility to print the source lines with perf script, if the information
is available through DWARF.

% perf record ...
% perf script -F insn,ip,sym,srccode
...

  4004c6 main
5   for (i = 0; i < 1000; i++)
   4004cd main
5   for (i = 0; i < 1000; i++)
   4004c6 main
5   for (i = 0; i < 1000; i++)
   4004cd main
5   for (i = 0; i < 1000; i++)
   4004cd main
5   for (i = 0; i < 1000; i++)
   4004cd main
5   for (i = 0; i < 1000; i++)
   4004cd main
5   for (i = 0; i < 1000; i++)
   4004cd main
5   for (i = 0; i < 1000; i++)
   4004b3 main
6   v++;

% perf record -b ...
% perf script -F insn,ip,sym,srccode,brstackinsn

...
   main+22:
00400543  insn: e8 ca ff ff ff  # PRED
|18 f1();
f1:
00400512  insn: 55
|10   {
00400513  insn: 48 89 e5
00400516  insn: b8 00 00 00 00
|11 f2();
0040051b  insn: e8 d6 ff ff ff  # PRED
f2:
004004f6  insn: 55
|5  {
004004f7  insn: 48 89 e5
004004fa  insn: 8b 05 2c 0b 20 00
|6  c = a / b;
00400500  insn: 8b 0d 2a 0b 20 00
00400506  insn: 99
00400507  insn: f7 f9
00400509  insn: 89 05 29 0b 20 00
0040050f  insn: 90
|7  }
00400510  insn: 5d
00400511  insn: c3  # PRED
f1+14:
00400520  insn: b8 00 00 00 00
|12 f2();
00400525  insn: e8 cc ff ff ff  # PRED
f2:
004004f6  insn: 55
|5  {
004004f7  insn: 48 89 e5
004004fa  insn: 8b 05 2c 0b 20 00
|6  c = a / b;

Not supported for callchains currently, would need some
layout changes there.
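
For illustration only, here is a minimal sketch of the basic lookup a source-code printer of this kind needs: given a source file already loaded in memory, find the text of line N. The name find_source_line() and its signature are hypothetical; this is not the code from tools/perf/util/srccode.c, and it leaves out file caching and mapping an address to a file and line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the perf implementation: return a pointer to the
 * start of 1-based line `line` inside `buf` (length `len`) and store the
 * line length, excluding the trailing newline, in `*line_len`.
 * Returns NULL if the buffer has fewer lines than requested.
 */
static const char *find_source_line(const char *buf, size_t len,
				    unsigned int line, size_t *line_len)
{
	const char *p = buf;
	const char *end = buf + len;
	const char *nl;

	if (line == 0)
		return NULL;

	/* Skip the first (line - 1) lines. */
	while (--line > 0) {
		nl = memchr(p, '\n', (size_t)(end - p));
		if (!nl)
			return NULL;
		p = nl + 1;
	}
	if (p >= end)
		return NULL;

	/* The line ends at the next newline or at the end of the buffer. */
	nl = memchr(p, '\n', (size_t)(end - p));
	*line_len = nl ? (size_t)(nl - p) : (size_t)(end - p);
	return p;
}
```

A caller would print the result with something like printf("%u\t%.*s\n", line, (int)len, ptr), which mirrors the "line number, then source text" layout in the output above.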

Signed-off-by: Andi Kleen 
---
 tools/perf/Documentation/perf-script.txt |   2 +-
 tools/perf/builtin-script.c  |  47 +-
 tools/perf/util/Build|   1 +
 tools/perf/util/evsel_fprintf.c  |   1 +
 tools/perf/util/map.c|  49 ++
 tools/perf/util/map.h|  16 ++
 tools/perf/util/srccode.c| 186 +++
 tools/perf/util/srccode.h|   7 +
 tools/perf/util/srcline.c|  28 
 tools/perf/util/srcline.h|   1 +
 tools/perf/util/thread.c |   2 +
 tools/perf/util/thread.h |   2 +
 12 files changed, 339 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/util/srccode.c
 create mode 100644 tools/perf/util/srccode.h

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index a2b37ce48094..9e4def08d569 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@ OPTIONS
 Comma separated list of fields to print. Options are:
 comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
 srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
-brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc.
+brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode.
 Field list can be prepended with the type, trace, sw or hw,
 to indicate to which event type the field list applies.
 e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 04913136bac9..fa6f86b98c76 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -96,6 +96,7 @@ enum perf_output_field {
PERF_OUTPUT_UREGS   = 1U << 27,
PERF_OUTPUT_METRIC  = 1U << 28,
PERF_OUTPUT_MISC= 1U << 29,
+   PERF_OUTPUT_SRCCODE = 1U << 30,
 };
 
 struct output_option {
@@ -132,6 +133,7 @@ struct output_option {
{.str = "phys_addr", .field = PERF_OUTPUT_PHYS_ADDR},
{.str = "metric", .field = PERF_OUTPUT_METRIC},
{.str = "misc", .field = PERF_OUTPUT_MISC},
+   {.str = "srccode", .field = PERF_OUTPUT_SRCCODE},
 };
 
 enum {
@@ -424,7 +426,7 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
pr_err("Display of DSO requested but no address to convert.\n");
return -EINVAL;
}
-   if (PRINT_FIELD(SRCLINE) && 

[PATCH v2 04/24] lockdep tests: Run lockdep tests a second time under Valgrind

2018-12-03 Thread Bart Van Assche
This improves test coverage.

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 tools/lib/lockdep/run_tests.sh | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tools/lib/lockdep/run_tests.sh b/tools/lib/lockdep/run_tests.sh
index bc36178329a8..c8fbd0306960 100755
--- a/tools/lib/lockdep/run_tests.sh
+++ b/tools/lib/lockdep/run_tests.sh
@@ -31,3 +31,17 @@ find tests -name '*.c' | sort | while read -r i; do
fi
rm -f "tests/$testname"
 done
+
+find tests -name '*.c' | sort | while read -r i; do
+   testname=$(basename "$i" .c)
+   echo -ne "(PRELOAD + Valgrind) $testname... "
+   if gcc -o "tests/$testname" -pthread -Iinclude "$i" &&
+   { timeout 10 valgrind --read-var-info=yes ./lockdep "./tests/$testname" >& "tests/${testname}.vg.out"; true; } &&
+   "tests/${testname}.sh" < "tests/${testname}.vg.out" &&
+   ! grep -Eq '(^==[0-9]*== (Invalid |Uninitialised ))|Mismatched free|Source and destination overlap| UME ' "tests/${testname}.vg.out"; then
+   echo "PASSED!"
+   else
+   echo "FAILED!"
+   fi
+   rm -f "tests/$testname"
+done
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 00/24] locking/lockdep: Add support for dynamic keys

2018-12-03 Thread Bart Van Assche
Hi Ingo and Peter,

A known shortcoming of the current lockdep implementation is that it requires
lock keys to be allocated statically. This forces certain unrelated
synchronization objects to share keys and this key sharing can cause false
positive deadlock reports. This patch series adds support for dynamic keys in
the lockdep code and eliminates a class of false positive reports from the
workqueue implementation.

The changes compared to v1 are:
- Addressed Peter's review comments: remove the list_head that I had added
  to struct lock_list again, replaced all_list_entries and free_list_entries
  by two bitmaps, use call_rcu() to free lockdep objects, add a BUILD_BUG_ON()
  that compares the size of struct lock_class_key and raw_spinlock_t.
- Addressed the "unknown symbol" errors reported by the build bot by adding a
  few #ifdef / #endif directives. Addressed the 32-bit warnings by using %d
  instead of %ld for array indices and by casting the array indices to
  unsigned int.
- Removed several WARN_ON_ONCE(!class->hash_entry.pprev) statements since
  these duplicate the code in check_data_structures().
- Left out the patch that causes lockdep to complain if no name has been
  assigned to a lock object. That patch causes the build bot to complain
  about certain lock objects, and I have not yet had the time to identify
  these lock objects.
  
Bart.

Bart Van Assche (24):
  lockdep tests: Display compiler warning and error messages
  lockdep tests: Fix shellcheck warnings
  lockdep tests: Improve testing accuracy
  lockdep tests: Run lockdep tests a second time under Valgrind
  liblockdep: Rename "trywlock" into "trywrlock"
  liblockdep: Add dummy print_irqtrace_events() implementation
  lockdep tests: Test the lockdep_reset_lock() implementation
  locking/lockdep: Declare local symbols static
  locking/lockdep: Inline __lockdep_init_map()
  locking/lockdep: Introduce lock_class_cache_is_registered()
  locking/lockdep: Remove a superfluous INIT_LIST_HEAD() statement
  locking/lockdep: Make concurrent lockdep_reset_lock() calls safe
  locking/lockdep: Stop using RCU primitives to access all_lock_classes
  locking/lockdep: Make zap_class() remove all matching lock order
entries
  locking/lockdep: Reorder struct lock_class members
  locking/lockdep: Retain the class key and name while freeing a lock
class
  locking/lockdep: Free lock classes that are no longer in use
  locking/lockdep: Reuse list entries that are no longer in use
  locking/lockdep: Check data structure consistency
  locking/lockdep: Introduce __lockdep_free_key_range()
  locking/lockdep: Verify whether lock objects are small enough to be
used as class keys
  locking/lockdep: Add support for dynamic keys
  kernel/workqueue: Use dynamic lockdep keys for workqueues
  lockdep tests: Test dynamic key registration

 include/linux/lockdep.h   |  37 +-
 include/linux/workqueue.h |  28 +-
 kernel/locking/lockdep.c  | 660 +++---
 kernel/workqueue.c|  60 +-
 tools/lib/lockdep/include/liblockdep/common.h |   3 +
 tools/lib/lockdep/include/liblockdep/mutex.h  |  12 +-
 tools/lib/lockdep/include/liblockdep/rwlock.h |   6 +-
 tools/lib/lockdep/lockdep.c   |   5 +
 tools/lib/lockdep/run_tests.sh|  39 +-
 tools/lib/lockdep/tests/AA.sh |   2 +
 tools/lib/lockdep/tests/ABA.sh|   2 +
 tools/lib/lockdep/tests/ABBA.c|  12 +
 tools/lib/lockdep/tests/ABBA.sh   |   2 +
 tools/lib/lockdep/tests/ABBA_2threads.sh  |   2 +
 tools/lib/lockdep/tests/ABBCCA.c  |   4 +
 tools/lib/lockdep/tests/ABBCCA.sh |   2 +
 tools/lib/lockdep/tests/ABBCCDDA.c|   5 +
 tools/lib/lockdep/tests/ABBCCDDA.sh   |   2 +
 tools/lib/lockdep/tests/ABCABC.c  |   4 +
 tools/lib/lockdep/tests/ABCABC.sh |   2 +
 tools/lib/lockdep/tests/ABCDBCDA.c|   5 +
 tools/lib/lockdep/tests/ABCDBCDA.sh   |   2 +
 tools/lib/lockdep/tests/ABCDBDDA.c|   5 +
 tools/lib/lockdep/tests/ABCDBDDA.sh   |   2 +
 tools/lib/lockdep/tests/WW.sh |   2 +
 tools/lib/lockdep/tests/unlock_balance.c  |   2 +
 tools/lib/lockdep/tests/unlock_balance.sh |   2 +
 27 files changed, 744 insertions(+), 165 deletions(-)
 create mode 100755 tools/lib/lockdep/tests/AA.sh
 create mode 100755 tools/lib/lockdep/tests/ABA.sh
 create mode 100755 tools/lib/lockdep/tests/ABBA.sh
 create mode 100755 tools/lib/lockdep/tests/ABBA_2threads.sh
 create mode 100755 tools/lib/lockdep/tests/ABBCCA.sh
 create mode 100755 tools/lib/lockdep/tests/ABBCCDDA.sh
 create mode 100755 tools/lib/lockdep/tests/ABCABC.sh
 create mode 100755 tools/lib/lockdep/tests/ABCDBCDA.sh
 create mode 100755 tools/lib/lockdep/tests/ABCDBDDA.sh
 create mode 100755 tools/lib/lockdep/tests/WW.sh
 create mode 100755 

[PATCH v2 14/24] locking/lockdep: Make zap_class() remove all matching lock order entries

2018-12-03 Thread Bart Van Assche
Make sure that all lock order entries that refer to a class are removed
from the list_entries[] array when a kernel module is unloaded.
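
A minimal userspace sketch of the idea, under simplifying assumptions: each dependency entry records both the class it refers to and the class whose list it sits on, so removing a class has to match on either field. All names (edge_demo_*) are hypothetical, and an in_use flag stands in for the kernel's list removal.

```c
#include <stdbool.h>
#include <stddef.h>

struct edge_demo_class {
	const char *name;
};

/* One dependency entry: `cls` is stored on a list owned by `links_to`. */
struct edge_demo_entry {
	struct edge_demo_class *cls;
	struct edge_demo_class *links_to;
	bool in_use;
};

/*
 * Drop every entry that mentions `victim` on either side and return how
 * many were dropped. Matching on `cls` alone would miss the entries that
 * live on the victim's own dependency lists.
 */
static size_t edge_demo_zap(struct edge_demo_entry *entries, size_t n,
			    const struct edge_demo_class *victim)
{
	size_t removed = 0;

	for (size_t i = 0; i < n; i++) {
		struct edge_demo_entry *e = &entries[i];

		if (!e->in_use)
			continue;
		if (e->cls != victim && e->links_to != victim)
			continue;
		e->in_use = false;
		removed++;
	}
	return removed;
}
```

With entries B-on-A, A-on-B, C-on-B and B-on-C, zapping A removes the first two and leaves the B/C pair alone.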

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 include/linux/lockdep.h  |  1 +
 kernel/locking/lockdep.c | 17 +++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 1fd82ff99c65..6d0f8d1c2bee 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -180,6 +180,7 @@ static inline void lockdep_copy_map(struct lockdep_map *to,
 struct lock_list {
struct list_headentry;
struct lock_class   *class;
+   struct lock_class   *links_to;
struct stack_trace  trace;
int distance;
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5c837a537273..ecd92969674c 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -859,7 +859,8 @@ static struct lock_list *alloc_list_entry(void)
 /*
  * Add a new dependency to the head of the list:
  */
-static int add_lock_to_list(struct lock_class *this, struct list_head *head,
+static int add_lock_to_list(struct lock_class *this,
+   struct lock_class *links_to, struct list_head *head,
unsigned long ip, int distance,
struct stack_trace *trace)
 {
@@ -872,7 +873,9 @@ static int add_lock_to_list(struct lock_class *this, struct list_head *head,
if (!entry)
return 0;
 
+   WARN_ON_ONCE(this == links_to);
entry->class = this;
+   entry->links_to = links_to;
entry->distance = distance;
entry->trace = *trace;
/*
@@ -1918,14 +1921,14 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 * Ok, all validations passed, add the new lock
 * to the previous lock's dependency list:
 */
-   ret = add_lock_to_list(hlock_class(next),
+   ret = add_lock_to_list(hlock_class(next), hlock_class(prev),
                       &hlock_class(prev)->locks_after,
   next->acquire_ip, distance, trace);
 
if (!ret)
return 0;
 
-   ret = add_lock_to_list(hlock_class(prev),
+   ret = add_lock_to_list(hlock_class(prev), hlock_class(next),
                       &hlock_class(next)->locks_before,
   next->acquire_ip, distance, trace);
if (!ret)
@@ -4128,15 +4131,17 @@ void lockdep_reset(void)
  */
 static void zap_class(struct lock_class *class)
 {
+   struct lock_list *entry;
int i;
 
/*
 * Remove all dependencies this lock is
 * involved in:
 */
-   for (i = 0; i < nr_list_entries; i++) {
-   if (list_entries[i].class == class)
-   list_del_rcu(&list_entries[i].entry);
+   for (i = 0, entry = list_entries; i < nr_list_entries; i++, entry++) {
+   if (entry->class != class && entry->links_to != class)
+   continue;
+   list_del_rcu(&entry->entry);
}
/*
 * Unhash the class and remove it from the all_lock_classes list:
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 08/24] locking/lockdep: Declare local symbols static

2018-12-03 Thread Bart Van Assche
This patch prevents sparse from complaining about a missing declaration for
the lock_classes array.

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 kernel/locking/lockdep.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 1efada2dd9dd..7434a00b2b2f 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -138,6 +138,9 @@ static struct lock_list list_entries[MAX_LOCKDEP_ENTRIES];
  * get freed - this significantly simplifies the debugging code.
  */
 unsigned long nr_lock_classes;
+#ifndef CONFIG_DEBUG_LOCKDEP
+static
+#endif
 struct lock_class lock_classes[MAX_LOCKDEP_KEYS];
 
 static inline struct lock_class *hlock_class(struct held_lock *hlock)
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 12/24] locking/lockdep: Make concurrent lockdep_reset_lock() calls safe

2018-12-03 Thread Bart Van Assche
Since zap_class() removes items from the all_lock_classes list and the
classhash_table, protect all zap_class() calls with the graph lock.

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 kernel/locking/lockdep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 346b5a1fd062..e78623819184 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4229,6 +4229,7 @@ void lockdep_reset_lock(struct lockdep_map *lock)
int j, locked;
 
raw_local_irq_save(flags);
+   locked = graph_lock();
 
/*
 * Remove all classes this lock might have:
@@ -4245,7 +4246,6 @@ void lockdep_reset_lock(struct lockdep_map *lock)
 * Debug check: in the end all mapped classes should
 * be gone.
 */
-   locked = graph_lock();
if (unlikely(lock_class_cache_is_registered(lock))) {
if (debug_locks_off_graph_unlock()) {
/*
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 02/24] lockdep tests: Fix shellcheck warnings

2018-12-03 Thread Bart Van Assche
Use find instead of ls to avoid splitting filenames that contain spaces.
Use rm -f instead of if ... then rm ...; fi. This patch addresses all
shellcheck complaints about the run_tests.sh shell script.

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 tools/lib/lockdep/run_tests.sh | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/tools/lib/lockdep/run_tests.sh b/tools/lib/lockdep/run_tests.sh
index 9f31f84e7fac..253719ee6377 100755
--- a/tools/lib/lockdep/run_tests.sh
+++ b/tools/lib/lockdep/run_tests.sh
@@ -7,7 +7,7 @@ if ! make >/dev/null; then
 exit 1
 fi
 
-for i in `ls tests/*.c`; do
+find tests -name '*.c' | sort | while read -r i; do
testname=$(basename "$i" .c)
echo -ne "$testname... "
    if gcc -o "tests/$testname" -pthread "$i" liblockdep.a -Iinclude -D__USE_LIBLOCKDEP &&
@@ -16,12 +16,10 @@ for i in `ls tests/*.c`; do
else
echo "FAILED!"
fi
-   if [ -f "tests/$testname" ]; then
-   rm tests/$testname
-   fi
+   rm -f "tests/$testname"
 done
 
-for i in `ls tests/*.c`; do
+find tests -name '*.c' | sort | while read -r i; do
testname=$(basename "$i" .c)
echo -ne "(PRELOAD) $testname... "
if gcc -o "tests/$testname" -pthread -Iinclude "$i" &&
@@ -30,7 +28,5 @@ for i in `ls tests/*.c`; do
else
echo "FAILED!"
fi
-   if [ -f "tests/$testname" ]; then
-   rm tests/$testname
-   fi
+   rm -f "tests/$testname"
 done
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 17/24] locking/lockdep: Free lock classes that are no longer in use

2018-12-03 Thread Bart Van Assche
Instead of leaving lock classes that are no longer in use in the
lock_classes array, reuse entries from that array that are no longer
in use. Maintain a linked list of free lock classes with list head
'free_lock_classes'. Initialize that list from inside register_lock_class()
instead of from inside lockdep_init() because register_lock_class() can
be called before lockdep_init() has been called. Only add freed lock
classes to the free_lock_classes list after a grace period, so that
a lock_classes[] element cannot be reused while an RCU reader is still
accessing it.
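
The allocation scheme can be sketched in userspace as follows. This is a simplified model, not the patch itself: the pool_* names are hypothetical, a singly linked list replaces struct list_head, and the RCU grace period is not modeled because the sketch has no concurrent readers.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_MAX_CLASSES 4

struct pool_class {
	struct pool_class *next_free;	/* link in the free list */
	const void *key;
	bool in_use;
};

static struct pool_class pool_classes[POOL_MAX_CLASSES];
static struct pool_class *pool_free_list;
static bool pool_initialized;

/* Put every array element on the free list, element 0 first. */
static void pool_init(void)
{
	for (size_t i = POOL_MAX_CLASSES; i-- > 0; ) {
		pool_classes[i].next_free = pool_free_list;
		pool_free_list = &pool_classes[i];
	}
}

/* Take an element off the free list; initialize lazily on first use. */
static struct pool_class *pool_alloc(const void *key)
{
	struct pool_class *c;

	if (!pool_initialized) {
		pool_initialized = true;
		pool_init();
	}
	c = pool_free_list;
	if (!c)
		return NULL;	/* every element is in use */
	pool_free_list = c->next_free;
	c->next_free = NULL;
	c->key = key;
	c->in_use = true;
	return c;
}

/*
 * Return an element to the free list. Code with RCU readers would defer
 * this step until after a grace period; this sketch does it immediately.
 */
static void pool_free(struct pool_class *c)
{
	c->key = NULL;
	c->in_use = false;
	c->next_free = pool_free_list;
	pool_free_list = c;
}
```

The lazy initialization in pool_alloc() corresponds to the reason given above: the first allocation may happen before any explicit init function has run.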

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 include/linux/lockdep.h  |   9 +-
 kernel/locking/lockdep.c | 237 ---
 2 files changed, 205 insertions(+), 41 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 9421f028c26c..02a1469c46e1 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -63,7 +63,8 @@ extern struct lock_class_key __lockdep_no_validate__;
 #define LOCKSTAT_POINTS4
 
 /*
- * The lock-class itself:
+ * The lock-class itself. The order of the structure members matters.
+ * reinit_class() zeroes the key member and all subsequent members.
  */
 struct lock_class {
/*
@@ -72,7 +73,9 @@ struct lock_class {
struct hlist_node   hash_entry;
 
/*
-* global list of all lock-classes:
+* Entry in all_lock_classes when in use. Entry in free_lock_classes
+* when not in use. Instances that are being freed are briefly on
+* neither list.
 */
struct list_headlock_entry;
 
@@ -106,7 +109,7 @@ struct lock_class {
unsigned long   contention_point[LOCKSTAT_POINTS];
unsigned long   contending_point[LOCKSTAT_POINTS];
 #endif
-};
+} __no_randomize_layout;
 
 #ifdef CONFIG_LOCK_STAT
 struct lock_time {
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 92bdb187987f..d907d8bfefdf 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -134,8 +134,8 @@ static struct lock_list list_entries[MAX_LOCKDEP_ENTRIES];
 /*
  * All data structures here are protected by the global debug_lock.
  *
- * Mutex key structs only get allocated, once during bootup, and never
- * get freed - this significantly simplifies the debugging code.
+ * nr_lock_classes is the number of elements of lock_classes[] that is
+ * in use.
  */
 unsigned long nr_lock_classes;
 #ifndef CONFIG_DEBUG_LOCKDEP
@@ -277,11 +277,18 @@ static inline void lock_release_holdtime(struct held_lock *hlock)
 #endif
 
 /*
- * We keep a global list of all lock classes. The list only grows,
- * never shrinks. The list is only accessed with the lockdep
- * spinlock lock held.
+ * We keep a global list of all lock classes. The list is only accessed with
+ * the lockdep spinlock lock held. The zapped_classes list contains lock
+ * classes that are about to be freed but that may still be accessed by an RCU
+ * reader. free_lock_classes is a list with free elements. These elements are
+ * linked together by the lock_entry member in struct lock_class.
  */
 LIST_HEAD(all_lock_classes);
+static LIST_HEAD(zapped_classes);
+static LIST_HEAD(free_lock_classes);
+static bool initialization_happened;
+static bool rcu_callback_scheduled;
+static struct rcu_head free_zapped_classes_rcu_head;
 
 /*
  * The lockdep classes are in a hash-table as well, for fast lookup:
@@ -735,6 +742,21 @@ static bool assign_lock_key(struct lockdep_map *lock)
return true;
 }
 
+/*
+ * Initialize the lock_classes[] array elements and also the free_lock_classes
+ * list.
+ */
+static void init_data_structures(void)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(lock_classes); i++) {
+   list_add_tail(&lock_classes[i].lock_entry, &free_lock_classes);
+   INIT_LIST_HEAD(&lock_classes[i].locks_after);
+   INIT_LIST_HEAD(&lock_classes[i].locks_before);
+   }
+}
+
 /*
  * Register a lock's class in the hash-table, if the class is not present
  * yet. Otherwise we look it up. We cache the result in the lock object
@@ -775,11 +797,14 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
goto out_unlock_set;
}
 
-   /*
-* Allocate a new key from the static array, and add it to
-* the hash:
-*/
-   if (nr_lock_classes >= MAX_LOCKDEP_KEYS) {
+   /* Allocate a new lock class and add it to the hash. */
+   if (unlikely(!initialization_happened)) {
+   initialization_happened = true;
+   init_data_structures();
+   }
+   class = list_first_entry_or_null(&free_lock_classes, typeof(*class),
+lock_entry);
+   if (!class) {
if (!debug_locks_off_graph_unlock()) {
return NULL;
}
@@ 

[PATCH v2 16/24] locking/lockdep: Retain the class key and name while freeing a lock class

2018-12-03 Thread Bart Van Assche
The next patch in this series uses the class name in code that
detects lock class use-after-free. Hence retain the class name for
lock classes that are being freed.

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 kernel/locking/lockdep.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index ecd92969674c..92bdb187987f 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4147,10 +4147,8 @@ static void zap_class(struct lock_class *class)
 * Unhash the class and remove it from the all_lock_classes list:
 */
    hlist_del_rcu(&class->hash_entry);
+   class->hash_entry.pprev = NULL;
    list_del(&class->lock_entry);
-
-   RCU_INIT_POINTER(class->key, NULL);
-   RCU_INIT_POINTER(class->name, NULL);
 }
 
 static inline int within(const void *addr, void *start, unsigned long size)
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 05/24] liblockdep: Rename "trywlock" into "trywrlock"

2018-12-03 Thread Bart Van Assche
This patch fixes the following compiler warning, which is reported while
compiling the lockdep unit tests:

include/liblockdep/rwlock.h: In function 'liblockdep_pthread_rwlock_trywlock':
include/liblockdep/rwlock.h:66:9: warning: implicit declaration of function 'pthread_rwlock_trywlock'; did you mean 'pthread_rwlock_trywrlock'? [-Wimplicit-function-declaration]
  return pthread_rwlock_trywlock(&lock->rwlock) == 0 ? 1 : 0;
         ^~~~~~~~~~~~~~~~~~~~~~~
         pthread_rwlock_trywrlock

Fixes: 5a52c9b480e0 ("liblockdep: Add public headers for pthread_rwlock_t implementation")
Cc: Sasha Levin 
Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 tools/lib/lockdep/include/liblockdep/rwlock.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/lib/lockdep/include/liblockdep/rwlock.h b/tools/lib/lockdep/include/liblockdep/rwlock.h
index a96c3bf0fef1..365762e3a1ea 100644
--- a/tools/lib/lockdep/include/liblockdep/rwlock.h
+++ b/tools/lib/lockdep/include/liblockdep/rwlock.h
@@ -60,10 +60,10 @@ static inline int liblockdep_pthread_rwlock_tryrdlock(liblockdep_pthread_rwlock_
    return pthread_rwlock_tryrdlock(&lock->rwlock) == 0 ? 1 : 0;
 }
 
-static inline int liblockdep_pthread_rwlock_trywlock(liblockdep_pthread_rwlock_t *lock)
+static inline int liblockdep_pthread_rwlock_trywrlock(liblockdep_pthread_rwlock_t *lock)
 {
    lock_acquire(&lock->dep_map, 0, 1, 0, 1, NULL, (unsigned long)_RET_IP_);
-   return pthread_rwlock_trywlock(&lock->rwlock) == 0 ? 1 : 0;
+   return pthread_rwlock_trywrlock(&lock->rwlock) == 0 ? 1 : 0;
 }
 
 static inline int liblockdep_rwlock_destroy(liblockdep_pthread_rwlock_t *lock)
@@ -79,7 +79,7 @@ static inline int liblockdep_rwlock_destroy(liblockdep_pthread_rwlock_t *lock)
 #define pthread_rwlock_unlock  liblockdep_pthread_rwlock_unlock
 #define pthread_rwlock_wrlock  liblockdep_pthread_rwlock_wrlock
 #define pthread_rwlock_tryrdlock   liblockdep_pthread_rwlock_tryrdlock
-#define pthread_rwlock_trywlock    liblockdep_pthread_rwlock_trywlock
+#define pthread_rwlock_trywrlock   liblockdep_pthread_rwlock_trywrlock
 #define pthread_rwlock_destroy liblockdep_rwlock_destroy
 
 #endif
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 22/24] locking/lockdep: Add support for dynamic keys

2018-12-03 Thread Bart Van Assche
A shortcoming of the current lockdep implementation is that it requires
lock keys to be allocated statically. That forces certain lock objects
to share lock keys. Since lock dependency analysis groups lock objects
per key, sharing lock keys can cause false positive lockdep reports.
Make it possible to avoid such false positive reports by allowing lock
keys to be allocated dynamically. Require that dynamically allocated
lock keys are registered before use by calling lockdep_register_key().
Complain about attempts to register the same lock key pointer twice
without calling lockdep_unregister_key() between successive
registration calls.

The purpose of the new lock_keys_hash[] data structure that keeps
track of all dynamic keys is twofold:
- Verify whether the lockdep_register_key() and lockdep_unregister_key()
  functions are used correctly.
- Prevent lockdep_init_map() from complaining when it encounters a
  dynamically allocated key.
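
As a rough userspace model of such a registry (a sketch under stated assumptions, not the kernel code): keys are hashed by address into a small table, registration fails for a key that is already present, and lookup tells whether a key was registered. The keyreg_* names, the table size and the hash function are all made up for illustration, and the locking and RCU handling of the real code are omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KEYREG_BITS 4
#define KEYREG_SIZE (1u << KEYREG_BITS)

/* Embedded in each dynamically allocated key, like a hash list node. */
struct keyreg_key {
	struct keyreg_key *next;
};

static struct keyreg_key *keyreg_buckets[KEYREG_SIZE];

/* Map a key address to its bucket (simple multiplicative hash). */
static struct keyreg_key **keyreg_bucket(const struct keyreg_key *key)
{
	uintptr_t h = (uintptr_t)key * (uintptr_t)2654435761u;

	return &keyreg_buckets[(h >> 8) & (KEYREG_SIZE - 1)];
}

/* Has this key been registered and not yet unregistered? */
static bool keyreg_is_registered(const struct keyreg_key *key)
{
	for (struct keyreg_key *k = *keyreg_bucket(key); k; k = k->next)
		if (k == key)
			return true;
	return false;
}

/* Returns false on double registration, where kernel code would warn. */
static bool keyreg_register(struct keyreg_key *key)
{
	struct keyreg_key **head = keyreg_bucket(key);

	if (keyreg_is_registered(key))
		return false;
	key->next = *head;
	*head = key;
	return true;
}

/* Returns false if the key was not registered. */
static bool keyreg_unregister(struct keyreg_key *key)
{
	struct keyreg_key **pp;

	for (pp = keyreg_bucket(key); *pp; pp = &(*pp)->next) {
		if (*pp == key) {
			*pp = key->next;
			key->next = NULL;
			return true;
		}
	}
	return false;
}
```

Registering the same key twice without an unregister in between returns false on the second call, which is the misuse the first bullet above is meant to catch.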

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 include/linux/lockdep.h  |  13 -
 kernel/locking/lockdep.c | 120 ---
 2 files changed, 122 insertions(+), 11 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 02a1469c46e1..ea09048b6e1c 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -46,15 +46,19 @@ extern int lock_stat;
 #define NR_LOCKDEP_CACHING_CLASSES 2
 
 /*
- * Lock-classes are keyed via unique addresses, by embedding the
- * lockclass-key into the kernel (or module) .data section. (For
- * static locks we use the lock address itself as the key.)
+ * A lockdep key is associated with each lock object. For static locks we use
+ * the lock address itself as the key. Dynamically allocated lock objects can
+ * have a statically or dynamically allocated key. Dynamically allocated lock
+ * keys must be registered before being used and must be unregistered before
+ * the key memory is freed.
  */
 struct lockdep_subclass_key {
char __one_byte;
 } __attribute__ ((__packed__));
 
+/* hash_entry is used to keep track of dynamically allocated keys. */
 struct lock_class_key {
+   struct hlist_node   hash_entry;
struct lockdep_subclass_key subkeys[MAX_LOCKDEP_SUBCLASSES];
 };
 
@@ -274,6 +278,9 @@ extern asmlinkage void lockdep_sys_exit(void);
 extern void lockdep_off(void);
 extern void lockdep_on(void);
 
+extern void lockdep_register_key(struct lock_class_key *key);
+extern void lockdep_unregister_key(struct lock_class_key *key);
+
 /*
  * These methods are used by specific locking variants (spinlocks,
  * rwlocks, mutexes and rwsems) to pass init/acquire/release events
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index b4772e5fc176..d8ceed4f2900 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -141,6 +141,9 @@ static DECLARE_BITMAP(list_entries_being_freed, MAX_LOCKDEP_ENTRIES);
  * nr_lock_classes is the number of elements of lock_classes[] that is
  * in use.
  */
+#define KEYHASH_BITS   (MAX_LOCKDEP_KEYS_BITS - 1)
+#define KEYHASH_SIZE   (1UL << KEYHASH_BITS)
+static struct hlist_head lock_keys_hash[KEYHASH_SIZE];
 unsigned long nr_lock_classes;
 #ifndef CONFIG_DEBUG_LOCKDEP
 static
@@ -610,7 +613,7 @@ static int very_verbose(struct lock_class *class)
  * Is this the address of a static object:
  */
 #ifdef __KERNEL__
-static int static_obj(void *obj)
+static int static_obj(const void *obj)
 {
unsigned long start = (unsigned long) &_stext,
  end   = (unsigned long) &_end,
@@ -912,6 +915,70 @@ static void init_data_structures(void)
}
 }
 
+static inline struct hlist_head *keyhashentry(const struct lock_class_key *key)
+{
+   unsigned long hash = hash_long((uintptr_t)key, KEYHASH_BITS);
+
+   return lock_keys_hash + hash;
+}
+
+/*
+ * Register a dynamically allocated key.
+ */
+void lockdep_register_key(struct lock_class_key *key)
+{
+   struct hlist_head *hash_head;
+   struct lock_class_key *k;
+   unsigned long flags;
+
+   if (WARN_ON_ONCE(static_obj(key)))
+   return;
+   hash_head = keyhashentry(key);
+   raw_local_irq_save(flags);
+   if (!graph_lock())
+   goto restore_irqs;
+   hlist_for_each_entry_rcu(k, hash_head, hash_entry) {
+   if (WARN_ON_ONCE(k == key))
+   goto out_unlock;
+   }
+   hlist_add_head_rcu(&key->hash_entry, hash_head);
+out_unlock:
+   graph_unlock();
+restore_irqs:
+   raw_local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(lockdep_register_key);
+
+/*
+ * Check whether a key has been registered as a dynamic key. Must not be called
+ * from interrupt context.
+ */
+static bool is_dynamic_key(const struct lock_class_key *key)
+{
+   struct hlist_head *hash_head;
+   struct lock_class_key *k;
+   unsigned long flags;
+   bool found = false;
+
+   if 

[PATCH v2 23/24] kernel/workqueue: Use dynamic lockdep keys for workqueues

2018-12-03 Thread Bart Van Assche
Commit 87915adc3f0a ("workqueue: re-add lockdep dependencies for flushing")
improved deadlock checking in the workqueue implementation. Unfortunately
that patch also introduced a few false positive lockdep complaints. This
patch suppresses these false positives by allocating the workqueue mutex
lockdep key dynamically. An example of a false positive lockdep complaint
suppressed by this patch can be found below. The root cause of the
lockdep complaint shown below is that the direct I/O code can call
alloc_workqueue() from inside a work item created by another
alloc_workqueue() call and that both workqueues share the same lockdep
key. This patch prevents that lockdep complaint from being triggered by
allocating the workqueue lockdep keys dynamically. In other words, this
patch guarantees that a unique lockdep key is associated with each
workqueue mutex.
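
The effect of key sharing can be modeled in a few lines of userspace C. This is only a sketch: the wqdemo_* names are hypothetical, and "same class" is reduced to "same key address", which is how a checker that groups lock objects per key sees them.

```c
struct wqdemo_key {
	char unused;
};

struct wqdemo_wq {
	struct wqdemo_key own_key;	/* per-workqueue key (dynamic scheme) */
	const struct wqdemo_key *key;	/* key used for dependency tracking */
};

/* Shared scheme: every workqueue set up here uses one static key. */
static void wqdemo_init_shared(struct wqdemo_wq *wq)
{
	static struct wqdemo_key shared_key;

	wq->key = &shared_key;
}

/* Dynamic scheme: each workqueue carries its own key. */
static void wqdemo_init_dynamic(struct wqdemo_wq *wq)
{
	wq->key = &wq->own_key;
}

/* A checker that groups by key cannot tell same-key objects apart. */
static int wqdemo_same_class(const struct wqdemo_wq *a,
			     const struct wqdemo_wq *b)
{
	return a->key == b->key;
}
```

With the shared scheme an outer and an inner workqueue fall into one class, so flushing one from a work item of the other looks like a cycle; with per-object keys they are distinct classes and no such cycle is recorded.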

======================================================
WARNING: possible circular locking dependency detected
4.19.0-dbg+ #1 Not tainted
------------------------------------------------------
fio/4129 is trying to acquire lock:
a01cfe1a ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: flush_workqueue+0xd0/0x970

but task is already holding lock:
a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sb->s_type->i_mutex_key#14){+.+.}:
   down_write+0x3d/0x80
   __generic_file_fsync+0x77/0xf0
   ext4_sync_file+0x3c9/0x780
   vfs_fsync_range+0x66/0x100
   dio_complete+0x2f5/0x360
   dio_aio_complete_work+0x1c/0x20
   process_one_work+0x481/0x9f0
   worker_thread+0x63/0x5a0
   kthread+0x1cf/0x1f0
   ret_from_fork+0x24/0x30

-> #1 ((work_completion)(&dio->complete_work)){+.+.}:
   process_one_work+0x447/0x9f0
   worker_thread+0x63/0x5a0
   kthread+0x1cf/0x1f0
   ret_from_fork+0x24/0x30

-> #0 ((wq_completion)"dio/%s"sb->s_id){+.+.}:
   lock_acquire+0xc5/0x200
   flush_workqueue+0xf3/0x970
   drain_workqueue+0xec/0x220
   destroy_workqueue+0x23/0x350
   sb_init_dio_done_wq+0x6a/0x80
   do_blockdev_direct_IO+0x1f33/0x4be0
   __blockdev_direct_IO+0x79/0x86
   ext4_direct_IO+0x5df/0xbb0
   generic_file_direct_write+0x119/0x220
   __generic_file_write_iter+0x131/0x2d0
   ext4_file_write_iter+0x3fa/0x710
   aio_write+0x235/0x330
   io_submit_one+0x510/0xeb0
   __x64_sys_io_submit+0x122/0x340
   do_syscall_64+0x71/0x220
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  (wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work) --> &sb->s_type->i_mutex_key#14

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sb->s_type->i_mutex_key#14);
                               lock((work_completion)(&dio->complete_work));
                               lock(&sb->s_type->i_mutex_key#14);
  lock((wq_completion)"dio/%s"sb->s_id);

 *** DEADLOCK ***

1 lock held by fio/4129:
 #0: a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

stack backtrace:
CPU: 3 PID: 4129 Comm: fio Not tainted 4.19.0-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 dump_stack+0x86/0xc5
 print_circular_bug.isra.32+0x20a/0x218
 __lock_acquire+0x1c68/0x1cf0
 lock_acquire+0xc5/0x200
 flush_workqueue+0xf3/0x970
 drain_workqueue+0xec/0x220
 destroy_workqueue+0x23/0x350
 sb_init_dio_done_wq+0x6a/0x80
 do_blockdev_direct_IO+0x1f33/0x4be0
 __blockdev_direct_IO+0x79/0x86
 ext4_direct_IO+0x5df/0xbb0
 generic_file_direct_write+0x119/0x220
 __generic_file_write_iter+0x131/0x2d0
 ext4_file_write_iter+0x3fa/0x710
 aio_write+0x235/0x330
 io_submit_one+0x510/0xeb0
 __x64_sys_io_submit+0x122/0x340
 do_syscall_64+0x71/0x220
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Will Deacon 
Cc: Tejun Heo 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 include/linux/workqueue.h | 28 +++---
 kernel/workqueue.c| 60 +--
 2 files changed, 55 insertions(+), 33 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 60d673e15632..d9a1a480e920 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -390,43 +390,23 @@ extern struct workqueue_struct *system_freezable_wq;
 extern struct workqueue_struct *system_power_efficient_wq;
 extern struct workqueue_struct *system_freezable_power_efficient_wq;
 
-extern struct workqueue_struct *
-__alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
-   struct lock_class_key *key, const char *lock_name, ...) __printf(1, 6);
-
 /**
  * alloc_workqueue - allocate a workqueue
  * @fmt: printf format for the name of the workqueue
  * @flags: WQ_* flags
  * 

[PATCH v2 21/24] locking/lockdep: Verify whether lock objects are small enough to be used as class keys

2018-12-03 Thread Bart Van Assche
Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 kernel/locking/lockdep.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c936fce5b9d7..b4772e5fc176 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -727,6 +727,15 @@ static bool assign_lock_key(struct lockdep_map *lock)
 {
unsigned long can_addr, addr = (unsigned long)lock;
 
+   /*
+* lockdep_free_key_range() assumes that struct lock_class_key
+* objects do not overlap. Since we use the address of lock
+* objects as class key for static objects, check whether the
+* size of lock_class_key objects does not exceed the size of
+* the smallest lock object.
+*/
+   BUILD_BUG_ON(sizeof(struct lock_class_key) > sizeof(raw_spinlock_t));
+
 if (__is_kernel_percpu_address(addr, &can_addr))
 lock->key = (void *)can_addr;
 else if (__is_module_percpu_address(addr, &can_addr))
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 24/24] lockdep tests: Test dynamic key registration

2018-12-03 Thread Bart Van Assche
Make sure that the lockdep_register_key() and lockdep_unregister_key()
code is tested when running the lockdep tests.

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 tools/lib/lockdep/include/liblockdep/common.h |  2 ++
 tools/lib/lockdep/include/liblockdep/mutex.h  | 11 ++-
 tools/lib/lockdep/tests/ABBA.c|  9 +
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/tools/lib/lockdep/include/liblockdep/common.h 
b/tools/lib/lockdep/include/liblockdep/common.h
index d640a9761f09..a81d91d4fc78 100644
--- a/tools/lib/lockdep/include/liblockdep/common.h
+++ b/tools/lib/lockdep/include/liblockdep/common.h
@@ -45,6 +45,8 @@ void lock_acquire(struct lockdep_map *lock, unsigned int 
subclass,
 void lock_release(struct lockdep_map *lock, int nested,
unsigned long ip);
 void lockdep_reset_lock(struct lockdep_map *lock);
+void lockdep_register_key(struct lock_class_key *key);
+void lockdep_unregister_key(struct lock_class_key *key);
 extern void debug_check_no_locks_freed(const void *from, unsigned long len);
 
 #define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
diff --git a/tools/lib/lockdep/include/liblockdep/mutex.h 
b/tools/lib/lockdep/include/liblockdep/mutex.h
index 2073d4e1f2f0..783dd0df06f9 100644
--- a/tools/lib/lockdep/include/liblockdep/mutex.h
+++ b/tools/lib/lockdep/include/liblockdep/mutex.h
@@ -7,6 +7,7 @@
 
 struct liblockdep_pthread_mutex {
pthread_mutex_t mutex;
+   struct lock_class_key key;
struct lockdep_map dep_map;
 };
 
@@ -27,11 +28,10 @@ static inline int __mutex_init(liblockdep_pthread_mutex_t *lock,
 return pthread_mutex_init(&lock->mutex, __mutexattr);
 }
 
-#define liblockdep_pthread_mutex_init(mutex, mutexattr)\
-({ \
-   static struct lock_class_key __key; \
-   \
-   __mutex_init((mutex), #mutex, &__key, (mutexattr)); \
+#define liblockdep_pthread_mutex_init(mutex, mutexattr)\
+({ \
+   lockdep_register_key(&(mutex)->key);\
+   __mutex_init((mutex), #mutex, &(mutex)->key, (mutexattr));  \
 })
 
 static inline int liblockdep_pthread_mutex_lock(liblockdep_pthread_mutex_t *lock)
@@ -55,6 +55,7 @@ static inline int liblockdep_pthread_mutex_trylock(liblockdep_pthread_mutex_t *l
 static inline int liblockdep_pthread_mutex_destroy(liblockdep_pthread_mutex_t *lock)
 {
 lockdep_reset_lock(&lock->dep_map);
+   lockdep_unregister_key(&lock->key);
 return pthread_mutex_destroy(&lock->mutex);
 }
 
diff --git a/tools/lib/lockdep/tests/ABBA.c b/tools/lib/lockdep/tests/ABBA.c
index 623313f54720..543789bc3e37 100644
--- a/tools/lib/lockdep/tests/ABBA.c
+++ b/tools/lib/lockdep/tests/ABBA.c
@@ -14,4 +14,13 @@ void main(void)
 
pthread_mutex_destroy();
pthread_mutex_destroy();
+
+   pthread_mutex_init(, NULL);
+   pthread_mutex_init(, NULL);
+
+   LOCK_UNLOCK_2(a, b);
+   LOCK_UNLOCK_2(b, a);
+
+   pthread_mutex_destroy();
+   pthread_mutex_destroy();
 }
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 20/24] locking/lockdep: Introduce __lockdep_free_key_range()

2018-12-03 Thread Bart Van Assche
This patch does not change any functionality but makes the next patch
in this series easier to read.

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 kernel/locking/lockdep.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 6d99f3f0757c..c936fce5b9d7 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4478,14 +4478,12 @@ static void schedule_free_zapped_classes(void)
 }
 
 /*
- * Used in module.c to remove lock classes from memory that is going to be
- * freed; and possibly re-used by other modules.
- *
- * We will have had one sync_sched() before getting here, so we're guaranteed
- * nobody will look up these exact classes -- they're properly dead but still
- * allocated.
+ * Remove all lock classes from the class hash table and from the
+ * all_lock_classes list whose key or name is in the address range
+ * [start, start + size). Move these lock classes to the
+ * @zapped_classes list.
  */
-void lockdep_free_key_range(void *start, unsigned long size)
+static void __lockdep_free_key_range(void *start, unsigned long size)
 {
struct lock_class *class;
struct hlist_head *head;
@@ -4496,9 +4494,6 @@ void lockdep_free_key_range(void *start, unsigned long 
size)
raw_local_irq_save(flags);
locked = graph_lock();
 
-   /*
-* Unhash all classes that were created by this module:
-*/
for (i = 0; i < CLASSHASH_SIZE; i++) {
head = classhash_table + i;
hlist_for_each_entry_rcu(class, head, hash_entry) {
@@ -4513,7 +4508,19 @@ void lockdep_free_key_range(void *start, unsigned long 
size)
if (locked)
graph_unlock();
raw_local_irq_restore(flags);
+}
 
+/*
+ * Used in module.c to remove lock classes from memory that is going to be
+ * freed; and possibly re-used by other modules.
+ *
+ * We will have had one sync_sched() before getting here, so we're guaranteed
+ * nobody will look up these exact classes -- they're properly dead but still
+ * allocated.
+ */
+void lockdep_free_key_range(void *start, unsigned long size)
+{
+   __lockdep_free_key_range(start, size);
/*
 * Do not wait for concurrent look_up_lock_class() calls. If any such
 * concurrent call would return a pointer to one of the lock classes
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH v2 19/24] locking/lockdep: Check data structure consistency

2018-12-03 Thread Bart Van Assche
Debugging lockdep data structure inconsistencies is challenging. Add
disabled code that verifies data structure consistency at runtime.

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 kernel/locking/lockdep.c | 147 +++
 1 file changed, 147 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index f343e7612a3a..6d99f3f0757c 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -72,6 +72,8 @@ module_param(lock_stat, int, 0644);
 #define lock_stat 0
 #endif
 
+static bool check_data_structure_consistency;
+
 /*
  * lockdep_lock: protects the lockdep graph, the hashes and the
  *   class/list/hash allocators.
@@ -744,6 +746,148 @@ static bool assign_lock_key(struct lockdep_map *lock)
return true;
 }
 
+/* Check whether element @e occurs in list @h */
+static bool in_list(struct list_head *e, struct list_head *h)
+{
+   struct list_head *f;
+
+   list_for_each(f, h) {
+   if (e == f)
+   return true;
+   }
+
+   return false;
+}
+
+/*
+ * Check whether entry @e occurs in any of the locks_after or locks_before
+ * lists.
+ */
+static bool in_any_class_list(struct list_head *e)
+{
+   struct lock_class *class;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(lock_classes); i++) {
+   class = &lock_classes[i];
+   if (in_list(e, &class->locks_after) ||
+   in_list(e, &class->locks_before))
+   return true;
+   }
+   return false;
+}
+
+static bool class_lock_list_valid(struct lock_class *c, struct list_head *h)
+{
+   struct lock_list *e;
+
+   list_for_each_entry(e, h, entry) {
+   if (e->links_to != c) {
+   printk(KERN_INFO "class %s: mismatch for lock entry %ld; class %s <> %s",
+  c->name ? : "(?)",
+  (unsigned long)(e - list_entries),
+  e->links_to && e->links_to->name ?
+  e->links_to->name : "(?)",
+  e->class && e->class->name ? e->class->name :
+  "(?)");
+   return false;
+   }
+   }
+   return true;
+}
+
+static u16 chain_hlocks[];
+
+static bool check_lock_chain_key(struct lock_chain *chain)
+{
+#ifdef CONFIG_PROVE_LOCKING
+   u64 chain_key = 0;
+   int i;
+
+   for (i = chain->base; i < chain->base + chain->depth; i++)
+   chain_key = iterate_chain_key(chain_key, chain_hlocks[i] + 1);
+   /*
+* The 'unsigned long long' casts avoid that a compiler warning
+* is reported when building tools/lib/lockdep.
+*/
+   if (chain->chain_key != chain_key)
+   printk(KERN_INFO "chain %lld: key %#llx <> %#llx\n",
+  (unsigned long long)(chain - lock_chains),
+  (unsigned long long)chain->chain_key,
+  (unsigned long long)chain_key);
+   return chain->chain_key == chain_key;
+#else
+   return true;
+#endif
+}
+
+static bool check_data_structures(void)
+{
+   struct lock_class *class;
+   struct lock_chain *chain;
+   struct hlist_head *head;
+   struct lock_list *e;
+   int i;
+
+   /*
+* Check whether all list entries that are in use occur in a class
+* lock list.
+*/
+   for_each_set_bit(i, list_entries_in_use, ARRAY_SIZE(list_entries)) {
+   if (test_bit(i, list_entries_being_freed))
+   continue;
+   e = list_entries + i;
+   if (!in_any_class_list(&e->entry)) {
+   printk(KERN_INFO "list entry %d is not in any class list; class %s <> %s\n",
+  (unsigned int)(e - list_entries),
+  e->class->name ? : "(?)",
+  e->links_to->name ? : "(?)");
+   return false;
+   }
+   }
+
+   /*
+* Check whether all list entries that are not in use do not occur in
+* a class lock list.
+*/
+   for_each_clear_bit(i, list_entries_in_use, ARRAY_SIZE(list_entries)) {
+   e = list_entries + i;
+   if (WARN_ON_ONCE(test_bit(i, list_entries_being_freed)))
+   return false;
+   if (in_any_class_list(&e->entry)) {
+   printk(KERN_INFO "list entry %d occurs in a class list; class %s <> %s\n",
+  (unsigned int)(e - list_entries),
+  e->class && e->class->name ? e->class->name :
+  "(?)",
+  e->links_to && e->links_to->name ?
+  e->links_to->name : "(?)");
+   return false;
+   }
+   }
+
+   

[PATCH v2 18/24] locking/lockdep: Reuse list entries that are no longer in use

2018-12-03 Thread Bart Van Assche
Instead of abandoning elements of list_entries[] that are no longer in
use, make alloc_list_entry() reuse array elements that have been freed.
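
The reuse scheme described above can be illustrated with a small userspace
sketch. This is not the kernel code: the array size, the struct, and the
function names below are hypothetical, plain bool arrays stand in for the
kernel bitmaps, and free_marked_entries() stands in for the RCU callback.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 8	/* hypothetical size, kept tiny for illustration */

struct entry {
	int payload;
};

static struct entry entries[MAX_ENTRIES];
static bool in_use[MAX_ENTRIES];	/* slot is currently allocated */
static bool being_freed[MAX_ENTRIES];	/* slot is waiting to be released */

/* Return the first free slot, or NULL when every slot is taken. */
static struct entry *alloc_entry(void)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		if (!in_use[i]) {
			in_use[i] = true;
			return &entries[i];
		}
	}
	return NULL;
}

/* Mark a slot as pending; it stays allocated until the deferred free runs. */
static void mark_entry_for_free(struct entry *e)
{
	being_freed[e - entries] = true;
}

/* Deferred step: release every pending slot in one pass. */
static void free_marked_entries(void)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		if (being_freed[i]) {
			in_use[i] = false;
			being_freed[i] = false;
		}
	}
}
```

A slot that has been marked but not yet released is skipped by
alloc_entry(), which is the property the two bitmaps are meant to provide.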

Cc: Peter Zijlstra 
Cc: Waiman Long 
Cc: Johannes Berg 
Signed-off-by: Bart Van Assche 
---
 kernel/locking/lockdep.c | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index d907d8bfefdf..f343e7612a3a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -130,6 +130,8 @@ static inline int debug_locks_off_graph_unlock(void)
 
 unsigned long nr_list_entries;
 static struct lock_list list_entries[MAX_LOCKDEP_ENTRIES];
+static DECLARE_BITMAP(list_entries_in_use, MAX_LOCKDEP_ENTRIES);
+static DECLARE_BITMAP(list_entries_being_freed, MAX_LOCKDEP_ENTRIES);
 
 /*
  * All data structures here are protected by the global debug_lock.
@@ -871,7 +873,10 @@ register_lock_class(struct lockdep_map *lock, unsigned int 
subclass, int force)
  */
 static struct lock_list *alloc_list_entry(void)
 {
-   if (nr_list_entries >= MAX_LOCKDEP_ENTRIES) {
+   int idx = find_first_zero_bit(list_entries_in_use,
+ ARRAY_SIZE(list_entries));
+
+   if (idx >= ARRAY_SIZE(list_entries)) {
if (!debug_locks_off_graph_unlock())
return NULL;
 
@@ -879,7 +884,8 @@ static struct lock_list *alloc_list_entry(void)
dump_stack();
return NULL;
}
-   return list_entries + nr_list_entries++;
+   __set_bit(idx, list_entries_in_use);
+   return list_entries + idx;
 }
 
 /*
@@ -984,7 +990,7 @@ static inline void mark_lock_accessed(struct lock_list 
*lock,
unsigned long nr;
 
nr = lock - list_entries;
-   WARN_ON(nr >= nr_list_entries); /* Out-of-bounds, input fail */
+   WARN_ON(nr >= ARRAY_SIZE(list_entries)); /* Out-of-bounds, input fail */
lock->parent = parent;
lock->class->dep_gen_id = lockdep_dependency_gen_id;
 }
@@ -994,7 +1000,7 @@ static inline unsigned long lock_accessed(struct lock_list 
*lock)
unsigned long nr;
 
nr = lock - list_entries;
-   WARN_ON(nr >= nr_list_entries); /* Out-of-bounds, input fail */
+   WARN_ON(nr >= ARRAY_SIZE(list_entries)); /* Out-of-bounds, input fail */
return lock->class->dep_gen_id == lockdep_dependency_gen_id;
 }
 
@@ -4250,9 +4256,12 @@ static void zap_class(struct lock_class *class)
 * Remove all dependencies this lock is
 * involved in:
 */
-   for (i = 0, entry = list_entries; i < nr_list_entries; i++, entry++) {
+   for_each_set_bit(i, list_entries_in_use, ARRAY_SIZE(list_entries)) {
+   entry = list_entries + i;
if (entry->class != class && entry->links_to != class)
continue;
+   if (__test_and_set_bit(i, list_entries_being_freed))
+   continue;
links_to = entry->links_to;
WARN_ON_ONCE(entry->class == links_to);
 list_del_rcu(&entry->entry);
@@ -4286,8 +4295,9 @@ static inline int within(const void *addr, void *start, 
unsigned long size)
 }
 
 /*
- * Free all lock classes that are on the zapped_classes list. Called as an
- * RCU callback function.
+ * Free all lock classes that are on the zapped_classes list and also all list
+ * entries that have been marked as being freed. Called as an RCU callback
+ * function.
  */
 static void free_zapped_classes(struct callback_head *ch)
 {
@@ -4303,6 +4313,9 @@ static void free_zapped_classes(struct callback_head *ch)
nr_lock_classes--;
}
list_splice_init(_classes, _lock_classes);
+   bitmap_andnot(list_entries_in_use, list_entries_in_use,
+ list_entries_being_freed, ARRAY_SIZE(list_entries));
+   bitmap_clear(list_entries_being_freed, 0, ARRAY_SIZE(list_entries));
if (locked)
graph_unlock();
raw_local_irq_restore(flags);
-- 
2.20.0.rc1.387.gf8505762e3-goog



[PATCH V2] sdhci: fix the timeout check window for clock and reset

2018-12-03 Thread Du, Alek
>From 87692fc090978bde8fe872f02d0023a57af6b492 Mon Sep 17 00:00:00 2001
From: Alek Du 
Date: Fri, 30 Nov 2018 14:02:28 +0800
Subject: [PATCH] sdhci: fix the timeout check window for clock and reset

We observed some fake timeouts on some devices. The logs look like this:

case 1:
[159525.255629] mmc1: Internal clock never stabilised.
[159525.255818] mmc1: sdhci:  SDHCI REGISTER DUMP ===
[159525.256049] mmc1: sdhci: Sys addr:  0x | Version:  0x1002
[159525.256277] mmc1: sdhci: Blk size:  0x | Blk cnt:  0x
[159525.256523] mmc1: sdhci: Argument:  0x | Trn mode: 0x
[159525.256752] mmc1: sdhci: Present:   0x1fff | Host ctl: 0x
[159525.256979] mmc1: sdhci: Power: 0x000b | Blk gap:  0x0080
[159525.257205] mmc1: sdhci: Wake-up:   0x | Clock:0xfa03
From the clock control register dump, we are pretty sure the clock was
stabilized.

case 2:
[  914.550127] mmc1: Reset 0x2 never completed.
[  914.550321] mmc1: sdhci:  SDHCI REGISTER DUMP ===
[  914.550608] mmc1: sdhci: Sys addr:  0x0010 | Version:  0x1002

After checking the sdhci code, we found that the timeout check has a small
window: the CPU can be scheduled out between the status check and the time
check, and when it comes back the earlier status reading is no longer valid.
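
The race can be sketched in plain C, outside the driver. Everything below is
hypothetical scaffolding (struct fake_hw, fake_ready(), fake_now() and
poll_until_ready() are not sdhci functions). The point is only the final
re-check of the status once the deadline has passed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware and the clock. */
struct fake_hw {
	int reads;		/* number of status reads so far */
	int ready_after_reads;	/* status becomes true after this many reads */
	uint64_t time;		/* simulated clock */
	uint64_t step;		/* how far the clock jumps per time query */
};

static bool fake_ready(struct fake_hw *hw)
{
	hw->reads++;
	return hw->reads > hw->ready_after_reads;
}

static uint64_t fake_now(struct fake_hw *hw)
{
	hw->time += hw->step;
	return hw->time;
}

/* Poll until the status is ready or the deadline passes; true on success. */
static bool poll_until_ready(struct fake_hw *hw, uint64_t deadline)
{
	while (!fake_ready(hw)) {
		if (fake_now(hw) > deadline) {
			/*
			 * We may have been scheduled out between the status
			 * read and the time read, so read the status once
			 * more before reporting a timeout.
			 */
			return fake_ready(hw);
		}
		/* a real driver would delay briefly here */
	}
	return true;
}
```

With a large clock step the deadline is already exceeded after the first
failed read, yet the second read succeeds, so no timeout is reported.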

Signed-off-by: Alek Du 
---
 drivers/mmc/host/sdhci.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 99bdae53fa2e..af01f7d16eae 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -218,12 +218,17 @@ void sdhci_reset(struct sdhci_host *host, u8 mask)
/* hw clears the bit when it's done */
while (sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask) {
if (ktime_after(ktime_get(), timeout)) {
+   /* check it again, since there is a window between
+  bit check and time check */
+   if (!(sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask))
+   break;
pr_err("%s: Reset 0x%x never completed.\n",
mmc_hostname(host->mmc), (int)mask);
sdhci_dumpregs(host);
return;
+   } else {
+   udelay(10);
}
-   udelay(10);
}
 }
 EXPORT_SYMBOL_GPL(sdhci_reset);
@@ -1611,12 +1616,19 @@ void sdhci_enable_clk(struct sdhci_host *host, u16 clk)
while (!((clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL))
& SDHCI_CLOCK_INT_STABLE)) {
if (ktime_after(ktime_get(), timeout)) {
+   /* check it again since there is a window between
+  status check and time check */
+   if ((clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL))
+   & SDHCI_CLOCK_INT_STABLE)
+   break;
pr_err("%s: Internal clock never stabilised.\n",
   mmc_hostname(host->mmc));
sdhci_dumpregs(host);
return;
}
-   udelay(10);
+   else {
+   udelay(10);
+   }
}
 
clk |= SDHCI_CLOCK_CARD_EN;
-- 
2.17.1


[PATCH V3] kvm:x86 :remove unnecessary recalculate_apic_map

2018-12-03 Thread Peng Hao
Previously, the variable apic_sw_disabled influenced
recalculate_apic_map(). Since "KVM: x86: simplify kvm_apic_map"
(commit 3b5a5ffa928a3f875b0d5dd284eeb7c322e1688a),
recalculate_apic_map() no longer accesses apic_sw_disabled, so the
call in apic_set_spiv() is unnecessary.

Signed-off-by: Peng Hao 
---
 arch/x86/kvm/lapic.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index fbb0e6d..a11fbf9 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -246,10 +246,9 @@ static inline void apic_set_spiv(struct kvm_lapic *apic, 
u32 val)
 
if (enabled != apic->sw_enabled) {
apic->sw_enabled = enabled;
-   if (enabled) {
+   if (enabled)
 static_key_slow_dec_deferred(&apic_sw_disabled);
-   recalculate_apic_map(apic->vcpu->kvm);
-   } else
+   else
 static_key_slow_inc(&apic_sw_disabled.key);
}
 }
-- 
1.8.3.1



[PATCH] squashfs: enable __GFP_FS in ->readpage to prevent hang in mem alloc

2018-12-03 Thread Hou Tao
There is no need to disable __GFP_FS in ->readpage:
* It's a read-only fs, so there will be no dirty/writeback page and
  there will be no deadlock against the caller's locked page
* It just allocates one page, so compaction will not be invoked
* It doesn't take any inode lock, so the reclamation of inode will be fine

Moreover, the absence of __GFP_FS may lead to a hang in
__alloc_pages_slowpath() if a squashfs page fault occurs in the context of
a memory hogger, because the hogger will not be killed due to the logic in
__alloc_pages_may_oom().

Signed-off-by: Hou Tao 
---
 fs/squashfs/file.c  |  3 ++-
 fs/squashfs/file_direct.c   |  4 +++-
 fs/squashfs/squashfs_fs_f.h | 25 +
 3 files changed, 30 insertions(+), 2 deletions(-)
 create mode 100644 fs/squashfs/squashfs_fs_f.h

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index f1c1430ae721..8603dda4a719 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -51,6 +51,7 @@
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
 #include "squashfs_fs_i.h"
+#include "squashfs_fs_f.h"
 #include "squashfs.h"
 
 /*
@@ -414,7 +415,7 @@ void squashfs_copy_cache(struct page *page, struct 
squashfs_cache_entry *buffer,
TRACE("bytes %d, i %d, available_bytes %d\n", bytes, i, avail);
 
push_page = (i == page->index) ? page :
-   grab_cache_page_nowait(page->mapping, i);
+   squashfs_grab_cache_page_nowait(page->mapping, i);
 
if (!push_page)
continue;
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
index 80db1b86a27c..a0fdd6215348 100644
--- a/fs/squashfs/file_direct.c
+++ b/fs/squashfs/file_direct.c
@@ -17,6 +17,7 @@
 #include "squashfs_fs.h"
 #include "squashfs_fs_sb.h"
 #include "squashfs_fs_i.h"
+#include "squashfs_fs_f.h"
 #include "squashfs.h"
 #include "page_actor.h"
 
@@ -60,7 +61,8 @@ int squashfs_readpage_block(struct page *target_page, u64 
block, int bsize,
/* Try to grab all the pages covered by the Squashfs block */
for (missing_pages = 0, i = 0, n = start_index; i < pages; i++, n++) {
page[i] = (n == target_page->index) ? target_page :
-   grab_cache_page_nowait(target_page->mapping, n);
+   squashfs_grab_cache_page_nowait(
+   target_page->mapping, n);
 
if (page[i] == NULL) {
missing_pages++;
diff --git a/fs/squashfs/squashfs_fs_f.h b/fs/squashfs/squashfs_fs_f.h
new file mode 100644
index ..fc5fb7aeb27d
--- /dev/null
+++ b/fs/squashfs/squashfs_fs_f.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef SQUASHFS_FS_F
+#define SQUASHFS_FS_F
+
+/*
+ * No need to use FGP_NOFS here:
+ * 1. It's a read-only fs, so there will be no dirty/writeback page and
+ *there will be no deadlock against the caller's locked page.
+ * 2. It just allocates one page, so compaction will not be invoked.
+ * 3. It doesn't take any inode lock, so the reclamation of inode
+ *will be fine.
+ *
+ * And GFP_NOFS may lead to infinite loop in __alloc_pages_slowpath() if a
+ * squashfs page fault occurs in the context of a memory hogger, because
+ * the hogger will not be killed due to the logic in __alloc_pages_may_oom().
+ */
+static inline struct page *
+squashfs_grab_cache_page_nowait(struct address_space *mapping, pgoff_t index)
+{
+   return pagecache_get_page(mapping, index,
+   FGP_LOCK|FGP_CREAT|FGP_NOWAIT,
+   mapping_gfp_mask(mapping));
+}
+#endif
+
-- 
2.16.2.dirty



Re: [PATCH] pinctrl: meson: fix G12A ao pull registers base address

2018-12-03 Thread Xingyu Chen




On 2018/12/3 18:27, Neil Armstrong wrote:

Hi Xingyu,


On 03/12/2018 04:05, Xingyu Chen wrote:

Starting with the Meson G12A SoC, two new ao registers are introduced:
AO_RTI_PULL_UP_EN_REG and AO_GPIO_O.

The bits controlling the output level are remapped to the new register
AO_GPIO_O, and AO_GPIO_O_EN_N now only controls output enable.

The bits controlling pull enable are remapped to the new register
AO_RTI_PULL_UP_EN_REG, and AO_RTI_PULL_UP_REG now only controls the
pull type (up/down).

The new layout of ao gpio/pull registers is as follows:
- AO_GPIO_O_EN_N[offset: 0x9 << 2]
- AO_GPIO_I [offset: 0xa << 2]
- AO_RTI_PULL_UP_REG[offset: 0xb << 2]
- AO_RTI_PULL_UP_EN_REG [offset: 0xc << 2]
- AO_GPIO_O [offset: 0xd << 2]

From the above, we can see that the ao GPIO register region is split by the
ao pull registers. To keep the region contiguous in software, the ao GPIO
and ao pull registers use the same base address and are distinguished by
their offsets.
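
As a worked example of the layout above, the byte offsets can be computed
from the word indices. The register names and indices come from the list.
The AO_REG_OFFSET() macro and the enum are only an illustration and are not
part of the driver.

```c
#include <stdint.h>

/* Word index to byte offset, matching the "<< 2" notation in the list. */
#define AO_REG_OFFSET(word_index) ((uint32_t)(word_index) << 2)

/* Word indices as listed in the commit message. */
enum ao_reg_word {
	AO_GPIO_O_EN_N        = 0x9,
	AO_GPIO_I             = 0xa,
	AO_RTI_PULL_UP_REG    = 0xb,
	AO_RTI_PULL_UP_EN_REG = 0xc,
	AO_GPIO_O             = 0xd,
};
```

This gives byte offsets 0x24, 0x28, 0x2c, 0x30 and 0x34, with the two pull
registers sitting between AO_GPIO_I and AO_GPIO_O.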

Fixes: 29ae0952e85f ("pinctrl: meson-g12a: add pinctrl driver support")
Signed-off-by: Xingyu Chen 
Signed-off-by: Jianxin Pan 
---
  drivers/pinctrl/meson/pinctrl-meson.c | 22 --
  1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/pinctrl/meson/pinctrl-meson.c 
b/drivers/pinctrl/meson/pinctrl-meson.c
index 53d449076dee..7ff40cd7a0cb 100644
--- a/drivers/pinctrl/meson/pinctrl-meson.c
+++ b/drivers/pinctrl/meson/pinctrl-meson.c
@@ -31,6 +31,9 @@
   * In some cases the register ranges for pull enable and pull
   * direction are the same and thus there are only 3 register ranges.
   *
+ * Since Meson G12A SoC, the ao register ranges for gpio, pull enable
+ * and pull direction are the same, so there are only 2 register ranges.
+ *
   * For the pull and GPIO configuration every bank uses a contiguous
   * set of bits in the register sets described above; the same register
   * can be shared by more banks with different offsets.
@@ -487,23 +490,22 @@ static int meson_pinctrl_parse_dt(struct meson_pinctrl 
*pc,
return PTR_ERR(pc->reg_mux);
}
  
-	pc->reg_pull = meson_map_resource(pc, gpio_np, "pull");

-   if (IS_ERR(pc->reg_pull)) {
-   dev_err(pc->dev, "pull registers not found\n");
-   return PTR_ERR(pc->reg_pull);
+   pc->reg_gpio = meson_map_resource(pc, gpio_np, "gpio");
+   if (IS_ERR(pc->reg_gpio)) {
+   dev_err(pc->dev, "gpio registers not found\n");
+   return PTR_ERR(pc->reg_gpio);
}
  
+	pc->reg_pull = meson_map_resource(pc, gpio_np, "pull");

+   /* Use gpio region if pull one is not present */
+   if (IS_ERR(pc->reg_pull))
+   pc->reg_pull = pc->reg_gpio;
+
pc->reg_pullen = meson_map_resource(pc, gpio_np, "pull-enable");
/* Use pull region if pull-enable one is not present */
if (IS_ERR(pc->reg_pullen))
pc->reg_pullen = pc->reg_pull;
  
-	pc->reg_gpio = meson_map_resource(pc, gpio_np, "gpio");

-   if (IS_ERR(pc->reg_gpio)) {
-   dev_err(pc->dev, "gpio registers not found\n");
-   return PTR_ERR(pc->reg_gpio);
-   }
-
return 0;
  }
  


Doesn't it need an update of the bindings ?

Neil


Thanks, I will update it in the next patch.



Re: [PATCH] printk: don't unconditionally shortcut print_time()

2018-12-03 Thread Sergey Senozhatsky
On (12/02/18 14:02), Tetsuo Handa wrote:
>  
> @@ -1541,11 +1545,13 @@ int do_syslog(int type, char __user *buf, int len, 
> int source)
>   } else {
>   u64 seq = syslog_seq;
>   u32 idx = syslog_idx;
> + bool f = syslog_partial ? syslog_time : printk_time;
^^

>   while (seq < log_next_seq) {
>   struct printk_log *msg = log_from_idx(idx);
>  
> - error += msg_print_text(msg, true, NULL, 0);
> + error += msg_print_text(msg, true, f, NULL, 0);


> + f = printk_time;
^^^

>   idx = log_next(idx);
>   seq++;

Can we please have something better than 'f'?

-ss


Re: [PATCH 2/2] Input: omap-keypad: Fix idle configration to not block SoC idle states

2018-12-03 Thread Dmitry Torokhov
Hi Tony,

On Mon, Dec 03, 2018 at 03:12:51PM -0800, Tony Lindgren wrote:
> 
> With PM enabled, I noticed that pressing a key on the droid4 keyboard will
> block deeper idle states for the SoC. Looks like we can fix this by
> managing the idle register together with the interrupt similar to what
> we already do for the GPIO controller.

Can you show me where exactly we are doing this? I can't seem to find
the matching code.

Thanks!

> 
> And there's no need to keep enabling and disabling interrupts and
> wake-up events for normal use if we use IRQF_ONESHOT as suggested by
> Dmitry Torokhov  so let's do that too.
> 
> Cc: Axel Haslam 
> Cc: Illia Smyrnov 
> Cc: Marcel Partap 
> Cc: Merlijn Wajer 
> Cc: Michael Scott 
> Cc: NeKit 
> Cc: Pavel Machek 
> Cc: Sebastian Reichel 
> Reported-by: Pavel Machek 
> Signed-off-by: Tony Lindgren 
> ---
>  drivers/input/keyboard/omap4-keypad.c | 30 ++-
>  1 file changed, 11 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/input/keyboard/omap4-keypad.c 
> b/drivers/input/keyboard/omap4-keypad.c
> --- a/drivers/input/keyboard/omap4-keypad.c
> +++ b/drivers/input/keyboard/omap4-keypad.c
> @@ -53,11 +53,12 @@
>  /* OMAP4 bit definitions */
>  #define OMAP4_DEF_IRQENABLE_EVENTEN  BIT(0)
>  #define OMAP4_DEF_IRQENABLE_LONGKEY  BIT(1)
> -#define OMAP4_DEF_WUP_EVENT_ENA  BIT(0)
> -#define OMAP4_DEF_WUP_LONG_KEY_ENA   BIT(1)
>  #define OMAP4_DEF_CTRL_NOSOFTMODEBIT(1)
>  #define OMAP4_DEF_CTRL_PTV_SHIFT 2
>  
> +#define OMAP4_KBD_IRQ_MASK   (OMAP4_DEF_IRQENABLE_LONGKEY | \
> +  OMAP4_DEF_IRQENABLE_EVENTEN)
> +
>  /* OMAP4 values */
>  #define OMAP4_VAL_IRQDISABLE 0x0
>  
> @@ -126,12 +127,8 @@ static irqreturn_t omap4_keypad_irq_handler(int irq, 
> void *dev_id)
>  {
>   struct omap4_keypad *keypad_data = dev_id;
>  
> - if (kbd_read_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS)) {
> - /* Disable interrupts */
> - kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQENABLE,
> -  OMAP4_VAL_IRQDISABLE);
> + if (kbd_read_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS))
>   return IRQ_WAKE_THREAD;
> - }
>  
>   return IRQ_NONE;
>  }
> @@ -173,11 +170,6 @@ static irqreturn_t omap4_keypad_irq_thread_fn(int irq, 
> void *dev_id)
>   kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS,
>kbd_read_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS));
>  
> - /* enable interrupts */
> - kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQENABLE,
> - OMAP4_DEF_IRQENABLE_EVENTEN |
> - OMAP4_DEF_IRQENABLE_LONGKEY);
> -
>   return IRQ_HANDLED;
>  }
>  
> @@ -197,11 +189,10 @@ static int omap4_keypad_open(struct input_dev *input)
>   /* clear pending interrupts */
>   kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS,
>kbd_read_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS));
> - kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQENABLE,
> - OMAP4_DEF_IRQENABLE_EVENTEN |
> - OMAP4_DEF_IRQENABLE_LONGKEY);
> - kbd_writel(keypad_data, OMAP4_KBD_WAKEUPENABLE,
> - OMAP4_DEF_WUP_EVENT_ENA | OMAP4_DEF_WUP_LONG_KEY_ENA);
> +
> + /* enable interrupts and wake-up events */
> + kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQENABLE, OMAP4_KBD_IRQ_MASK);
> + kbd_writel(keypad_data, OMAP4_KBD_WAKEUPENABLE, OMAP4_KBD_IRQ_MASK);
>  
>   enable_irq(keypad_data->irq);
>  
> @@ -214,9 +205,10 @@ static void omap4_keypad_close(struct input_dev *input)
>  
>   disable_irq(keypad_data->irq);
>  
> - /* Disable interrupts */
> + /* Disable interrupts and wake-up events */
>   kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQENABLE,
>OMAP4_VAL_IRQDISABLE);
> + kbd_writel(keypad_data, OMAP4_KBD_WAKEUPENABLE, 0);
>  
>   /* clear pending interrupts */
>   kbd_write_irqreg(keypad_data, OMAP4_KBD_IRQSTATUS,
> @@ -365,7 +357,7 @@ static int omap4_keypad_probe(struct platform_device 
> *pdev)
>   }
>  
>   error = request_threaded_irq(keypad_data->irq, omap4_keypad_irq_handler,
> -  omap4_keypad_irq_thread_fn, 0,
> +  omap4_keypad_irq_thread_fn, IRQF_ONESHOT,
>"omap4-keypad", keypad_data);
>   if (error) {
>   dev_err(>dev, "failed to register interrupt\n");
> -- 
> 2.19.2

-- 
Dmitry



[PATCH v5 7/8] arm64: dts: sdm845: Add rpmh powercontroller node

2018-12-03 Thread Rajendra Nayak
Add the DT node for the rpmhpd powercontroller.

Signed-off-by: Rajendra Nayak 
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 51 
 1 file changed, 51 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index b72bdb0a31a5..a6d0cd8d17b0 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -1324,6 +1325,56 @@
compatible = "qcom,sdm845-rpmh-clk";
#clock-cells = <1>;
};
+
+   rpmhpd: power-controller {
+   compatible = "qcom,sdm845-rpmhpd";
+   #power-domain-cells = <1>;
+   operating-points-v2 = <&rpmhpd_opp_table>;
+   };
+
+   rpmhpd_opp_table: opp-table {
+   compatible = "operating-points-v2-qcom-level";
+
+   rpmhpd_opp_ret: opp1 {
+   qcom,level = 
;
+   };
+
+   rpmhpd_opp_min_svs: opp2 {
+   qcom,level = 
;
+   };
+
+   rpmhpd_opp_low_svs: opp3 {
+   qcom,level = 
;
+   };
+
+   rpmhpd_opp_svs: opp4 {
+   qcom,level = ;
+   };
+
+   rpmhpd_opp_svs_l1: opp5 {
+   qcom,level = 
;
+   };
+
+   rpmhpd_opp_nom: opp6 {
+   qcom,level = ;
+   };
+
+   rpmhpd_opp_nom_l1: opp7 {
+   qcom,level = 
;
+   };
+
+   rpmhpd_opp_nom_l2: opp8 {
+   qcom,level = 
;
+   };
+
+   rpmhpd_opp_turbo: opp9 {
+   qcom,level = 
;
+   };
+
+   rpmhpd_opp_turbo_l1: opp10 {
+   qcom,level = 
;
+   };
+   };
};
 
intc: interrupt-controller@17a0 {
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v5 5/8] arm64: dts: msm8996: Add rpmpd device node

2018-12-03 Thread Rajendra Nayak
Add rpmpd device node and its OPP table

Signed-off-by: Rajendra Nayak 
Signed-off-by: Viresh Kumar 
Reviewed-by: Ulf Hansson 
---
 arch/arm64/boot/dts/qcom/msm8996.dtsi | 34 +++
 1 file changed, 34 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index b29fe80d7288..ed35f0ced699 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -306,6 +306,40 @@
#clock-cells = <1>;
};
 
+   rpmpd: power-controller {
+   compatible = "qcom,msm8996-rpmpd";
+   #power-domain-cells = <1>;
+   operating-points-v2 = <&rpmpd_opp_table>;
+   };
+
+   rpmpd_opp_table: opp-table {
+   compatible = "operating-points-v2-qcom-level";
+
+   rpmpd_opp1: opp1 {
+   qcom,level = <1>;
+   };
+
+   rpmpd_opp2: opp2 {
+   qcom,level = <2>;
+   };
+
+   rpmpd_opp3: opp3 {
+   qcom,level = <3>;
+   };
+
+   rpmpd_opp4: opp4 {
+   qcom,level = <4>;
+   };
+
+   rpmpd_opp5: opp5 {
+   qcom,level = <5>;
+   };
+
+   rpmpd_opp6: opp6 {
+   qcom,level = <6>;
+   };
+   };
+
pm8994-regulators {
compatible = "qcom,rpm-pm8994-regulators";
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v5 6/8] soc: qcom: rpmhpd: Add RPMh Power domain driver

2018-12-03 Thread Rajendra Nayak
The RPMh Power domain driver aggregates the corner votes from various
consumers for the ARC resources and communicates it to RPMh.

With RPMh we use two different numbering spaces for corners: one used
by the clients to express their performance needs, and another used
to communicate with the RPMh hardware.

The clients express their performance requirements using a sparse
numbering space whose values map to meaningful levels like RET, SVS,
NOMINAL, TURBO etc. These in turn get mapped to another number space
between 0 and 15, which is communicated to RPMh. The sparse number space,
also referred to as vlvl, is mapped to the contiguous number space of 0
to 15, also referred to as hlvl, using command DB.
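
As a rough illustration of the vlvl to hlvl mapping described above, the
lookup can be sketched in plain C as below. This is a user-space sketch and
not the driver code: the table contents, example_levels and vlvl_to_hlvl()
are hypothetical names and values chosen for the example, and a real table
would be read from command DB for each resource.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative vlvl table only (assumed values). In a real system the
 * table comes from command DB; index = hlvl, value = vlvl.
 */
static const uint32_t example_levels[] = {
	0, 16, 48, 64, 128, 192, 256, 320, 336, 384, 416,
};
#define EXAMPLE_LEVEL_COUNT \
	(sizeof(example_levels) / sizeof(example_levels[0]))

/*
 * Return the lowest hlvl whose vlvl satisfies the request. Requests above
 * the highest supported vlvl are clamped to the top hlvl.
 */
static size_t vlvl_to_hlvl(const uint32_t *levels, size_t count, uint32_t vlvl)
{
	size_t i;

	assert(count > 0);

	for (i = 0; i < count; i++)
		if (vlvl <= levels[i])
			return i;

	return count - 1;
}
```

With this table, a request for vlvl 128 resolves to hlvl 4, and a request
larger than every entry is clamped to the last index.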

Some power domain clients could request a performance state only while
the CPU is active, while some others could request a certain
performance state all the time regardless of the state of the CPU.
We handle this by internally aggregating the votes from both types of
clients and then sending the aggregated votes to RPMh.

There are also 3 different types of votes that are communicated to RPMh
for every resource.
1. ACTIVE_ONLY: This specifies the requirement for the resource when the
CPU is active
2. SLEEP: This specifies the requirement for the resource when the CPU
is going to sleep
3. WAKE_ONLY: This specifies the requirement for the resource when the
CPU is coming out of sleep to active state
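
The aggregation of the two kinds of clients into the three votes listed
above could look roughly like the following. This is a simplified user-space
sketch under stated assumptions, not the code from this patch: struct
pd_vote, struct rpmh_votes and aggregate_votes() are hypothetical names, and
the sketch assumes that the WAKE_ONLY vote simply restores the aggregated
active requirement.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-domain state. */
struct pd_vote {
	bool enabled;		/* domain is currently powered on */
	bool active_only;	/* vote applies only while the CPU is active */
	unsigned int corner;	/* requested corner (hlvl) */
};

/* The three values communicated to RPMh for one resource. */
struct rpmh_votes {
	unsigned int active;	/* ACTIVE_ONLY */
	unsigned int wake;	/* WAKE_ONLY */
	unsigned int sleep;	/* SLEEP */
};

static unsigned int max_uint(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Split one vote: an active-only vote contributes nothing during sleep. */
static void to_active_sleep(const struct pd_vote *pd,
			    unsigned int *active, unsigned int *sleep)
{
	*active = pd->enabled ? pd->corner : 0;
	*sleep = pd->active_only ? 0 : *active;
}

/* Combine a domain and its peer into the three RPMh votes. */
static struct rpmh_votes aggregate_votes(const struct pd_vote *pd,
					 const struct pd_vote *peer)
{
	unsigned int pd_active, pd_sleep, peer_active, peer_sleep;
	struct rpmh_votes v;

	to_active_sleep(pd, &pd_active, &pd_sleep);
	to_active_sleep(peer, &peer_active, &peer_sleep);

	v.active = max_uint(pd_active, peer_active);
	v.wake = v.active;	/* assumption: wake restores the active vote */
	v.sleep = max_uint(pd_sleep, peer_sleep);

	return v;
}
```

For example, an always-on vote of 3 paired with an active-only vote of 5
gives ACTIVE_ONLY = 5, WAKE_ONLY = 5 and SLEEP = 3 in this sketch.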

We add data for all power domains on sdm845 SoC as part of the patch.
The driver can be extended to support other SoCs which support RPMh.

Signed-off-by: Rajendra Nayak 
Reviewed-by: Ulf Hansson 
---
 drivers/soc/qcom/Kconfig  |   9 +
 drivers/soc/qcom/Makefile |   1 +
 drivers/soc/qcom/rpmhpd.c | 431 ++
 3 files changed, 441 insertions(+)
 create mode 100644 drivers/soc/qcom/rpmhpd.c

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index e9b60695f6e7..a51458022d21 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -103,6 +103,15 @@ config QCOM_RPMH
  of hardware components aggregate requests for these resources and
  help apply the aggregated state on the resource.
 
+config QCOM_RPMHPD
+   bool "Qualcomm RPMh Power domain driver"
+   depends on QCOM_RPMH && QCOM_COMMAND_DB
+   help
+ QCOM RPMh Power domain driver to support power-domains with
+ performance states. The driver communicates a performance state
+ value to RPMh which then translates it into corresponding voltage
+ for the voltage rail.
+
 config QCOM_RPMPD
bool "Qualcomm RPM Power domain driver"
depends on MFD_QCOM_RPM && QCOM_SMD_RPM
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index f1b25fdcf2ad..dd6ca92985ee 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -22,3 +22,4 @@ obj-$(CONFIG_QCOM_APR) += apr.o
 obj-$(CONFIG_QCOM_LLCC) += llcc-slice.o
 obj-$(CONFIG_QCOM_SDM845_LLCC) += llcc-sdm845.o
 obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
+obj-$(CONFIG_QCOM_RPMHPD) += rpmhpd.o
diff --git a/drivers/soc/qcom/rpmhpd.c b/drivers/soc/qcom/rpmhpd.c
new file mode 100644
index ..10b45b4f4588
--- /dev/null
+++ b/drivers/soc/qcom/rpmhpd.c
@@ -0,0 +1,431 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.*/
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define domain_to_rpmhpd(domain) container_of(domain, struct rpmhpd, pd)
+
+/*
+ * This is the number of bytes used for each command DB aux data entry of an
+ * ARC resource.
+ */
+#define RPMH_ARC_LEVEL_SIZE 2
+#define RPMH_ARC_MAX_LEVELS 16
+
+/**
+ * struct rpmhpd - top level RPMh power domain resource data structure
+ * @dev:   rpmh power domain controller device
+ * @pd:    generic_pm_domain corresponding to the power domain
+ * @peer:  A peer power domain in case Active only Voting is supported
+ * @active_only:   True if it represents an Active only peer
+ * @level: An array of level (vlvl) to corner (hlvl) mappings derived from cmd-db
+ * @level_count:   Number of levels supported by the power domain. max being 16 (0 - 15)
+ * @enabled:   true if the power domain is enabled
+ * @res_name:  Resource name used for cmd-db lookup
+ * @addr:  Resource address as looked up using resource name from cmd-db
+ */
+struct rpmhpd {
+   struct device   *dev;
+   struct generic_pm_domain pd;
+   struct generic_pm_domain *parent;
+   struct rpmhpd   *peer;
+   const bool  active_only;
+   unsigned int corner;
+   unsigned int active_corner;
+   u32 level[RPMH_ARC_MAX_LEVELS];
+   int level_count;
+   bool enabled;
+   const char  *res_name;
+   u32 addr;
+};
+

[PATCH v5 3/8] soc: qcom: rpmpd: Add a Power domain driver to model corners

2018-12-03 Thread Rajendra Nayak
The Power domains for corners just pass the performance state set by the
consumers to the RPM (Remote Power manager) which then takes care
of setting the appropriate voltage on the corresponding rails to
meet the performance needs.

We add all Power domain data needed on msm8996 here. This driver can easily
be extended by adding data for other Qualcomm SoCs as well.

Signed-off-by: Rajendra Nayak 
Signed-off-by: Viresh Kumar 
Reviewed-by: Ulf Hansson 
Acked-by: Rob Herring 
---
 drivers/soc/qcom/Kconfig  |   9 ++
 drivers/soc/qcom/Makefile |   1 +
 drivers/soc/qcom/rpmpd.c  | 294 ++
 3 files changed, 304 insertions(+)
 create mode 100644 drivers/soc/qcom/rpmpd.c

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 684cb51694d1..e9b60695f6e7 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -103,6 +103,15 @@ config QCOM_RPMH
  of hardware components aggregate requests for these resources and
  help apply the aggregated state on the resource.
 
+config QCOM_RPMPD
+   bool "Qualcomm RPM Power domain driver"
+   depends on MFD_QCOM_RPM && QCOM_SMD_RPM
+   help
+ QCOM RPM Power domain driver to support power-domains with
+ performance states. The driver communicates a performance state
+ value to RPM which then translates it into corresponding voltage
+ for the voltage rail.
+
 config QCOM_SMEM
tristate "Qualcomm Shared Memory Manager (SMEM)"
depends on ARCH_QCOM || COMPILE_TEST
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index f25b54cd6cf8..f1b25fdcf2ad 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -21,3 +21,4 @@ obj-$(CONFIG_QCOM_WCNSS_CTRL) += wcnss_ctrl.o
 obj-$(CONFIG_QCOM_APR) += apr.o
 obj-$(CONFIG_QCOM_LLCC) += llcc-slice.o
 obj-$(CONFIG_QCOM_SDM845_LLCC) += llcc-sdm845.o
+obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
diff --git a/drivers/soc/qcom/rpmpd.c b/drivers/soc/qcom/rpmpd.c
new file mode 100644
index ..a0b9f122d793
--- /dev/null
+++ b/drivers/soc/qcom/rpmpd.c
@@ -0,0 +1,294 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2017-2018, The Linux Foundation. All rights reserved. */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#define domain_to_rpmpd(domain) container_of(domain, struct rpmpd, pd)
+
+/* Resource types */
+#define RPMPD_SMPA 0x61706d73
+#define RPMPD_LDOA 0x616f646c
+
+/* Operation Keys */
+#define KEY_CORNER 0x6e726f63 /* corn */
+#define KEY_ENABLE 0x6e657773 /* swen */
+#define KEY_FLOOR_CORNER   0x636676   /* vfc */
+
+#define DEFINE_RPMPD_CORN_SMPA(_platform, _name, _active, r_id) \
+   static struct rpmpd _platform##_##_active;  \
+   static struct rpmpd _platform##_##_name = { \
+   .pd = { .name = #_name, },  \
+   .peer = &_platform##_##_active, \
+   .res_type = RPMPD_SMPA, \
+   .res_id = r_id, \
+   .key = KEY_CORNER,  \
+   };  \
+   static struct rpmpd _platform##_##_active = {   \
+   .pd = { .name = #_active, },\
+   .peer = &_platform##_##_name,   \
+   .active_only = true,\
+   .res_type = RPMPD_SMPA, \
+   .res_id = r_id, \
+   .key = KEY_CORNER,  \
+   }
+
+#define DEFINE_RPMPD_CORN_LDOA(_platform, _name, r_id) \
+   static struct rpmpd _platform##_##_name = { \
+   .pd = { .name = #_name, },  \
+   .res_type = RPMPD_LDOA, \
+   .res_id = r_id, \
+   .key = KEY_CORNER,  \
+   }
+
+#define DEFINE_RPMPD_VFC(_platform, _name, r_id, r_type)   \
+   static struct rpmpd _platform##_##_name = { \
+   .pd = { .name = #_name, },  \
+   .res_type = r_type, \
+   .res_id = r_id, \
+   .key = KEY_FLOOR_CORNER,\
+   }
+
+#define DEFINE_RPMPD_VFC_SMPA(_platform, _name, r_id)  \
+   DEFINE_RPMPD_VFC(_platform, _name, r_id, RPMPD_SMPA)
+
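
Going by the comments next to the key definitions above (corn, swen, vfc),
the resource types and operation keys read as short ASCII tags packed into a
32-bit value with the first character in the lowest byte. A small user-space
helper to decode them might look like this; rpm_tag_to_str() is a
hypothetical name for the example and is not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Unpack a tag stored with its first character in the least significant
 * byte. The output buffer must hold 4 characters plus the terminating NUL.
 * Tags shorter than 4 characters end early at their zero padding.
 */
static void rpm_tag_to_str(uint32_t tag, char out[5])
{
	size_t i;

	for (i = 0; i < 4; i++)
		out[i] = (char)((tag >> (8 * i)) & 0xff);
	out[4] = '\0';
}
```

Decoding 0x61706d73 this way yields "smpa", 0x616f646c yields "ldoa", and
the three-character key 0x636676 yields "vfc".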

[PATCH v5 4/8] soc: qcom: rpmpd: Add support for get/set performance state

2018-12-03 Thread Rajendra Nayak
Add support for the .set_performance_state() and .opp_to_performance_state()
callbacks in the rpmpd driver.

Signed-off-by: Rajendra Nayak 
Signed-off-by: Viresh Kumar 
Reviewed-by: Ulf Hansson 
---
 drivers/soc/qcom/rpmpd.c | 46 
 1 file changed, 46 insertions(+)

diff --git a/drivers/soc/qcom/rpmpd.c b/drivers/soc/qcom/rpmpd.c
index a0b9f122d793..eb1cfa6a03d6 100644
--- a/drivers/soc/qcom/rpmpd.c
+++ b/drivers/soc/qcom/rpmpd.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -28,6 +29,8 @@
 #define KEY_ENABLE 0x6e657773 /* swen */
 #define KEY_FLOOR_CORNER   0x636676   /* vfc */
 
+#define MAX_RPMPD_STATE 6
+
 #define DEFINE_RPMPD_CORN_SMPA(_platform, _name, _active, r_id) \
static struct rpmpd _platform##_##_active;  \
static struct rpmpd _platform##_##_name = { \
@@ -221,6 +224,47 @@ static int rpmpd_power_off(struct generic_pm_domain *domain)
return ret;
 }
 
+static int rpmpd_set_performance(struct generic_pm_domain *domain,
+unsigned int state)
+{
+   int ret = 0;
+   struct rpmpd *pd = domain_to_rpmpd(domain);
+
+   mutex_lock(_lock);
+
+   if (state > MAX_RPMPD_STATE)
+   goto out;
+
+   pd->corner = state;
+
+   if (!pd->enabled && (pd->key != KEY_FLOOR_CORNER))
+   goto out;
+
+   ret = rpmpd_aggregate_corner(pd);
+
+out:
+   mutex_unlock(_lock);
+
+   return ret;
+}
+
+static unsigned int rpmpd_get_performance(struct generic_pm_domain *genpd,
+ struct dev_pm_opp *opp)
+{
+   struct device_node *np;
+   unsigned int corner = 0;
+
+   np = dev_pm_opp_get_of_node(opp);
+   if (of_property_read_u32(np, "qcom,level", &corner)) {
+   pr_err("%s: missing 'qcom,level' property\n", __func__);
+   return 0;
+   }
+
+   of_node_put(np);
+
+   return corner;
+}
+
 static int rpmpd_probe(struct platform_device *pdev)
 {
int i;
@@ -261,6 +305,8 @@ static int rpmpd_probe(struct platform_device *pdev)
rpmpds[i]->rpm = rpm;
rpmpds[i]->pd.power_off = rpmpd_power_off;
rpmpds[i]->pd.power_on = rpmpd_power_on;
+   rpmpds[i]->pd.set_performance_state = rpmpd_set_performance;
+   rpmpds[i]->pd.opp_to_performance_state = rpmpd_get_performance;
pm_genpd_init(&rpmpds[i]->pd, NULL, true);
 
data->domains[i] = &rpmpds[i]->pd;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH v5 0/8] Add power domain driver for corners on msm8996/sdm845

2018-12-03 Thread Rajendra Nayak
Hi Rob,

This series is mainly pending your review/ack for the DT parts.
Rest of the genpd parts are reviewed and acked by both Viresh
and Ulf.

Changes in v5:
* First 6 patches are unchanged
* Patch 7/8 adds the DT node for rpmh power-controller on sdm845 and the
corresponding OPP tables for it to describe the performance states
* Patch 8/8 adds a parent/child relationship across mx/cx and mx_ao/cx_ao
as needed on sdm845 platform. This patch is dependent on the series from
Viresh [1] which adds support to propagate performance states across the
power domain hierarchy which is still being reviewed

Changes in v4:
* Included the patch to add qcom-opp bindings (dropped accidentally in v3)
* Merged the patches to add bindings for rpm and rpmh, added consumer binding example
* Made the drivers built in, removed .remove
* Added better description in changelog for PATCH 6/6
* Updated rpmhpd_aggregate_corner() based on David's feedback
* rpmhpd_set_performance_state() returns max corner, in cases where it's called
with an INT_MAX
* Dropped the patch to max vote on all corners at init, the patch did not
work anyway, and it shouldn't be needed now

Changes in v3:
* Bindings split into separate patches
* Bindings updated to remove duplicate OPP table phandles
* DT headers defining macros for Power domain indexes and OPP levels
* Optimisations to use rpmh_write_async() wherever applicable
* Fixed up handling of ACTIVE_ONLY/WAKE_ONLY/SLEEP voting for RPMh
* Fixed the vlvl to hlvl conversions in set_performance
* Other minor fixes based on review of v2
* TODO: This series does not handle the case where all VDD_MX votes
should be higher than VDD_CX from APPs, as pointed out
by David Collins in v2. This needs support at genpd to propagate performance
state up the parents, if we model these as Parent/Child to handle the
interdependency.

Changes in v2:
* added a power domain driver for sdm845 which supports communicating to RPMh
* dropped the changes to sdhc driver to move over to using OPP
as there is active discussion on using OPP as the interface vs
handling all of it in clock drivers
* Other minor binding updates based on review of v1

With performance state support for genpd/OPP merged, this is an effort
to model a power domain driver to communicate corner/level
values for qualcomm platforms to RPM (Remote Power Manager) and RPMh.

[1] https://lkml.org/lkml/2018/11/26/333

Rajendra Nayak (8):
  dt-bindings: opp: Introduce qcom-opp bindings
  dt-bindings: power: Add qcom rpm power domain driver bindings
  soc: qcom: rpmpd: Add a Power domain driver to model corners
  soc: qcom: rpmpd: Add support for get/set performance state
  arm64: dts: msm8996: Add rpmpd device node
  soc: qcom: rpmhpd: Add RPMh Power domain driver
  arm64: dts: sdm845: Add rpmh powercontroller node
  soc: qcom: rpmhpd: Mark mx as a parent for cx

 .../devicetree/bindings/opp/qcom-opp.txt  |  25 +
 .../devicetree/bindings/power/qcom,rpmpd.txt  | 146 ++
 arch/arm64/boot/dts/qcom/msm8996.dtsi |  34 ++
 arch/arm64/boot/dts/qcom/sdm845.dtsi  |  51 ++
 drivers/soc/qcom/Kconfig  |  18 +
 drivers/soc/qcom/Makefile |   2 +
 drivers/soc/qcom/rpmhpd.c | 442 ++
 drivers/soc/qcom/rpmpd.c  | 340 ++
 include/dt-bindings/power/qcom-rpmpd.h|  39 ++
 9 files changed, 1097 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/opp/qcom-opp.txt
 create mode 100644 Documentation/devicetree/bindings/power/qcom,rpmpd.txt
 create mode 100644 drivers/soc/qcom/rpmhpd.c
 create mode 100644 drivers/soc/qcom/rpmpd.c
 create mode 100644 include/dt-bindings/power/qcom-rpmpd.h

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH] jbd2: clean up indentation issue, replace spaces with tab

2018-12-03 Thread Theodore Y. Ts'o
On Mon, Nov 26, 2018 at 02:56:32PM +0100, Jan Kara wrote:
> On Fri 23-11-18 16:40:53, Colin King wrote:
> > From: Colin Ian King 
> > 
> > There is a statement that is indented with spaces, replace it with
> > a tab.
> > 
> > Signed-off-by: Colin Ian King 
> 
> Looks good. You can add:
> 
> Reviewed-by: Jan Kara 

Thanks, applied.

- Ted

