date:20190623

RE: [PATCH] usb: dwc3: Enable the USB snooping

2019-06-23 Thread Felipe Balbi



Hi,

Ran Wang  writes:
>> >> > >> >> >  /* Global Debug Queue/FIFO Space Available Register */
>> >> > >> >> >  #define DWC3_GDBGFIFOSPACE_NUM(n)  ((n) & 0x1f)
>> >> > >> >> >  #define DWC3_GDBGFIFOSPACE_TYPE(n) (((n) << 5) & 0x1e0)
>> >> > >> >> > @@ -859,6 +867,7 @@ struct dwc3_scratchpad_array {
>> >> > >> >> >   * 3   - Reserved
>> >> > >> >> >   * @imod_interval: set the interrupt moderation interval in 250ns
>> >> > >> >> >   * increments or 0 to disable.
>> >> > >> >> > + * @dma_coherent: set if enable dma-coherent.
>> >> > >> >>
>> >> > >> >> you're not enabling dma coherency, you're enabling cache snooping.
>> >> > >> >> And this property should describe that. Also, keep in mind
>> >> > >> >> that different devices may want different cache types for
>> >> > >> >> each of those fields, so your property would have to be a lot
>> >> > >> >> more complex. Something like:
>> >> > >> >>
>> >> > >> >>   snps,cache-type = , , ...
>> >> > >> >>
>> >> > >> >> Then driver would have to parse this properly to setup GSBUSCFG0.
>> >> > >
>> >> > > According to the DesignWare Cores SuperSpeed USB 3.0 Controller
>> >> > > Databook (v2.60a), it has described Type Bit Assignments for all
>> >> > > supported master bus type:
>> >> > > AHB, AXI3, AXI4 and Native. I found the bit definition are
>> >> > > different among them.
>> >> > > So, for the example you gave above, feel a little bit confused.
>> >> > > Did you mean:
>> >> > > snps,cache-type = , > >> > > "cacheable">, , 
>> >> >
>> >> > yeah, something like that.
>> >>
>> >> I think DATA_RD  should be a macro, right? So, where I can put its define?
>> >> Create a dwc3.h in include/dt-bindings/usb/ ?
>> >
>> > Could you please give me some advice here? I'd like to prepare next
>> > version patch after getting this settled.
>> >
>> >> Another question about this remain open is: DWC3 data book's Table
>> >> 6-5 Cache Type Bit Assignments show that bits definition will differ
>> >> per MBUS_TYPEs as below:
>> >> 
>> >>  MBUS_TYPE | bit[3]         | bit[2]        | bit[1]      | bit[0]
>> >>  ----------+----------------+---------------+-------------+------------
>> >>  AHB       | Cacheable      | Bufferable    | Privileged  | Data
>> >>  AXI3      | Write Allocate | Read Allocate | Cacheable   | Bufferable
>> >>  AXI4      | Allocate Other | Allocate      | Modifiable  | Bufferable
>> >>  AXI4      | Other Allocate | Allocate      | Modifiable  | Bufferable
>> >>  Native    | Same as AXI    | Same as AXI   | Same as AXI | Same as AXI
>> >>  
>> >>  Note: The AHB, AXI3, AXI4, and PCIe busses use different names for
>> >> certain  signals, which have the same meaning:
>> >>Bufferable = Posted
>> >>Cacheable = Modifiable = Snoop (negation of No Snoop)
>> >>
>> >> For Layerscape SoCs, MBUS_TYPE is AXI3. So I am not sure how to use
>> >> snps,cache-type = , to cover all MBUS_TYPE?
>> >> (you can notice that AHB and AXI3's cacheable are on different bit)
>> >> Or I just need to handle AXI3 case?
>> >
>> > Also on this open. Thank you in advance.
>> 
>> You could pass two strings and let the driver process them. Something
>> like:
>> 
>>  snps,cache_type = <"data_wr" "write allocate">, <"desc_rd" "cacheable">...
>> 
>> And so on. The only thing missing is for the mbus_type to be known by the
>> driver. Is that something we can figure out on any of the HWPARAMS
>> registers or does it have to be told explicitly?
>
> I have checked Layerscape Reference manual, HWPARAMS0~8 doesn't contain
> mbus_type info, and I didn't know where have declared it explicitly.
>
>> Another option would be to pass a string followed by one hex digit for the 
>> bits:
>> 
>>  snps,cache_type = <"data_wr" 0x8>, <"desc_rd" 0x2>...;
>> 
>> Then we don't need to describe mbus_type since the bits are what matters.
>
> Yes, it's also what we prefer to use, it will be more flexible, I can add
> above Table 6-5 Cache Type Bit Assignments in binding to help user decide
> which value they would use.
>
> I would submit another version of patch for further review, thank you very
> much.

cool, thanks

-- 
balbi
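
The string-plus-hex-digit binding discussed above could be consumed roughly as follows. This is a minimal userspace sketch in C, not the driver code: the field names come from the thread, while the helper set_cache_type(), the table cache_fields and the shift values (four 4-bit fields in the upper half of a 32-bit register) are assumptions made for illustration and should be checked against the databook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layout: one 4-bit cache-type field per transfer type.
 * The shift values are assumptions for this sketch only.
 */
struct cache_field {
	const char *name;
	unsigned int shift;
};

static const struct cache_field cache_fields[] = {
	{ "data_rd", 28 },
	{ "desc_rd", 24 },
	{ "data_wr", 20 },
	{ "desc_wr", 16 },
};

/*
 * Replace the 4-bit field selected by name with the raw bits.
 * Returns 0 on success, -1 if the name is not known.
 */
static int set_cache_type(uint32_t *reg, const char *name, uint32_t bits)
{
	size_t i;

	assert(bits <= 0xf);	/* the binding passes one hex digit per field */

	for (i = 0; i < sizeof(cache_fields) / sizeof(cache_fields[0]); i++) {
		if (strcmp(cache_fields[i].name, name) == 0) {
			*reg &= ~((uint32_t)0xf << cache_fields[i].shift);
			*reg |= bits << cache_fields[i].shift;
			return 0;
		}
	}
	return -1;
}
```

With this shape, an entry such as <"data_wr" 0x8> maps to set_cache_type(&reg, "data_wr", 0x8), and the code never needs to know the MBUS_TYPE because the device tree already carries the raw bits.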

Re: [PATCH v2 11/11] arm64: dts: sc9860: Update coresight DT bindings

2019-06-23 Thread Chunyan Zhang

Hi Leo,

Applied patches 10/11 and 11/11 to my tree, thanks!

Chunyan

On Wed, 8 May 2019 at 10:21, Leo Yan  wrote:
>
> CoreSight DT bindings have been updated, thus the old compatible strings
> are obsolete and the drivers will report warning if DTS uses these
> obsolete strings.
>
> This patch switches to the new bindings for CoreSight dynamic funnel,
> so can dismiss warning during initialisation.
>
> Cc: Chunyan Zhang 
> Cc: Orson Zhai 
> Cc: Mathieu Poirier 
> Cc: Suzuki K Poulose 
> Signed-off-by: Leo Yan 
> Acked-by: Chunyan Zhang 
> ---
>  arch/arm64/boot/dts/sprd/sc9860.dtsi | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/sprd/sc9860.dtsi b/arch/arm64/boot/dts/sprd/sc9860.dtsi
> index b25d19977170..e27eb3ed1d47 100644
> --- a/arch/arm64/boot/dts/sprd/sc9860.dtsi
> +++ b/arch/arm64/boot/dts/sprd/sc9860.dtsi
> @@ -300,7 +300,7 @@
> };
>
> funnel@10001000 { /* SoC Funnel */
> -   compatible = "arm,coresight-funnel", "arm,primecell";
> +   compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
> reg = <0 0x10001000 0 0x1000>;
> clocks = <_26m>;
> clock-names = "apb_pclk";
> @@ -367,7 +367,7 @@
> };
>
> funnel@11001000 { /* Cluster0 Funnel */
> -   compatible = "arm,coresight-funnel", "arm,primecell";
> +   compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
> reg = <0 0x11001000 0 0x1000>;
> clocks = <_26m>;
> clock-names = "apb_pclk";
> @@ -415,7 +415,7 @@
> };
>
> funnel@11002000 { /* Cluster1 Funnel */
> -   compatible = "arm,coresight-funnel", "arm,primecell";
> +   compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
> reg = <0 0x11002000 0 0x1000>;
> clocks = <_26m>;
> clock-names = "apb_pclk";
> @@ -513,7 +513,7 @@
> };
>
> funnel@11005000 { /* Main Funnel */
> -   compatible = "arm,coresight-funnel", "arm,primecell";
> +   compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
> reg = <0 0x11005000 0 0x1000>;
> clocks = <_26m>;
> clock-names = "apb_pclk";
> --
> 2.17.1
>

Re: [PATCH] mm/hugetlb: allow gigantic page allocation to migrate away smaller huge page

2019-06-23 Thread Pingfan Liu

On Mon, Jun 24, 2019 at 1:03 PM Ira Weiny  wrote:
>
> On Mon, Jun 24, 2019 at 12:21:08PM +0800, Pingfan Liu wrote:
> > The current pfn_range_valid_gigantic() rejects the pud huge page allocation
> > if there is a pmd huge page inside the candidate range.
> >
> > But pud huge resource is more rare, which should align on 1GB on x86. It is
> > worth to allow migrating away pmd huge page to make room for a pud huge
> > page.
> >
> > The same logic is applied to pgd and pud huge pages.
>
> I'm sorry but I don't quite understand why we should do this.  Is this a bug
> or an optimization?  It sounds like an optimization.
Yes, an optimization. It can help us successfully allocate a 1GB
hugetlb page if some 2MB hugetlb pages sit in the candidate range.
Allocating a 1GB hugetlb page has stricter requirements: it needs not
only a contiguous 1GB range, but one that is also aligned on a 1GB
boundary. Allocating a 2MB range is easier.
>
> >
> > Signed-off-by: Pingfan Liu 
> > Cc: Mike Kravetz 
> > Cc: Oscar Salvador 
> > Cc: David Hildenbrand 
> > Cc: Andrew Morton 
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  mm/hugetlb.c | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index ac843d3..02d1978 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
> >   unsigned long start_pfn, unsigned long nr_pages)
> >  {
> >   unsigned long i, end_pfn = start_pfn + nr_pages;
> > - struct page *page;
> > + struct page *page = pfn_to_page(start_pfn);
> > +
> > + if (PageHuge(page))
> > + if (compound_order(compound_head(page)) >= nr_pages)
>
> I don't think you want compound_order() here.
Yes, you are right.

Thanks,
  Pingfan
>
> Ira
>
> > + return false;
> >
> >   for (i = start_pfn; i < end_pfn; i++) {
> >   if (!pfn_valid(i))
> > @@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
> >   if (page_count(page) > 0)
> >   return false;
> >
> > - if (PageHuge(page))
> > - return false;
> >   }
> >
> >   return true;
> > --
> > 2.7.5
> >
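
One reading of the review comment on compound_order() is that an order is a log2 value while nr_pages is a page count, so the two cannot be compared directly. The sketch below is a userspace illustration under that assumption, together with the 1GB alignment requirement mentioned in the reply. The helper names are hypothetical, and the orders in the usage note (9 for 2MB, 18 for 1GB) assume 4KB base pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of base pages in a compound page of the given order. */
static unsigned long pages_in_order(unsigned int order)
{
	assert(order < sizeof(unsigned long) * 8);
	return 1UL << order;
}

/*
 * True if an existing huge page of 'order' is at least as large as the
 * requested range, in which case migrating it away cannot make room.
 */
static bool huge_page_covers_range(unsigned int order, unsigned long nr_pages)
{
	return pages_in_order(order) >= nr_pages;
}

/* A gigantic page must start on a boundary that is a multiple of its size. */
static bool pfn_range_aligned(unsigned long start_pfn, unsigned long nr_pages)
{
	/* nr_pages is expected to be a power of two */
	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);
	return (start_pfn & (nr_pages - 1)) == 0;
}
```

For a 1GB request, nr_pages is 262144. A 2MB page (order 9, 512 pages) does not cover the range and is a migration candidate, whereas comparing the raw order 9 or 18 against 262144 would never be true.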

[PATCH 07/12] xfs: don't preallocate a transaction for file size updates

2019-06-23 Thread Christoph Hellwig

We have historically decided that we want to preallocate the xfs_trans
structure at writeback time so that we don't have to allocate one in
the I/O completion handler.  But we treat unwritten extent and COW
fork conversions differently already, which proves that the transaction
allocations in the end I/O handler are not a problem.  Removing the
preallocation gets rid of a lot of corner case code, and also ensures
we only allocate and log a transaction when actually required,
as the ioend merging can reduce the number of actual i_size updates
significantly.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/xfs_aops.c | 110 +-
 fs/xfs/xfs_aops.h |   1 -
 2 files changed, 12 insertions(+), 99 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 633baaaff7ae..017b87b7765f 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -130,44 +130,23 @@ static inline bool xfs_ioend_is_append(struct xfs_ioend *ioend)
XFS_I(ioend->io_inode)->i_d.di_size;
 }
 
-STATIC int
-xfs_setfilesize_trans_alloc(
-   struct xfs_ioend*ioend)
-{
-   struct xfs_mount*mp = XFS_I(ioend->io_inode)->i_mount;
-   struct xfs_trans*tp;
-   int error;
-
-   error = xfs_trans_alloc(mp, _RES(mp)->tr_fsyncts, 0, 0, 0, );
-   if (error)
-   return error;
-
-   ioend->io_append_trans = tp;
-
-   /*
-* We may pass freeze protection with a transaction.  So tell lockdep
-* we released it.
-*/
-   __sb_writers_release(ioend->io_inode->i_sb, SB_FREEZE_FS);
-   /*
-* We hand off the transaction to the completion thread now, so
-* clear the flag here.
-*/
-   current_restore_flags_nested(>t_pflags, PF_MEMALLOC_NOFS);
-   return 0;
-}
-
 /*
  * Update on-disk file size now that data has been written to disk.
  */
-STATIC int
-__xfs_setfilesize(
+int
+xfs_setfilesize(
struct xfs_inode*ip,
-   struct xfs_trans*tp,
xfs_off_t   offset,
size_t  size)
 {
+   struct xfs_mount*mp = ip->i_mount;
+   struct xfs_trans*tp;
xfs_fsize_t isize;
+   int error;
+
+   error = xfs_trans_alloc(mp, _RES(mp)->tr_fsyncts, 0, 0, 0, );
+   if (error)
+   return error;
 
xfs_ilock(ip, XFS_ILOCK_EXCL);
isize = xfs_new_eof(ip, offset + size);
@@ -186,48 +165,6 @@ __xfs_setfilesize(
return xfs_trans_commit(tp);
 }
 
-int
-xfs_setfilesize(
-   struct xfs_inode*ip,
-   xfs_off_t   offset,
-   size_t  size)
-{
-   struct xfs_mount*mp = ip->i_mount;
-   struct xfs_trans*tp;
-   int error;
-
-   error = xfs_trans_alloc(mp, _RES(mp)->tr_fsyncts, 0, 0, 0, );
-   if (error)
-   return error;
-
-   return __xfs_setfilesize(ip, tp, offset, size);
-}
-
-STATIC int
-xfs_setfilesize_ioend(
-   struct xfs_ioend*ioend,
-   int error)
-{
-   struct xfs_inode*ip = XFS_I(ioend->io_inode);
-   struct xfs_trans*tp = ioend->io_append_trans;
-
-   /*
-* The transaction may have been allocated in the I/O submission thread,
-* thus we need to mark ourselves as being in a transaction manually.
-* Similarly for freeze protection.
-*/
-   current_set_flags_nested(>t_pflags, PF_MEMALLOC_NOFS);
-   __sb_writers_acquired(VFS_I(ip)->i_sb, SB_FREEZE_FS);
-
-   /* we abort the update if there was an IO error */
-   if (error) {
-   xfs_trans_cancel(tp);
-   return error;
-   }
-
-   return __xfs_setfilesize(ip, tp, ioend->io_offset, ioend->io_size);
-}
-
 /*
  * IO write completion.
  */
@@ -267,12 +204,9 @@ xfs_end_ioend(
error = xfs_reflink_end_cow(ip, offset, size);
else if (ioend->io_type == IOMAP_UNWRITTEN)
error = xfs_iomap_write_unwritten(ip, offset, size, false);
-   else
-   ASSERT(!xfs_ioend_is_append(ioend) || ioend->io_append_trans);
-
+   if (!error && xfs_ioend_is_append(ioend))
+   error = xfs_setfilesize(ip, offset, size);
 done:
-   if (ioend->io_append_trans)
-   error = xfs_setfilesize_ioend(ioend, error);
list_replace_init(>io_list, _list);
xfs_destroy_ioend(ioend, error);
 
@@ -307,8 +241,6 @@ xfs_ioend_can_merge(
return false;
if (ioend->io_offset + ioend->io_size != next->io_offset)
return false;
-   if (xfs_ioend_is_append(ioend) != xfs_ioend_is_append(next))
-   return false;
return true;
 }
 
@@ -320,7 +252,6 @@ xfs_ioend_try_merge(
 {
struct xfs_ioend*next_ioend;
int ioend_error;
-   int

[PATCH 11/12] iomap: move the xfs writeback code to iomap.c

2019-06-23 Thread Christoph Hellwig

Take the xfs writeback code and move it to iomap.c.  A new structure
with three methods is added as the abstraction from the generic
writeback code to the file system.  These methods are used to map
blocks, submit an ioend, and cancel a page that encountered an error
before it was added to an ioend.

Note that we temporarily lose the writepage tracing, but that will
be added back soon.

Signed-off-by: Christoph Hellwig 
---
 fs/iomap.c| 521 -
 fs/xfs/xfs_aops.c | 584 --
 fs/xfs/xfs_aops.h |  16 --
 fs/xfs/xfs_super.c|  11 +-
 include/linux/iomap.h |  41 +++
 5 files changed, 605 insertions(+), 568 deletions(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index 23ef63fd1669..72a1b622e634 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
  * Copyright (C) 2010 Red Hat, Inc.
- * Copyright (c) 2016-2018 Christoph Hellwig.
+ * Copyright (c) 2016-2019 Christoph Hellwig.
  */
 #include 
 #include 
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -25,6 +26,8 @@
 
 #include "internal.h"
 
+static struct bio_set iomap_ioend_bioset;
+
 /*
  * Execute a iomap write on a segment of the mapping that spans a
  * contiguous range of pages that have identical block mapping state.
@@ -2192,3 +2195,519 @@ iomap_bmap(struct address_space *mapping, sector_t bno,
return bno;
 }
 EXPORT_SYMBOL_GPL(iomap_bmap);
+
+static void
+iomap_finish_page_writeback(struct inode *inode, struct bio_vec *bvec,
+   int error)
+{
+   struct iomap_page *iop = to_iomap_page(bvec->bv_page);
+
+   if (error) {
+   SetPageError(bvec->bv_page);
+   mapping_set_error(inode->i_mapping, -EIO);
+   }
+
+   WARN_ON_ONCE(i_blocksize(inode) < PAGE_SIZE && !iop);
+   WARN_ON_ONCE(iop && atomic_read(>write_count) <= 0);
+
+   if (!iop || atomic_dec_and_test(>write_count))
+   end_page_writeback(bvec->bv_page);
+}
+
+/*
+ * We're now finished for good with this ioend structure.  Update the page
+ * state, release holds on bios, and finally free up memory.  Do not use the
+ * ioend after this.
+ */
+void
+iomap_finish_ioend(struct iomap_ioend *ioend, int error)
+{
+   struct inode *inode = ioend->io_inode;
+   struct bio *bio = >io_inline_bio;
+   struct bio *last = ioend->io_bio, *next;
+   u64 start = bio->bi_iter.bi_sector;
+   bool quiet = bio_flagged(bio, BIO_QUIET);
+
+   for (bio = >io_inline_bio; bio; bio = next) {
+   struct bio_vec  *bvec;
+   struct bvec_iter_all iter_all;
+
+   /*
+* For the last bio, bi_private points to the ioend, so we
+* need to explicitly end the iteration here.
+*/
+   if (bio == last)
+   next = NULL;
+   else
+   next = bio->bi_private;
+
+   /* walk each page on bio, ending page IO on them */
+   bio_for_each_segment_all(bvec, bio, iter_all)
+   iomap_finish_page_writeback(inode, bvec, error);
+   bio_put(bio);
+   }
+
+   if (unlikely(error && !quiet)) {
+   printk_ratelimited(KERN_ERR
+   "%s: writeback error on sector %llu",
+   inode->i_sb->s_id, start);
+   }
+}
+EXPORT_SYMBOL_GPL(iomap_finish_ioend);
+
+void
+iomap_finish_ioends(struct iomap_ioend *ioend, int error)
+{
+   struct list_head tmp;
+
+   list_replace_init(>io_list, );
+   iomap_finish_ioend(ioend, error);
+   while ((ioend = list_pop(, struct iomap_ioend, io_list)))
+   iomap_finish_ioend(ioend, error);
+}
+EXPORT_SYMBOL_GPL(iomap_finish_ioends);
+
+/*
+ * We can merge two adjacent ioends if they have the same set of work to do.
+ */
+static bool
+iomap_ioend_can_merge(struct iomap_ioend *ioend, struct iomap_ioend *next)
+{
+   if (ioend->io_bio->bi_status != next->io_bio->bi_status)
+   return false;
+   if ((ioend->io_flags & IOMAP_F_SHARED) ^
+   (next->io_flags & IOMAP_F_SHARED))
+   return false;
+   if ((ioend->io_type == IOMAP_UNWRITTEN) ^
+   (next->io_type == IOMAP_UNWRITTEN))
+   return false;
+   if (ioend->io_offset + ioend->io_size != next->io_offset)
+   return false;
+   return true;
+}
+
+void
+iomap_ioend_try_merge(struct iomap_ioend *ioend, struct list_head *more_ioends)
+{
+   struct iomap_ioend *next;
+
+   INIT_LIST_HEAD(>io_list);
+
+   while ((next = list_first_entry_or_null(more_ioends, struct iomap_ioend,
+   io_list))) {
+   if (!iomap_ioend_can_merge(ioend, next))
+   break;
+   list_move_tail(>io_list, >io_list);
+   ioend->io_size += next->io_size;
+   }
+}
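
The merge rule in iomap_ioend_can_merge() and the size accumulation in iomap_ioend_try_merge() can be modelled in userspace as below. This is a simplified sketch, not the kernel code: struct model_ioend and both helpers are hypothetical stand-ins, the list is replaced by an array, and only the fields that the merge test reads are kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct iomap_ioend. */
struct model_ioend {
	long long offset;	/* byte offset of the first byte covered */
	long long size;		/* number of bytes covered */
	bool shared;		/* models the IOMAP_F_SHARED flag */
	bool unwritten;		/* models io_type == IOMAP_UNWRITTEN */
	int status;		/* models io_bio->bi_status */
};

/* Two ioends merge only if they need the same completion work and touch. */
static bool model_can_merge(const struct model_ioend *a,
			    const struct model_ioend *b)
{
	if (a->status != b->status)
		return false;
	if (a->shared != b->shared)
		return false;
	if (a->unwritten != b->unwritten)
		return false;
	return a->offset + a->size == b->offset;
}

/*
 * Fold as many leading entries of next[] into head as the rule allows.
 * Returns the number of entries that were merged.
 */
static size_t model_try_merge(struct model_ioend *head,
			      const struct model_ioend *next, size_t n)
{
	size_t merged = 0;

	assert(head != NULL && (next != NULL || n == 0));

	while (merged < n && model_can_merge(head, &next[merged])) {
		head->size += next[merged].size;
		merged++;
	}
	return merged;
}
```

In this model, three contiguous ioends of which the last is shared collapse into one ioend covering the first two, so a single completion handles the combined range and the third is processed on its own.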

Re: [RFC PATCH] arm64: dts: fsl: wandboard: Add a device tree for the PICO-PI-IMX8M

2019-06-23 Thread Matti Vaittinen

Hello Richard,

Nice to see you upstreaming this! Thumbs up!

Just a few remarks on the pmic node from me:

On Thu, Jun 20, 2019 at 04:32:52PM +0300, Andra Danciu wrote:
> From: Richard Hu 
> 
> The current level of support yields a working console and is able to boot
> userspace from an initial ramdisk copied via u-boot in RAM.
> 
> Additional subsystems that are active :
>   - Ethernet
>   - USB
> 
> Cc: Daniel Baluta 
> Signed-off-by: Richard Hu 
> Signed-off-by: Andra Danciu 
> ---
>  I am using pico-pi-8mxm board to work on my project for Google Summer of Code.
>  This is based on patches from https://github.com/wandboard-org.
> 
>  arch/arm64/boot/dts/freescale/Makefile   |   1 +
>  arch/arm64/boot/dts/freescale/wand-pi-8m.dts | 590 +++
>  2 files changed, 591 insertions(+)
>  create mode 100644 arch/arm64/boot/dts/freescale/wand-pi-8m.dts
> 
> diff --git a/arch/arm64/boot/dts/freescale/Makefile b/arch/arm64/boot/dts/freescale/Makefile
> index 984554343c83..5904d6a8a033 100644
> --- a/arch/arm64/boot/dts/freescale/Makefile
> +++ b/arch/arm64/boot/dts/freescale/Makefile
> @@ -23,3 +23,4 @@ dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-lx2160a-rdb.dtb
>  dtb-$(CONFIG_ARCH_MXC) += imx8mm-evk.dtb
>  dtb-$(CONFIG_ARCH_MXC) += imx8mq-evk.dtb
>  dtb-$(CONFIG_ARCH_MXC) += imx8qxp-mek.dtb
> +dtb-$(CONFIG_ARCH_MXC) += wand-pi-8m.dtb
> diff --git a/arch/arm64/boot/dts/freescale/wand-pi-8m.dts b/arch/arm64/boot/dts/freescale/wand-pi-8m.dts
> new file mode 100644
> index ..9f7121014722
> --- /dev/null
> +++ b/arch/arm64/boot/dts/freescale/wand-pi-8m.dts
> @@ -0,0 +1,590 @@

// snip

> +
> + {
> + clock-frequency = <10>;
> + pinctrl-names = "default";
> + pinctrl-0 = <_i2c1>;
> + status = "okay";
> +
> + typec_tusb320:tusb320@47 {
> + compatible = "ti,tusb320";
> + pinctrl-names = "default";
> + pinctrl-0 = <_tusb320_irq _typec_ss_sel>;
> + reg = <0x47>;
> + vbus-supply = <_usb_otg_vbus>;
> + ss-sel-gpios = < 5 GPIO_ACTIVE_HIGH>;
> + tusb320,int-gpio = < 6 GPIO_ACTIVE_LOW>;
> + tusb320,select-mode = <0>;
> + tusb320,dfp-power = <0>;
> + };
> +
> + pmic: bd71837@4b {

I was once told the node names should be generic :] So, I'd suggest
using "pmic@4b".

> + reg = <0x4b>;
> + compatible = "rohm,bd71837";
> + /* PMIC BD71837 PMIC_nINT GPIO1_IO12 */
> + pinctrl-0 = <_pmic>;
> + gpio_intr = < 3 GPIO_ACTIVE_LOW>;
> +
> + bd71837,pmic-buck1-uses-i2c-dvs;
> + bd71837,pmic-buck1-dvs-voltage = <90>, <85>, <80>; /* VDD_SOC: Run-Idle-Suspend */
> + bd71837,pmic-buck2-uses-i2c-dvs;
> + bd71837,pmic-buck2-dvs-voltage = <100>, <90>, <0>; /* VDD_ARM: Run-Idle */
> + bd71837,pmic-buck3-uses-i2c-dvs;
> + bd71837,pmic-buck3-dvs-voltage = <100>, <0>, <0>; /* VDD_GPU: Run */
> + bd71837,pmic-buck4-uses-i2c-dvs;
> + bd71837,pmic-buck4-dvs-voltage = <100>, <0>, <0>; /* VDD_VPU: Run */

These entries should be replaced by proper properties for run-level voltage
configuration. Please see the
Documentation/devicetree/bindings/mfd/rohm,bd71837-pmic.txt and
Documentation/devicetree/bindings/regulator/rohm,bd71837-regulator.txt.

I think you wish to use rohm,dvs-run-voltage, rohm,dvs-idle-voltage,
and rohm,dvs-suspend-voltage instead.

Furthermore, I see you are not specifying rohm,reset-snvs-powered.
I wonder if it is intentional not to use SNVS as the reset target. Since
you use i.MX8, and since you used those unsupported run-level configuration
properties, which were present only in a very early proprietary driver
draft, I expect this may not be intentional. I think that early driver
defaulted to SNVS, while it also failed to provide any regulator
enable/disable control.

> +
> + gpo {
> + rohm,drv = <0x0C>;  /* 0b_1100 all gpos with cmos output mode */
> + };

What is this?

> +
> + regulators {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + buck1_reg: regulator@0 {

I don't think the node names are correct. As far as I know the regulator
core uses node names - please see the valid names from documentation.

> + reg = <0>;
> + regulator-compatible = "buck1";
I think you shouldn't use regulator-compatible. On the other hand, I
think you should use regulator-name.
> + regulator-min-microvolt = <70>;
> + regulator-max-microvolt = <130>;
> + regulator-boot-on;
> + regulator-always-on;
> + regulator-ramp-delay = <1250>;
> +

[PATCH 03/12] xfs: fix a comment typo in xfs_submit_ioend

2019-06-23 Thread Christoph Hellwig

The fail argument is long gone, update the comment.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/xfs_aops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 9cceb90e77c5..dc60aec0c5a7 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -626,7 +626,7 @@ xfs_map_blocks(
  * reference to the ioend to ensure that the ioend completion is only done once
  * all bios have been submitted and the ioend is really done.
  *
- * If @fail is non-zero, it means that we have a situation where some part of
+ * If @status is non-zero, it means that we have a situation where some part of
  * the submission process has failed after we have marked paged for writeback
  * and unlocked them. In this situation, we need to fail the bio and ioend
  * rather than submit it to IO. This typically only happens on a filesystem
-- 
2.20.1

[PATCH 06/12] xfs: remove XFS_TRANS_NOFS

2019-06-23 Thread Christoph Hellwig

Instead of a magic flag for xfs_trans_alloc, just ensure all callers
that can't reclaim through the file system use memalloc_nofs_save to
set the per-task nofs flag.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/libxfs/xfs_shared.h |  1 -
 fs/xfs/xfs_aops.c  | 12 +---
 fs/xfs/xfs_file.c  | 12 +---
 fs/xfs/xfs_iomap.c |  2 +-
 fs/xfs/xfs_reflink.c   |  4 ++--
 fs/xfs/xfs_trans.c |  4 +---
 6 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index 4e909791aeac..1f2b5a0c71b4 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -65,7 +65,6 @@ void  xfs_log_get_max_trans_res(struct xfs_mount *mp,
 #define XFS_TRANS_DQ_DIRTY 0x10/* at least one dquot in trx dirty */
 #define XFS_TRANS_RESERVE  0x20/* OK to use reserved data blocks */
 #define XFS_TRANS_NO_WRITECOUNT 0x40   /* do not elevate SB writecount */
-#define XFS_TRANS_NOFS 0x80/* pass KM_NOFS to kmem_alloc */
 /*
  * LOWMODE is used by the allocator to activate the lowspace algorithm - when
  * free space is running low the extent allocator may choose to allocate an
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 93a760f13017..633baaaff7ae 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -138,8 +138,7 @@ xfs_setfilesize_trans_alloc(
struct xfs_trans*tp;
int error;
 
-   error = xfs_trans_alloc(mp, _RES(mp)->tr_fsyncts, 0, 0,
-   XFS_TRANS_NOFS, );
+   error = xfs_trans_alloc(mp, _RES(mp)->tr_fsyncts, 0, 0, 0, );
if (error)
return error;
 
@@ -236,6 +235,7 @@ STATIC void
 xfs_end_ioend(
struct xfs_ioend*ioend)
 {
+   unsigned intnofs_flag = memalloc_nofs_save();
struct list_headioend_list;
struct xfs_inode*ip = XFS_I(ioend->io_inode);
xfs_off_t   offset = ioend->io_offset;
@@ -282,6 +282,8 @@ xfs_end_ioend(
list_del_init(>io_list);
xfs_destroy_ioend(ioend, error);
}
+
+   memalloc_nofs_restore(nofs_flag);
 }
 
 /*
@@ -663,8 +665,12 @@ xfs_submit_ioend(
(ioend->io_fork == XFS_COW_FORK ||
 ioend->io_type != IOMAP_UNWRITTEN) &&
xfs_ioend_is_append(ioend) &&
-   !ioend->io_append_trans)
+   !ioend->io_append_trans) {
+   unsigned nofs_flag = memalloc_nofs_save();
+
status = xfs_setfilesize_trans_alloc(ioend);
+   memalloc_nofs_restore(nofs_flag);
+   }
 
ioend->io_bio->bi_private = ioend;
ioend->io_bio->bi_end_io = xfs_end_bio;
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 916a35cae5e9..f2d806ef8f06 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -379,6 +379,7 @@ xfs_dio_write_end_io(
struct inode*inode = file_inode(iocb->ki_filp);
struct xfs_inode*ip = XFS_I(inode);
loff_t  offset = iocb->ki_pos;
+   unsigned intnofs_flag;
int error = 0;
 
trace_xfs_end_io_direct_write(ip, offset, size);
@@ -395,10 +396,11 @@ xfs_dio_write_end_io(
 */
XFS_STATS_ADD(ip->i_mount, xs_write_bytes, size);
 
+   nofs_flag = memalloc_nofs_save();
if (flags & IOMAP_DIO_COW) {
error = xfs_reflink_end_cow(ip, offset, size);
if (error)
-   return error;
+   goto out;
}
 
/*
@@ -407,8 +409,10 @@ xfs_dio_write_end_io(
 * earlier allows a racing dio read to find unwritten extents before
 * they are converted.
 */
-   if (flags & IOMAP_DIO_UNWRITTEN)
-   return xfs_iomap_write_unwritten(ip, offset, size, true);
+   if (flags & IOMAP_DIO_UNWRITTEN) {
+   error = xfs_iomap_write_unwritten(ip, offset, size, true);
+   goto out;
+   }
 
/*
 * We need to update the in-core inode size here so that we don't end up
@@ -430,6 +434,8 @@ xfs_dio_write_end_io(
spin_unlock(>i_flags_lock);
}
 
+out:
+   memalloc_nofs_restore(nofs_flag);
return error;
 }
 
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 6b29452bfba0..461ea023b910 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -782,7 +782,7 @@ xfs_iomap_write_unwritten(
 * complete here and might deadlock on the iolock.
 */
error = xfs_trans_alloc(mp, _RES(mp)->tr_write, resblks, 0,
-   XFS_TRANS_RESERVE | XFS_TRANS_NOFS, );
+   XFS_TRANS_RESERVE, );
if (error)
return error;
 
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 680ae7662a78..0b23c2b29609 100644
--- a/fs/xfs/xfs_reflink.c
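
The save/restore pairing that replaces XFS_TRANS_NOFS in the hunks above follows a scoped-flag pattern, which can be modelled in userspace as below. This is only a sketch of the idea: task_flags stands in for the per-task flags word, MODEL_NOFS for the real flag bit, and model_nofs_save() and model_nofs_restore() are hypothetical names, not the kernel helpers.

```c
#include <assert.h>

#define MODEL_NOFS 0x1u		/* hypothetical stand-in for the per-task nofs bit */

static unsigned int task_flags;	/* stand-in for the per-task flags word */

/* Set the flag and return its previous state so the caller can restore it. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = task_flags & MODEL_NOFS;

	task_flags |= MODEL_NOFS;
	return old;
}

/* Put the flag back to exactly the state that model_nofs_save() returned. */
static void model_nofs_restore(unsigned int old)
{
	assert((old & ~MODEL_NOFS) == 0);
	task_flags = (task_flags & ~MODEL_NOFS) | old;
}
```

Because the saved value is handed back on restore, nested sections behave correctly: an inner restore leaves the flag set when an outer section is still active, and only the outermost restore clears it.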

[PATCH 05/12] xfs: use a struct iomap in xfs_writepage_ctx

2019-06-23 Thread Christoph Hellwig

In preparation for moving the XFS writeback code to fs/iomap.c, switch
it to use struct iomap instead of the XFS-specific struct xfs_bmbt_irec.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/libxfs/xfs_bmap.c | 14 +--
 fs/xfs/libxfs/xfs_bmap.h |  3 +-
 fs/xfs/xfs_aops.c| 80 +++-
 fs/xfs/xfs_aops.h|  2 +-
 4 files changed, 50 insertions(+), 49 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 4133bc461e3e..de35a0376156 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -39,6 +39,7 @@
 #include "xfs_ag_resv.h"
 #include "xfs_refcount.h"
 #include "xfs_icache.h"
+#include "xfs_iomap.h"
 
 
 kmem_zone_t*xfs_bmap_free_item_zone;
@@ -4457,16 +4458,21 @@ int
 xfs_bmapi_convert_delalloc(
struct xfs_inode*ip,
int whichfork,
-   xfs_fileoff_t   offset_fsb,
-   struct xfs_bmbt_irec*imap,
+   xfs_off_t   offset,
+   struct iomap*iomap,
unsigned int*seq)
 {
struct xfs_ifork*ifp = XFS_IFORK_PTR(ip, whichfork);
struct xfs_mount*mp = ip->i_mount;
+   xfs_fileoff_t   offset_fsb = XFS_B_TO_FSBT(mp, offset);
struct xfs_bmalloca bma = { NULL };
+   u16 flags = 0;
struct xfs_trans*tp;
int error;
 
+   if (whichfork == XFS_COW_FORK)
+   flags |= IOMAP_F_SHARED;
+
/*
 * Space for the extent and indirect blocks was reserved when the
 * delalloc extent was created so there's no need to do so here.
@@ -4496,7 +4502,7 @@ xfs_bmapi_convert_delalloc(
 * the extent.  Just return the real extent at this offset.
 */
if (!isnullstartblock(bma.got.br_startblock)) {
-   *imap = bma.got;
+   xfs_bmbt_to_iomap(ip, iomap, , flags);
*seq = READ_ONCE(ifp->if_seq);
goto out_trans_cancel;
}
@@ -4529,7 +4535,7 @@ xfs_bmapi_convert_delalloc(
XFS_STATS_INC(mp, xs_xstrat_quick);
 
ASSERT(!isnullstartblock(bma.got.br_startblock));
-   *imap = bma.got;
+   xfs_bmbt_to_iomap(ip, iomap, , flags);
*seq = READ_ONCE(ifp->if_seq);
 
if (whichfork == XFS_COW_FORK) {
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 8f597f9abdbe..3c3470f11648 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -220,8 +220,7 @@ int xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, int whichfork,
struct xfs_bmbt_irec *got, struct xfs_iext_cursor *cur,
int eof);
 intxfs_bmapi_convert_delalloc(struct xfs_inode *ip, int whichfork,
-   xfs_fileoff_t offset_fsb, struct xfs_bmbt_irec *imap,
-   unsigned int *seq);
+   xfs_off_t offset, struct iomap *iomap, unsigned int *seq);
 intxfs_bmap_add_extent_unwritten_real(struct xfs_trans *tp,
struct xfs_inode *ip, int whichfork,
struct xfs_iext_cursor *icur, struct xfs_btree_cur **curp,
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index dc60aec0c5a7..93a760f13017 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -27,7 +27,7 @@
  * structure owned by writepages passed to individual writepage calls
  */
 struct xfs_writepage_ctx {
-   struct xfs_bmbt_irecimap;
+   struct iomapiomap;
int fork;
unsigned intdata_seq;
unsigned intcow_seq;
@@ -265,7 +265,7 @@ xfs_end_ioend(
 */
if (ioend->io_fork == XFS_COW_FORK)
error = xfs_reflink_end_cow(ip, offset, size);
-   else if (ioend->io_state == XFS_EXT_UNWRITTEN)
+   else if (ioend->io_type == IOMAP_UNWRITTEN)
error = xfs_iomap_write_unwritten(ip, offset, size, false);
else
ASSERT(!xfs_ioend_is_append(ioend) || ioend->io_append_trans);
@@ -300,8 +300,8 @@ xfs_ioend_can_merge(
return false;
if ((ioend->io_fork == XFS_COW_FORK) ^ (next->io_fork == XFS_COW_FORK))
return false;
-   if ((ioend->io_state == XFS_EXT_UNWRITTEN) ^
-   (next->io_state == XFS_EXT_UNWRITTEN))
+   if ((ioend->io_type == IOMAP_UNWRITTEN) ^
+   (next->io_type == IOMAP_UNWRITTEN))
return false;
if (ioend->io_offset + ioend->io_size != next->io_offset)
return false;
@@ -395,7 +395,7 @@ xfs_end_bio(
unsigned long   flags;
 
if (ioend->io_fork == XFS_COW_FORK ||
-   ioend->io_state == XFS_EXT_UNWRITTEN ||
+   ioend->io_type == IOMAP_UNWRITTEN ||
ioend->io_append_trans != NULL) {
spin_lock_irqsave(>i_ioend_lock, flags);
if (list_empty(>i_ioend_list))
@@ -415,10 +415,10 @@ static bool
 xfs_imap_valid(

[PATCH 04/12] xfs: initialize iomap->flags in xfs_bmbt_to_iomap

2019-06-23 Thread Christoph Hellwig

Currently we don't overwrite the flags field in the iomap in
xfs_bmbt_to_iomap.  This works fine with 0-initialized iomaps on stack,
but is harmful once we want to be able to reuse an iomap in the
writeback code.  Replace the shared parameter with a set of initial
flags and thus ensure the flags field is always reinitialized.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/xfs_iomap.c | 28 +---
 fs/xfs/xfs_iomap.h |  2 +-
 fs/xfs/xfs_pnfs.c  |  2 +-
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 63d323916bba..6b29452bfba0 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -57,7 +57,7 @@ xfs_bmbt_to_iomap(
struct xfs_inode*ip,
struct iomap*iomap,
struct xfs_bmbt_irec*imap,
-   boolshared)
+   u16 flags)
 {
struct xfs_mount*mp = ip->i_mount;
 
@@ -82,12 +82,11 @@ xfs_bmbt_to_iomap(
iomap->length = XFS_FSB_TO_B(mp, imap->br_blockcount);
iomap->bdev = xfs_find_bdev_for_inode(VFS_I(ip));
iomap->dax_dev = xfs_find_daxdev_for_inode(VFS_I(ip));
+   iomap->flags = flags;
 
if (xfs_ipincount(ip) &&
(ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
iomap->flags |= IOMAP_F_DIRTY;
-   if (shared)
-   iomap->flags |= IOMAP_F_SHARED;
return 0;
 }
 
@@ -543,6 +542,7 @@ xfs_file_iomap_begin_delay(
struct xfs_iext_cursor  icur, ccur;
xfs_fsblock_t   prealloc_blocks = 0;
booleof = false, cow_eof = false, shared = false;
+   u16 iomap_flags = 0;
int whichfork = XFS_DATA_FORK;
int error = 0;
 
@@ -710,7 +710,7 @@ xfs_file_iomap_begin_delay(
 * Flag newly allocated delalloc blocks with IOMAP_F_NEW so we punch
 * them out if the write happens to fail.
 */
-   iomap->flags |= IOMAP_F_NEW;
+   iomap_flags |= IOMAP_F_NEW;
trace_xfs_iomap_alloc(ip, offset, count, whichfork,
whichfork == XFS_DATA_FORK ?  : );
 done:
@@ -718,14 +718,17 @@ xfs_file_iomap_begin_delay(
if (imap.br_startoff > offset_fsb) {
xfs_trim_extent(, offset_fsb,
imap.br_startoff - offset_fsb);
-   error = xfs_bmbt_to_iomap(ip, iomap, , true);
+   error = xfs_bmbt_to_iomap(ip, iomap, ,
+   IOMAP_F_SHARED);
goto out_unlock;
}
/* ensure we only report blocks we have a reservation for */
xfs_trim_extent(, cmap.br_startoff, cmap.br_blockcount);
shared = true;
}
-   error = xfs_bmbt_to_iomap(ip, iomap, , shared);
+   if (shared)
+   iomap_flags |= IOMAP_F_SHARED;
+   error = xfs_bmbt_to_iomap(ip, iomap, , iomap_flags);
 out_unlock:
xfs_iunlock(ip, XFS_ILOCK_EXCL);
return error;
@@ -933,6 +936,7 @@ xfs_file_iomap_begin(
xfs_fileoff_t   offset_fsb, end_fsb;
int nimaps = 1, error = 0;
boolshared = false;
+   u16 iomap_flags = 0;
unsignedlockmode;
 
if (XFS_FORCED_SHUTDOWN(mp))
@@ -1048,11 +1052,13 @@ xfs_file_iomap_begin(
if (error)
return error;
 
-   iomap->flags |= IOMAP_F_NEW;
+   iomap_flags |= IOMAP_F_NEW;
trace_xfs_iomap_alloc(ip, offset, length, XFS_DATA_FORK, );
 
 out_finish:
-   return xfs_bmbt_to_iomap(ip, iomap, , shared);
+   if (shared)
+   iomap_flags |= IOMAP_F_SHARED;
+   return xfs_bmbt_to_iomap(ip, iomap, , iomap_flags);
 
 out_found:
ASSERT(nimaps);
@@ -1196,7 +1202,7 @@ xfs_seek_iomap_begin(
if (data_fsb < cow_fsb + cmap.br_blockcount)
end_fsb = min(end_fsb, data_fsb);
xfs_trim_extent(, offset_fsb, end_fsb);
-   error = xfs_bmbt_to_iomap(ip, iomap, , true);
+   error = xfs_bmbt_to_iomap(ip, iomap, , IOMAP_F_SHARED);
/*
 * This is a COW extent, so we must probe the page cache
 * because there could be dirty page cache being backed
@@ -1218,7 +1224,7 @@ xfs_seek_iomap_begin(
imap.br_state = XFS_EXT_NORM;
 done:
xfs_trim_extent(, offset_fsb, end_fsb);
-   error = xfs_bmbt_to_iomap(ip, iomap, , false);
+   error = xfs_bmbt_to_iomap(ip, iomap, , 0);
 out_unlock:
xfs_iunlock(ip, lockmode);
return error;
@@ -1264,7 +1270,7 @@ xfs_xattr_iomap_begin(
if (error)
return error;
ASSERT(nimaps);
-   return xfs_bmbt_to_iomap(ip, iomap, , false);
+   return xfs_bmbt_to_iomap(ip, iomap, , 0);
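
The motivation given in the commit message for this patch can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the XFS code: struct toy_iomap, the TOY_F_* values and both toy_fill_* helpers are hypothetical names invented for illustration. It shows why OR-ing bits into a flags field is only safe for a zero-initialized structure, and why assigning the caller's initial flags first makes reuse safe.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for iomap flags; the values are illustrative only. */
#define TOY_F_NEW	0x01
#define TOY_F_SHARED	0x04

struct toy_iomap {
	uint16_t flags;
};

/* Old style: only ORs bits in, so stale bits survive when the struct is reused. */
static inline void toy_fill_old(struct toy_iomap *iomap, int shared)
{
	if (shared)
		iomap->flags |= TOY_F_SHARED;
}

/* New style: the caller passes the initial flags and the field is overwritten. */
static inline void toy_fill_new(struct toy_iomap *iomap, uint16_t flags)
{
	iomap->flags = flags;
}

/* Reuse one structure for a shared mapping followed by a non-shared one. */
static inline void toy_flags_selfcheck(void)
{
	struct toy_iomap map = { 0 };

	toy_fill_old(&map, 1);
	toy_fill_old(&map, 0);
	assert(map.flags == TOY_F_SHARED);	/* stale bit leaked through */

	toy_fill_new(&map, TOY_F_SHARED);
	toy_fill_new(&map, 0);
	assert(map.flags == 0);			/* always reinitialized */
}
```

Calling toy_flags_selfcheck() from a test program exercises both cases; the second assertion is the property the patch is after.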

[PATCH 12/12] iomap: add tracing for the address space operations

2019-06-23 Thread Christoph Hellwig

Lift the xfs code for tracing address space operations to the iomap
layer.

Signed-off-by: Christoph Hellwig 
---
 fs/iomap.c   | 13 +-
 fs/xfs/xfs_aops.c| 27 ++--
 fs/xfs/xfs_trace.h   | 65 
 include/trace/events/iomap.h | 82 
 4 files changed, 97 insertions(+), 90 deletions(-)
 create mode 100644 include/trace/events/iomap.h

diff --git a/fs/iomap.c b/fs/iomap.c
index 72a1b622e634..c98107a6bf81 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -23,7 +23,8 @@
 #include 
 #include 
 #include 
-
+#define CREATE_TRACE_POINTS
+#include <trace/events/iomap.h>
 #include "internal.h"
 
 static struct bio_set iomap_ioend_bioset;
@@ -369,6 +370,8 @@ iomap_readpage(struct page *page, const struct iomap_ops 
*ops)
unsigned poff;
loff_t ret;
 
+   trace_iomap_readpage(page->mapping->host, 1);
+
for (poff = 0; poff < PAGE_SIZE; poff += ret) {
ret = iomap_apply(inode, page_offset(page) + poff,
PAGE_SIZE - poff, 0, ops, ,
@@ -465,6 +468,8 @@ iomap_readpages(struct address_space *mapping, struct 
list_head *pages,
loff_t last = page_offset(list_entry(pages->next, struct page, lru));
loff_t length = last - pos + PAGE_SIZE, ret = 0;
 
+   trace_iomap_readpages(mapping->host, nr_pages);
+
while (length > 0) {
ret = iomap_apply(mapping->host, pos, length, 0, ops,
, iomap_readpages_actor);
@@ -531,6 +536,8 @@ EXPORT_SYMBOL_GPL(iomap_is_partially_uptodate);
 int
 iomap_releasepage(struct page *page, gfp_t gfp_mask)
 {
+   trace_iomap_releasepage(page->mapping->host, page, 0, 0);
+
/*
 * mm accommodates an old ext3 case where clean pages might not have had
 * the dirty bit cleared. Thus, it can send actual dirty pages to
@@ -546,6 +553,8 @@ EXPORT_SYMBOL_GPL(iomap_releasepage);
 void
 iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
 {
+   trace_iomap_invalidatepage(page->mapping->host, page, offset, len);
+
/*
 * If we are invalidating the entire page, clear the dirty state from it
 * and release it to avoid unnecessary buildup of the LRU.
@@ -2579,6 +2588,8 @@ iomap_do_writepage(struct page *page, struct 
writeback_control *wbc, void *data)
u64 end_offset;
loff_t offset;
 
+   trace_iomap_writepage(inode, page, 0, 0);
+
/*
 * Refuse to write the page out if we are called from reclaim context.
 *
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 26b838aea2db..a27ecce31c88 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -440,16 +440,6 @@ xfs_submit_ioend(
return status;
 }
 
-STATIC void
-xfs_vm_invalidatepage(
-   struct page *page,
-   unsigned intoffset,
-   unsigned intlength)
-{
-   trace_xfs_invalidatepage(page->mapping->host, page, offset, length);
-   iomap_invalidatepage(page, offset, length);
-}
-
 /*
  * If the page has delalloc blocks on it, we need to punch them out before we
  * invalidate the page.  If we don't, we leave a stale delalloc mapping on the
@@ -484,7 +474,7 @@ xfs_discard_page(
if (error && !XFS_FORCED_SHUTDOWN(mp))
xfs_alert(mp, "page discard unable to remove delalloc 
mapping.");
 out_invalidate:
-   xfs_vm_invalidatepage(page, 0, PAGE_SIZE);
+   iomap_invalidatepage(page, 0, PAGE_SIZE);
 }
 
 static const struct iomap_writeback_ops xfs_writeback_ops = {
@@ -524,15 +514,6 @@ xfs_dax_writepages(
xfs_find_bdev_for_inode(mapping->host), wbc);
 }
 
-STATIC int
-xfs_vm_releasepage(
-   struct page *page,
-   gfp_t   gfp_mask)
-{
-   trace_xfs_releasepage(page->mapping->host, page, 0, 0);
-   return iomap_releasepage(page, gfp_mask);
-}
-
 STATIC sector_t
 xfs_vm_bmap(
struct address_space*mapping,
@@ -561,7 +542,6 @@ xfs_vm_readpage(
struct file *unused,
struct page *page)
 {
-   trace_xfs_vm_readpage(page->mapping->host, 1);
return iomap_readpage(page, _iomap_ops);
 }
 
@@ -572,7 +552,6 @@ xfs_vm_readpages(
struct list_head*pages,
unsignednr_pages)
 {
-   trace_xfs_vm_readpages(mapping->host, nr_pages);
return iomap_readpages(mapping, pages, nr_pages, _iomap_ops);
 }
 
@@ -592,8 +571,8 @@ const struct address_space_operations 
xfs_address_space_operations = {
.writepage  = xfs_vm_writepage,
.writepages = xfs_vm_writepages,
.set_page_dirty = iomap_set_page_dirty,
-   .releasepage= xfs_vm_releasepage,
-   .invalidatepage = xfs_vm_invalidatepage,
+   .releasepage= iomap_releasepage,
+   .invalidatepage = iomap_invalidatepage,
.bmap

[PATCH 08/12] xfs: simplify xfs_ioend_can_merge

2019-06-23 Thread Christoph Hellwig

Compare the block layer status directly instead of converting it to
an errno first.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/xfs_aops.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 017b87b7765f..acbd73976067 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -226,13 +226,9 @@ xfs_end_ioend(
 static bool
 xfs_ioend_can_merge(
struct xfs_ioend*ioend,
-   int ioend_error,
struct xfs_ioend*next)
 {
-   int next_error;
-
-   next_error = blk_status_to_errno(next->io_bio->bi_status);
-   if (ioend_error != next_error)
+   if (ioend->io_bio->bi_status != next->io_bio->bi_status)
return false;
if ((ioend->io_fork == XFS_COW_FORK) ^ (next->io_fork == XFS_COW_FORK))
return false;
@@ -251,17 +247,11 @@ xfs_ioend_try_merge(
struct list_head*more_ioends)
 {
struct xfs_ioend*next_ioend;
-   int ioend_error;
-
-   if (list_empty(more_ioends))
-   return;
-
-   ioend_error = blk_status_to_errno(ioend->io_bio->bi_status);
 
while (!list_empty(more_ioends)) {
next_ioend = list_first_entry(more_ioends, struct xfs_ioend,
io_list);
-   if (!xfs_ioend_can_merge(ioend, ioend_error, next_ioend))
+   if (!xfs_ioend_can_merge(ioend, next_ioend))
break;
list_move_tail(&next_ioend->io_list, &ioend->io_list);
ioend->io_size += next_ioend->io_size;
-- 
2.20.1

[PATCH 09/12] xfs: refactor the ioend merging code

2019-06-23 Thread Christoph Hellwig

Introduce two nicely abstracted helpers, which can be moved to the
iomap code later.  Also use list_pop and list_first_entry_or_null
to simplify the code a bit.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/xfs_aops.c | 66 ++-
 1 file changed, 36 insertions(+), 30 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index acbd73976067..5d302ebe2a33 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -121,6 +121,19 @@ xfs_destroy_ioend(
}
 }
 
+static void
+xfs_destroy_ioends(
+   struct xfs_ioend*ioend,
+   int error)
+{
+   struct list_headtmp;
+
+   list_replace_init(&ioend->io_list, &tmp);
+   xfs_destroy_ioend(ioend, error);
+   while ((ioend = list_pop(&tmp, struct xfs_ioend, io_list)))
+   xfs_destroy_ioend(ioend, error);
+}
+
 /*
  * Fast and loose check if this write could update the on-disk inode size.
  */
@@ -173,7 +186,6 @@ xfs_end_ioend(
struct xfs_ioend*ioend)
 {
unsigned intnofs_flag = memalloc_nofs_save();
-   struct list_headioend_list;
struct xfs_inode*ip = XFS_I(ioend->io_inode);
xfs_off_t   offset = ioend->io_offset;
size_t  size = ioend->io_size;
@@ -207,16 +219,7 @@ xfs_end_ioend(
if (!error && xfs_ioend_is_append(ioend))
error = xfs_setfilesize(ip, offset, size);
 done:
-   list_replace_init(&ioend->io_list, &ioend_list);
-   xfs_destroy_ioend(ioend, error);
-
-   while (!list_empty(&ioend_list)) {
-   ioend = list_first_entry(&ioend_list, struct xfs_ioend,
-   io_list);
-   list_del_init(&ioend->io_list);
-   xfs_destroy_ioend(ioend, error);
-   }
-
+   xfs_destroy_ioends(ioend, error);
memalloc_nofs_restore(nofs_flag);
 }
 
@@ -246,15 +249,16 @@ xfs_ioend_try_merge(
struct xfs_ioend*ioend,
struct list_head*more_ioends)
 {
-   struct xfs_ioend*next_ioend;
+   struct xfs_ioend*next;
 
-   while (!list_empty(more_ioends)) {
-   next_ioend = list_first_entry(more_ioends, struct xfs_ioend,
-   io_list);
-   if (!xfs_ioend_can_merge(ioend, next_ioend))
+   INIT_LIST_HEAD(&ioend->io_list);
+
+   while ((next = list_first_entry_or_null(more_ioends, struct xfs_ioend,
+   io_list))) {
+   if (!xfs_ioend_can_merge(ioend, next))
break;
-   list_move_tail(&next_ioend->io_list, &ioend->io_list);
-   ioend->io_size += next_ioend->io_size;
+   list_move_tail(&next->io_list, &ioend->io_list);
+   ioend->io_size += next->io_size;
}
 }
 
@@ -277,29 +281,31 @@ xfs_ioend_compare(
return 0;
 }
 
+static void
+xfs_sort_ioends(
+   struct list_head*ioend_list)
+{
+   list_sort(NULL, ioend_list, xfs_ioend_compare);
+}
+
 /* Finish all pending io completions. */
 void
 xfs_end_io(
struct work_struct  *work)
 {
-   struct xfs_inode*ip;
+   struct xfs_inode*ip =
+   container_of(work, struct xfs_inode, i_ioend_work);
struct xfs_ioend*ioend;
-   struct list_headcompletion_list;
+   struct list_headtmp;
unsigned long   flags;
 
-   ip = container_of(work, struct xfs_inode, i_ioend_work);
-
spin_lock_irqsave(&ip->i_ioend_lock, flags);
-   list_replace_init(&ip->i_ioend_list, &completion_list);
+   list_replace_init(&ip->i_ioend_list, &tmp);
spin_unlock_irqrestore(&ip->i_ioend_lock, flags);
 
-   list_sort(NULL, &completion_list, xfs_ioend_compare);
-
-   while (!list_empty(&completion_list)) {
-   ioend = list_first_entry(&completion_list, struct xfs_ioend,
-   io_list);
-   list_del_init(&ioend->io_list);
-   xfs_ioend_try_merge(ioend, &completion_list);
+   xfs_sort_ioends(&tmp);
+   while ((ioend = list_pop(&tmp, struct xfs_ioend, io_list))) {
+   xfs_ioend_try_merge(ioend, &tmp);
xfs_end_ioend(ioend);
}
 }
-- 
2.20.1
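
The sort-then-merge flow in xfs_end_io can be sketched in userspace. This is a simplified model under stated assumptions, not the kernel code: struct toy_ioend, toy_compare and toy_sort_and_merge are hypothetical names, an array with qsort() stands in for the linked list and list_sort(), and only the offset contiguity test from xfs_ioend_can_merge is modelled (the status, fork and type checks are left out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an ioend: just a byte range. */
struct toy_ioend {
	long long offset;
	long long size;
};

/* Order by file offset, as a list_sort() comparison callback would. */
static int toy_compare(const void *a, const void *b)
{
	const struct toy_ioend *ia = a, *ib = b;

	if (ia->offset < ib->offset)
		return -1;
	if (ia->offset > ib->offset)
		return 1;
	return 0;
}

/*
 * Sort the ranges, then fold every range that starts exactly where the
 * previous one ends into its predecessor.  Returns the number of ranges
 * left after merging.
 */
static inline size_t toy_sort_and_merge(struct toy_ioend *v, size_t n)
{
	size_t out = 0, i;

	if (n == 0)
		return 0;

	qsort(v, n, sizeof(*v), toy_compare);
	for (i = 1; i < n; i++) {
		if (v[out].offset + v[out].size == v[i].offset)
			v[out].size += v[i].size;	/* contiguous: merge */
		else
			v[++out] = v[i];		/* gap: start a new range */
	}
	return out + 1;
}
```

Sorting first is what makes a single forward pass sufficient: once the ranges are ordered by offset, a mergeable neighbour can only be the next element.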

[PATCH 10/12] xfs: remove the fork fields in the writepage_ctx and ioend

2019-06-23 Thread Christoph Hellwig

In preparation for moving the writeback code to iomap.c, replace the
XFS-specific COW fork concept with the iomap IOMAP_F_SHARED flag.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/xfs_aops.c | 40 +---
 fs/xfs/xfs_aops.h |  2 +-
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 5d302ebe2a33..d9a7a9e6b912 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -28,7 +28,6 @@
  */
 struct xfs_writepage_ctx {
struct iomapiomap;
-   int fork;
unsigned intdata_seq;
unsigned intcow_seq;
struct xfs_ioend*ioend;
@@ -204,7 +203,7 @@ xfs_end_ioend(
 */
error = blk_status_to_errno(ioend->io_bio->bi_status);
if (unlikely(error)) {
-   if (ioend->io_fork == XFS_COW_FORK)
+   if (ioend->io_flags & IOMAP_F_SHARED)
xfs_reflink_cancel_cow_range(ip, offset, size, true);
goto done;
}
@@ -212,7 +211,7 @@ xfs_end_ioend(
/*
 * Success: commit the COW or unwritten blocks if needed.
 */
-   if (ioend->io_fork == XFS_COW_FORK)
+   if (ioend->io_flags & IOMAP_F_SHARED)
error = xfs_reflink_end_cow(ip, offset, size);
else if (ioend->io_type == IOMAP_UNWRITTEN)
error = xfs_iomap_write_unwritten(ip, offset, size, false);
@@ -233,7 +232,8 @@ xfs_ioend_can_merge(
 {
if (ioend->io_bio->bi_status != next->io_bio->bi_status)
return false;
-   if ((ioend->io_fork == XFS_COW_FORK) ^ (next->io_fork == XFS_COW_FORK))
+   if ((ioend->io_flags & IOMAP_F_SHARED) ^
+   (next->io_flags & IOMAP_F_SHARED))
return false;
if ((ioend->io_type == IOMAP_UNWRITTEN) ^
(next->io_type == IOMAP_UNWRITTEN))
@@ -319,7 +319,7 @@ xfs_end_bio(
struct xfs_mount*mp = ip->i_mount;
unsigned long   flags;
 
-   if (ioend->io_fork == XFS_COW_FORK ||
+   if ((ioend->io_flags & IOMAP_F_SHARED) ||
ioend->io_type == IOMAP_UNWRITTEN ||
xfs_ioend_is_append(ioend)) {
spin_lock_irqsave(&ip->i_ioend_lock, flags);
@@ -350,7 +350,7 @@ xfs_imap_valid(
 * covers the offset. Be careful to check this first because the caller
 * can revalidate a COW mapping without updating the data seqno.
 */
-   if (wpc->fork == XFS_COW_FORK)
+   if (wpc->iomap.flags & IOMAP_F_SHARED)
return true;
 
/*
@@ -380,6 +380,7 @@ static int
 xfs_convert_blocks(
struct xfs_writepage_ctx *wpc,
struct xfs_inode*ip,
+   int whichfork,
loff_t  offset)
 {
int error;
@@ -391,8 +392,8 @@ xfs_convert_blocks(
 * delalloc extent if free space is sufficiently fragmented.
 */
do {
-   error = xfs_bmapi_convert_delalloc(ip, wpc->fork, offset,
-   >iomap, wpc->fork == XFS_COW_FORK ?
+   error = xfs_bmapi_convert_delalloc(ip, whichfork, offset,
+   >iomap, whichfork == XFS_COW_FORK ?
>cow_seq : >data_seq);
if (error)
return error;
@@ -413,6 +414,7 @@ xfs_map_blocks(
xfs_fileoff_t   offset_fsb = XFS_B_TO_FSBT(mp, offset);
xfs_fileoff_t   end_fsb = XFS_B_TO_FSB(mp, offset + count);
xfs_fileoff_t   cow_fsb = NULLFILEOFF;
+   int whichfork = XFS_DATA_FORK;
struct xfs_bmbt_irecimap;
struct xfs_iext_cursor  icur;
int retries = 0;
@@ -461,7 +463,7 @@ xfs_map_blocks(
wpc->cow_seq = READ_ONCE(ip->i_cowfp->if_seq);
xfs_iunlock(ip, XFS_ILOCK_SHARED);
 
-   wpc->fork = XFS_COW_FORK;
+   whichfork = XFS_COW_FORK;
goto allocate_blocks;
}
 
@@ -484,8 +486,6 @@ xfs_map_blocks(
wpc->data_seq = READ_ONCE(ip->i_df.if_seq);
xfs_iunlock(ip, XFS_ILOCK_SHARED);
 
-   wpc->fork = XFS_DATA_FORK;
-
/* landed in a hole or beyond EOF? */
if (imap.br_startoff > offset_fsb) {
imap.br_blockcount = imap.br_startoff - offset_fsb;
@@ -510,10 +510,10 @@ xfs_map_blocks(
goto allocate_blocks;
 
xfs_bmbt_to_iomap(ip, >iomap, , 0);
-   trace_xfs_map_blocks_found(ip, offset, count, wpc->fork, );
+   trace_xfs_map_blocks_found(ip, offset, count, whichfork, );
return 0;
 allocate_blocks:
-   error = xfs_convert_blocks(wpc, ip, offset);
+   error = xfs_convert_blocks(wpc, ip, whichfork, offset);
if (error) {
/*
 * If we failed to find the extent in the COW fork we might have
@@ -522,7 +522,8

[PATCH 02/12] xfs: simplify xfs_chain_bio

2019-06-23 Thread Christoph Hellwig

Move setting up operation and write hint to xfs_alloc_ioend, and
then just copy over all needed information from the previous bio
in xfs_chain_bio and stop passing various parameters to it.

Signed-off-by: Christoph Hellwig 
---
 fs/xfs/xfs_aops.c | 35 +--
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index a6f0f4761a37..9cceb90e77c5 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -665,7 +665,6 @@ xfs_submit_ioend(
 
ioend->io_bio->bi_private = ioend;
ioend->io_bio->bi_end_io = xfs_end_bio;
-   ioend->io_bio->bi_opf = REQ_OP_WRITE | wbc_to_write_flags(wbc);
 
/*
 * If we are failing the IO now, just mark the ioend with an
@@ -679,7 +678,6 @@ xfs_submit_ioend(
return status;
}
 
-   ioend->io_bio->bi_write_hint = ioend->io_inode->i_write_hint;
submit_bio(ioend->io_bio);
return 0;
 }
@@ -691,7 +689,8 @@ xfs_alloc_ioend(
xfs_exntst_tstate,
xfs_off_t   offset,
struct block_device *bdev,
-   sector_tsector)
+   sector_tsector,
+   struct writeback_control *wbc)
 {
struct xfs_ioend*ioend;
struct bio  *bio;
@@ -699,6 +698,8 @@ xfs_alloc_ioend(
bio = bio_alloc_bioset(GFP_NOFS, BIO_MAX_PAGES, &xfs_ioend_bioset);
bio_set_dev(bio, bdev);
bio->bi_iter.bi_sector = sector;
+   bio->bi_opf = REQ_OP_WRITE | wbc_to_write_flags(wbc);
+   bio->bi_write_hint = inode->i_write_hint;
 
ioend = container_of(bio, struct xfs_ioend, io_inline_bio);
INIT_LIST_HEAD(&ioend->io_list);
@@ -719,24 +720,22 @@ xfs_alloc_ioend(
  * so that the bi_private linkage is set up in the right direction for the
  * traversal in xfs_destroy_ioend().
  */
-static void
+static struct bio *
 xfs_chain_bio(
-   struct xfs_ioend*ioend,
-   struct writeback_control *wbc,
-   struct block_device *bdev,
-   sector_tsector)
+   struct bio  *prev)
 {
struct bio *new;
 
new = bio_alloc(GFP_NOFS, BIO_MAX_PAGES);
-   bio_set_dev(new, bdev);
-   new->bi_iter.bi_sector = sector;
-   bio_chain(ioend->io_bio, new);
-   bio_get(ioend->io_bio); /* for xfs_destroy_ioend */
-   ioend->io_bio->bi_opf = REQ_OP_WRITE | wbc_to_write_flags(wbc);
-   ioend->io_bio->bi_write_hint = ioend->io_inode->i_write_hint;
-   submit_bio(ioend->io_bio);
-   ioend->io_bio = new;
+   bio_copy_dev(new, prev);
+   new->bi_iter.bi_sector = bio_end_sector(prev);
+   new->bi_opf = prev->bi_opf;
+   new->bi_write_hint = prev->bi_write_hint;
+
+   bio_chain(prev, new);
+   bio_get(prev);  /* for xfs_destroy_ioend */
+   submit_bio(prev);
+   return new;
 }
 
 /*
@@ -771,14 +770,14 @@ xfs_add_to_ioend(
if (wpc->ioend)
list_add(&wpc->ioend->io_list, iolist);
wpc->ioend = xfs_alloc_ioend(inode, wpc->fork,
-   wpc->imap.br_state, offset, bdev, sector);
+   wpc->imap.br_state, offset, bdev, sector, wbc);
}
 
if (!__bio_try_merge_page(wpc->ioend->io_bio, page, len, poff, true)) {
if (iop)
atomic_inc(&iop->write_count);
if (bio_full(wpc->ioend->io_bio))
-   xfs_chain_bio(wpc->ioend, wbc, bdev, sector);
+   wpc->ioend->io_bio = xfs_chain_bio(wpc->ioend->io_bio);
bio_add_page(wpc->ioend->io_bio, page, len, poff);
}
 
-- 
2.20.1
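
The reworked xfs_chain_bio derives everything for the new bio from the previous one. A rough userspace model of that idea follows; struct toy_bio and the toy_* helpers are hypothetical and carry only the fields the patch copies, and the end-sector arithmetic assumes the usual 512-byte sector unit.

```c
#include <assert.h>

/* Hypothetical, cut-down bio: start sector, byte count, op flags, write hint. */
struct toy_bio {
	unsigned long long	sector;		/* start, in 512-byte sectors */
	unsigned int		size;		/* bytes queued in this bio */
	unsigned int		opf;
	unsigned short		write_hint;
};

/* First sector past the end of the bio: start + bytes / 512. */
static inline unsigned long long toy_end_sector(const struct toy_bio *bio)
{
	return bio->sector + (bio->size >> 9);
}

/* Build the follow-on bio purely from what the previous bio already knows. */
static inline struct toy_bio toy_chain(const struct toy_bio *prev)
{
	struct toy_bio next = {
		.sector		= toy_end_sector(prev),
		.size		= 0,
		.opf		= prev->opf,
		.write_hint	= prev->write_hint,
	};

	return next;
}

static inline void toy_chain_selfcheck(void)
{
	struct toy_bio prev = { .sector = 100, .size = 8192, .opf = 1, .write_hint = 2 };
	struct toy_bio next = toy_chain(&prev);

	assert(next.sector == 116);	/* 8192 bytes == 16 sectors */
	assert(next.opf == prev.opf && next.write_hint == prev.write_hint);
}
```

Because the new bio starts where the previous one ends, the caller no longer has to pass the device, sector or writeback control down to the chaining helper.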

[PATCH 01/12] list.h: add a list_pop helper

2019-06-23 Thread Christoph Hellwig

We have a very common pattern where we want to delete the first entry
from a list and return it as the properly typed container structure.

Add a list_pop helper to implement this behavior.

Signed-off-by: Christoph Hellwig 
---
 include/linux/list.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/linux/list.h b/include/linux/list.h
index e951228db4b2..e07a5f54cc9d 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -500,6 +500,28 @@ static inline void list_splice_tail_init(struct list_head 
*list,
pos__ != head__ ? list_entry(pos__, type, member) : NULL; \
 })
 
+/**
+ * list_pop - delete the first entry from a list and return it
+ * @list:  the list to take the element from.
+ * @type:  the type of the struct this is embedded in.
+ * @member:the name of the list_head within the struct.
+ *
+ * Note that if the list is empty, it returns NULL.
+ */
+#define list_pop(list, type, member)   \
+({ \
+   struct list_head *head__ = (list);  \
+   struct list_head *pos__ = READ_ONCE(head__->next);  \
+   type *entry__ = NULL;   \
+   \
+   if (pos__ != head__) {  \
+   entry__ = list_entry(pos__, type, member);  \
+   list_del(pos__);\
+   }   \
+   \
+   entry__;\
+})
+
 /**
  * list_next_entry - get the next element in list
  * @pos:   the type * to cursor
-- 
2.20.1
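
To try the list_pop pattern outside the kernel, a small userspace sketch is enough. The list primitives below are simplified re-implementations written for this example, READ_ONCE() is dropped, and struct item and drain_in_order are hypothetical names; the macro relies on GNU C statement expressions, so it assumes gcc or clang.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace copy of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)

static inline void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static inline void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static inline void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Same shape as the proposed helper, without READ_ONCE(). */
#define list_pop(list, type, member)				\
({								\
	struct list_head *head__ = (list);			\
	struct list_head *pos__ = head__->next;			\
	type *entry__ = NULL;					\
								\
	if (pos__ != head__) {					\
		entry__ = list_entry(pos__, type, member);	\
		list_del(pos__);				\
	}							\
								\
	entry__;						\
})

struct item {
	int value;
	struct list_head node;
};

/* Pop until the list is empty, recording the order as decimal digits. */
static inline int drain_in_order(struct list_head *head)
{
	struct item *it;
	int seen = 0;

	while ((it = list_pop(head, struct item, node)))
		seen = seen * 10 + it->value;
	return seen;
}
```

The loop in drain_in_order is the idiom the helper is meant to enable: the NULL return on an empty list doubles as the loop termination condition.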

lift the xfs writepage code into iomap

2019-06-23 Thread Christoph Hellwig

Hi all,

this series cleans up the xfs writepage code and then lifts it to
fs/iomap.c so that it can be used by other file systems.  I've been
wanting to do this for a while so that I could eventually convert gfs2
over to it, but I never got to it.  Now Damien has a new zonefs
file system for semi-raw access to zoned block devices that would
like to use the iomap code instead of reinventing it, so I finally
had to do the work.

linux-next: manual merge of the devicetree tree with Linus' tree

2019-06-23 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the devicetree tree got conflicts in:

  scripts/dtc/Makefile.dtc
  scripts/dtc/libfdt/Makefile.libfdt

between commit:

  ec8f24b7faaf ("treewide: Add SPDX license identifier - Makefile/Kconfig")

from Linus' tree and commit:

  12869ecd5eef ("scripts/dtc: Update to upstream version 
v1.5.0-30-g702c1b6c0e73")

from the devicetree tree.

I fixed it up (I used the latter's SPDX tags as these come from an
upstream project) and can carry the fix as necessary. This is now fixed
as far as linux-next is concerned, but any non trivial conflicts should
be mentioned to your upstream maintainer when your tree is submitted for
merging.  You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell



RE: [PATCH v2 0/3] scsi: ufs: typo fixes and improvement

2019-06-23 Thread Avri Altman

Hi,

> 
> 
> From: Bean Huo 
> 
> This patch series fixes several typos and fixes an issue where a ufs-bsg
> job is completed twice when a UPIU/DME command fails.
> 
> Changed since v1:
> - split v1 patch
> - add fixes tag
> - delete needless blank line
> 
> Bean Huo (3):
>   scsi: ufs: fix typos in comment of ufshcd_uic_change_pwr_mode
>   scsi: ufs-bsg: fix typo in ufs_bsg_request
>   scsi: ufs-bsg: complete ufs-bsg job only if no error

This series looks good to me.
Thanks,
Avri

[PATCH 1/5] arm64: don't use asm-generic/ptrace.h

2019-06-23 Thread Christoph Hellwig

Doing the indirection through macros for the regs accessors just
makes them harder to read, so implement the helpers directly.

Note that only the helpers actually used are implemented now.

Signed-off-by: Christoph Hellwig 
Acked-by: Catalin Marinas 
---
 arch/arm64/include/asm/ptrace.h | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index dad858b6adc6..5a1e5025db96 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -217,11 +217,12 @@ static inline void forget_syscall(struct pt_regs *regs)
 #define fast_interrupts_enabled(regs) \
(!((regs)->pstate & PSR_F_BIT))
 
-#define GET_USP(regs) \
-   (!compat_user_mode(regs) ? (regs)->sp : (regs)->compat_sp)
-
-#define SET_USP(ptregs, value) \
-   (!compat_user_mode(regs) ? ((regs)->sp = value) : ((regs)->compat_sp = 
value))
+static inline unsigned long user_stack_pointer(struct pt_regs *regs)
+{
+   if (compat_user_mode(regs))
+   return regs->compat_sp;
+   return regs->sp;
+}
 
 extern int regs_query_register_offset(const char *name);
 extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
@@ -320,13 +321,20 @@ static inline unsigned long 
regs_get_kernel_argument(struct pt_regs *regs,
 struct task_struct;
 int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task);
 
-#define GET_IP(regs)   ((unsigned long)(regs)->pc)
-#define SET_IP(regs, value)((regs)->pc = ((u64) (value)))
-
-#define GET_FP(ptregs) ((unsigned long)(ptregs)->regs[29])
-#define SET_FP(ptregs, value)  ((ptregs)->regs[29] = ((u64) (value)))
+static inline unsigned long instruction_pointer(struct pt_regs *regs)
+{
+   return regs->pc;
+}
+static inline void instruction_pointer_set(struct pt_regs *regs,
+   unsigned long val)
+{
+   regs->pc = val;
+}
 
-#include <asm-generic/ptrace.h>
+static inline unsigned long frame_pointer(struct pt_regs *regs)
+{
+   return regs->regs[29];
+}
 
 #define procedure_link_pointer(regs)   ((regs)->regs[30])
 
@@ -336,7 +344,6 @@ static inline void procedure_link_pointer_set(struct 
pt_regs *regs,
procedure_link_pointer(regs) = val;
 }
 
-#undef profile_pc
 extern unsigned long profile_pc(struct pt_regs *regs);
 
 #endif /* __ASSEMBLY__ */
-- 
2.20.1
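
The benefit of open-coding the accessors can be seen in a small userspace sketch. Everything here is hypothetical: struct toy_pt_regs is a cut-down register frame, the compat field stands in for the compat_user_mode() test, and the toy_* names are invented for illustration. Note that the removed SET_USP macro in the hunk above names its first parameter ptregs while its body refers to regs; an inline function cannot contain that kind of mismatch without a compile error.

```c
#include <assert.h>

/* Hypothetical, cut-down register frame; the real struct pt_regs differs. */
struct toy_pt_regs {
	unsigned long pc;
	unsigned long sp;
	unsigned long compat_sp;
	int compat;		/* stand-in for the compat_user_mode() test */
};

/* Typed accessors: arguments are checked and evaluated exactly once. */
static inline unsigned long toy_user_stack_pointer(const struct toy_pt_regs *regs)
{
	if (regs->compat)
		return regs->compat_sp;
	return regs->sp;
}

static inline unsigned long toy_instruction_pointer(const struct toy_pt_regs *regs)
{
	return regs->pc;
}

static inline void toy_instruction_pointer_set(struct toy_pt_regs *regs,
					       unsigned long val)
{
	regs->pc = val;
}

static inline void toy_regs_selfcheck(void)
{
	struct toy_pt_regs r = {
		.pc = 0x400000, .sp = 0x1000, .compat_sp = 0x2000, .compat = 0,
	};

	assert(toy_user_stack_pointer(&r) == 0x1000);
	r.compat = 1;
	assert(toy_user_stack_pointer(&r) == 0x2000);
	toy_instruction_pointer_set(&r, 0x400004);
	assert(toy_instruction_pointer(&r) == 0x400004);
}
```

Each helper is a one-line read or write of a struct field, which is the point of the series: the macro layers added indirection without adding behaviour.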

[PATCH 3/5] sh: don't use asm-generic/ptrace.h

2019-06-23 Thread Christoph Hellwig

Doing the indirection through macros for the regs accessors just
makes them harder to read, so implement the helpers directly.

Note that only the helpers actually used are implemented now.

Signed-off-by: Christoph Hellwig 
---
 arch/sh/include/asm/ptrace.h | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/arch/sh/include/asm/ptrace.h b/arch/sh/include/asm/ptrace.h
index 9143c7babcbe..6c89e3e04cee 100644
--- a/arch/sh/include/asm/ptrace.h
+++ b/arch/sh/include/asm/ptrace.h
@@ -16,8 +16,31 @@
 #define user_mode(regs)(((regs)->sr & 0x4000)==0)
 #define kernel_stack_pointer(_regs)((unsigned long)(_regs)->regs[15])
 
-#define GET_FP(regs)   ((regs)->regs[14])
-#define GET_USP(regs)  ((regs)->regs[15])
+static inline unsigned long instruction_pointer(struct pt_regs *regs)
+{
+   return regs->pc;
+}
+static inline void instruction_pointer_set(struct pt_regs *regs,
+   unsigned long val)
+{
+   regs->pc = val;
+}
+
+static inline unsigned long frame_pointer(struct pt_regs *regs)
+{
+   return regs->regs[14];
+}
+
+static inline unsigned long user_stack_pointer(struct pt_regs *regs)
+{
+   return regs->regs[15];
+}
+
+static inline void user_stack_pointer_set(struct pt_regs *regs,
+   unsigned long val)
+{
+   regs->regs[15] = val;
+}
 
 #define arch_has_single_step() (1)
 
@@ -112,7 +135,5 @@ static inline unsigned long profile_pc(struct pt_regs *regs)
 
return pc;
 }
-#define profile_pc profile_pc
 
-#include <asm-generic/ptrace.h>
 #endif /* __ASM_SH_PTRACE_H */
-- 
2.20.1

remove asm-generic/ptrace.h v3

2019-06-23 Thread Christoph Hellwig

Hi all,

asm-generic/ptrace.h is a little weird in that it doesn't actually
implement any functionality, but only provides multiple layers of macros
that just implement trivial inline functions.  Implement those
directly in the few architectures that use them and end up with a much
simpler design.

I'm not sure which tree is the right place, but maybe this can go through
the asm-generic tree since it removes an asm-generic header?


Changes since v2:
 - rebase to latest Linus' tree that added an SPDX tag to
   asm-generic/ptrace.h
 - collected two more Acks from Oleg
Changes since v1:
 - add a missing empty line between functions

[PATCH 4/5] x86: don't use asm-generic/ptrace.h

2019-06-23 Thread Christoph Hellwig

Doing the indirection through macros for the regs accessors just
makes them harder to read, so implement the helpers directly.

Note that only the helpers actually used are implemented now.

Signed-off-by: Christoph Hellwig 
Acked-by: Ingo Molnar 
Acked-by: Oleg Nesterov 
---
 arch/x86/include/asm/ptrace.h | 30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 8a7fc0cca2d1..e22816e865ca 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -98,7 +98,6 @@ struct cpuinfo_x86;
 struct task_struct;
 
 extern unsigned long profile_pc(struct pt_regs *regs);
-#define profile_pc profile_pc
 
 extern unsigned long
 convert_ip_to_linear(struct task_struct *child, struct pt_regs *regs);
@@ -175,11 +174,32 @@ static inline unsigned long kernel_stack_pointer(struct 
pt_regs *regs)
 }
 #endif
 
-#define GET_IP(regs) ((regs)->ip)
-#define GET_FP(regs) ((regs)->bp)
-#define GET_USP(regs) ((regs)->sp)
+static inline unsigned long instruction_pointer(struct pt_regs *regs)
+{
+   return regs->ip;
+}
+
+static inline void instruction_pointer_set(struct pt_regs *regs,
+   unsigned long val)
+{
+   regs->ip = val;
+}
+
+static inline unsigned long frame_pointer(struct pt_regs *regs)
+{
+   return regs->bp;
+}
 
-#include <asm-generic/ptrace.h>
+static inline unsigned long user_stack_pointer(struct pt_regs *regs)
+{
+   return regs->sp;
+}
+
+static inline void user_stack_pointer_set(struct pt_regs *regs,
+   unsigned long val)
+{
+   regs->sp = val;
+}
 
 /* Query offset/name of register from its name/offset */
 extern int regs_query_register_offset(const char *name);
-- 
2.20.1

[PATCH 5/5] asm-generic: remove ptrace.h

2019-06-23 Thread Christoph Hellwig

No one is using this header anymore.

Signed-off-by: Christoph Hellwig 
Acked-by: Arnd Bergmann 
Acked-by: Oleg Nesterov 
---
 MAINTAINERS|  1 -
 arch/mips/include/asm/ptrace.h |  5 ---
 include/asm-generic/ptrace.h   | 73 --
 3 files changed, 79 deletions(-)
 delete mode 100644 include/asm-generic/ptrace.h

diff --git a/MAINTAINERS b/MAINTAINERS
index d0ed735994a5..43e5b6e215a9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12778,7 +12778,6 @@ F:  include/linux/regset.h
 F: include/linux/tracehook.h
 F: include/uapi/linux/ptrace.h
 F: include/uapi/linux/ptrace.h
-F: include/asm-generic/ptrace.h
 F: kernel/ptrace.c
 F: arch/*/ptrace*.c
 F: arch/*/*/ptrace*.c
diff --git a/arch/mips/include/asm/ptrace.h b/arch/mips/include/asm/ptrace.h
index b6578611dddb..1e76774b36dd 100644
--- a/arch/mips/include/asm/ptrace.h
+++ b/arch/mips/include/asm/ptrace.h
@@ -56,11 +56,6 @@ static inline unsigned long kernel_stack_pointer(struct 
pt_regs *regs)
return regs->regs[31];
 }
 
-/*
- * Don't use asm-generic/ptrace.h it defines FP accessors that don't make
- * sense on MIPS.  We rather want an error if they get invoked.
- */
-
 static inline void instruction_pointer_set(struct pt_regs *regs,
unsigned long val)
 {
diff --git a/include/asm-generic/ptrace.h b/include/asm-generic/ptrace.h
deleted file mode 100644
index ab16b6cb1028..
--- a/include/asm-generic/ptrace.h
+++ /dev/null
@@ -1,73 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-or-later */
-/*
- * Common low level (register) ptrace helpers
- *
- * Copyright 2004-2011 Analog Devices Inc.
- */
-
-#ifndef __ASM_GENERIC_PTRACE_H__
-#define __ASM_GENERIC_PTRACE_H__
-
-#ifndef __ASSEMBLY__
-
-/* Helpers for working with the instruction pointer */
-#ifndef GET_IP
-#define GET_IP(regs) ((regs)->pc)
-#endif
-#ifndef SET_IP
-#define SET_IP(regs, val) (GET_IP(regs) = (val))
-#endif
-
-static inline unsigned long instruction_pointer(struct pt_regs *regs)
-{
-   return GET_IP(regs);
-}
-static inline void instruction_pointer_set(struct pt_regs *regs,
-   unsigned long val)
-{
-   SET_IP(regs, val);
-}
-
-#ifndef profile_pc
-#define profile_pc(regs) instruction_pointer(regs)
-#endif
-
-/* Helpers for working with the user stack pointer */
-#ifndef GET_USP
-#define GET_USP(regs) ((regs)->usp)
-#endif
-#ifndef SET_USP
-#define SET_USP(regs, val) (GET_USP(regs) = (val))
-#endif
-
-static inline unsigned long user_stack_pointer(struct pt_regs *regs)
-{
-   return GET_USP(regs);
-}
-static inline void user_stack_pointer_set(struct pt_regs *regs,
-  unsigned long val)
-{
-   SET_USP(regs, val);
-}
-
-/* Helpers for working with the frame pointer */
-#ifndef GET_FP
-#define GET_FP(regs) ((regs)->fp)
-#endif
-#ifndef SET_FP
-#define SET_FP(regs, val) (GET_FP(regs) = (val))
-#endif
-
-static inline unsigned long frame_pointer(struct pt_regs *regs)
-{
-   return GET_FP(regs);
-}
-static inline void frame_pointer_set(struct pt_regs *regs,
- unsigned long val)
-{
-   SET_FP(regs, val);
-}
-
-#endif /* __ASSEMBLY__ */
-
-#endif
-- 
2.20.1

[PATCH 2/5] powerpc: don't use asm-generic/ptrace.h

2019-06-23 Thread Christoph Hellwig

Doing the indirection through macros for the regs accessors just
makes them harder to read, so implement the helpers directly.

Note that only the helpers actually used are implemented now.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/include/asm/ptrace.h | 29 ++---
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index faa5a338ac5a..feee1b21bbd5 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -111,18 +111,33 @@ struct pt_regs
 
 #ifndef __ASSEMBLY__
 
-#define GET_IP(regs)   ((regs)->nip)
-#define GET_USP(regs)  ((regs)->gpr[1])
-#define GET_FP(regs)   (0)
-#define SET_FP(regs, val)
+static inline unsigned long instruction_pointer(struct pt_regs *regs)
+{
+   return regs->nip;
+}
+
+static inline void instruction_pointer_set(struct pt_regs *regs,
+   unsigned long val)
+{
+   regs->nip = val;
+}
+
+static inline unsigned long user_stack_pointer(struct pt_regs *regs)
+{
+   return regs->gpr[1];
+}
+
+static inline unsigned long frame_pointer(struct pt_regs *regs)
+{
+   return 0;
+}
 
 #ifdef CONFIG_SMP
 extern unsigned long profile_pc(struct pt_regs *regs);
-#define profile_pc profile_pc
+#else
+#define profile_pc(regs) instruction_pointer(regs)
 #endif
 
-#include 
-
 #define kernel_stack_pointer(regs) ((regs)->gpr[1])
 static inline int is_syscall_success(struct pt_regs *regs)
 {
-- 
2.20.1
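
To illustrate the point made in the commit message above, here is a minimal userspace sketch of the direct-helper style. The cut-down struct pt_regs is an assumption for the demo (only the two fields the helpers touch), not the real powerpc layout.

```c
/* Stand-in for the powerpc register frame; only the fields used below. */
struct pt_regs {
	unsigned long gpr[32];
	unsigned long nip;
};

/* Each accessor names the register it touches, with no GET_IP/SET_IP layer. */
static inline unsigned long instruction_pointer(struct pt_regs *regs)
{
	return regs->nip;
}

static inline void instruction_pointer_set(struct pt_regs *regs,
					   unsigned long val)
{
	regs->nip = val;
}

static inline unsigned long user_stack_pointer(struct pt_regs *regs)
{
	return regs->gpr[1];
}
```

A reader can see which register each helper touches without chasing a macro definition in another header.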

Re: [PATCH] mm: fix setting the high and low watermarks

2019-06-23 Thread Vlastimil Babka

On 6/21/19 4:07 PM, Bharath Vedartham wrote:
> Do you think this could cause a race condition between
> __setup_per_zone_wmarks and pgdat_watermark_boosted which checks whether
> the watermark_boost of each zone is non-zero? pgdat_watermark_boosted is
> not called with a zone lock.
> Here is a probable case scenario:
> watermarks are boosted in steal_suitable_fallback(which happens under a
> zone lock). After that kswapd is woken up by
> wakeup_kswapd(zone,0,0,zone_idx(zone)) in rmqueue without holding a
> zone lock. Let's say someone modified min_free_kbytes, this would lead to
> all the zone->watermark_boost being set to 0. This may cause
> pgdat_watermark_boosted to return false, which would not wakeup kswapd
> as intended by boosting the watermark. This behaviour is similar to
> waking up kswapd for a balanced node.

Not waking up kswapd shouldn't cause significant trouble.

> Also if kswapd was woken up successfully because of watermarks being
> boosted. In balance_pgdat, we use nr_boost_reclaim to count number of
> pages to reclaim because of boosting. nr_boost_reclaim is calculated as:
> nr_boost_reclaim = 0;
> for (i = 0; i <= classzone_idx; i++) {
>   zone = pgdat->node_zones + i;
>   if (!managed_zone(zone))
>   continue;
> 
>   nr_boost_reclaim += zone->watermark_boost;
>   zone_boosts[i] = zone->watermark_boost;
> }
> boosted = nr_boost_reclaim;
> 
> This is not under a zone_lock. This could lead to nr_boost_reclaim
> being 0 if min_free_kbytes is set to 0. Which would wake up kcompactd
> without reclaiming memory.

Setting min_free_kbytes to 0 is asking for problems regardless of this
check. Much more trouble than waking up kcompactd spuriously, which is
just a few wasted cpu cycles.

> kcompactd compaction might be spurious if the memory reclaim step is
> not happening?
> 
> Any thoughts?

Unless the races cause either some data corruption, or e.g. spurious
allocation failures, I don't think they are worth adding new spinlock
sections.

Thanks,
Vlastimil

>>  spin_unlock_irqrestore(&zone->lock, flags);
>>
>
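
The scenario discussed above can be sketched in plain C. This is only an illustration under stated assumptions: struct demo_zone and demo_nr_boost_reclaim() are hypothetical stand-ins for the quoted balance_pgdat() loop, with no locking at all, to show that a concurrent reset of the boosts makes the sum come out as 0.

```c
/* Cut-down stand-in for struct zone; only the fields the loop reads. */
struct demo_zone {
	unsigned long managed_pages;
	unsigned long watermark_boost;
};

/*
 * Sum the boosts of all managed zones up to and including classzone_idx,
 * recording each zone's boost, as the quoted loop does.
 */
static unsigned long demo_nr_boost_reclaim(const struct demo_zone *zones,
					   int classzone_idx,
					   unsigned long *zone_boosts)
{
	unsigned long nr_boost_reclaim = 0;
	int i;

	for (i = 0; i <= classzone_idx; i++) {
		if (!zones[i].managed_pages)
			continue;	/* mirrors the !managed_zone() check */
		nr_boost_reclaim += zones[i].watermark_boost;
		zone_boosts[i] = zones[i].watermark_boost;
	}
	return nr_boost_reclaim;
}
```

If the boosts are cleared between the wakeup and this loop, the function returns 0 and no boost reclaim is attempted, which is the harmless outcome described in the reply.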

[PATCH 01/17] mm: provide a print_vma_addr stub for !CONFIG_MMU

2019-06-23 Thread Christoph Hellwig

Signed-off-by: Christoph Hellwig 
Reviewed-by: Vladimir Murzin 
---
 include/linux/mm.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dd0b5f4e1e45..69843ee0c5f8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2756,7 +2756,13 @@ extern int randomize_va_space;
 #endif
 
 const char * arch_vma_name(struct vm_area_struct *vma);
+#ifdef CONFIG_MMU
 void print_vma_addr(char *prefix, unsigned long rip);
+#else
+static inline void print_vma_addr(char *prefix, unsigned long rip)
+{
+}
+#endif
 
 void *sparse_buffer_alloc(unsigned long size);
 struct page *sparse_mem_map_populate(unsigned long pnum, int nid,
-- 
2.20.1

[PATCH 10/17] riscv: read the hart ID from mhartid on boot

2019-06-23 Thread Christoph Hellwig

From: Damien Le Moal 

When in M-Mode, we can use the mhartid CSR to get the ID of the running
HART. Doing so, direct M-Mode boot without firmware is possible.

Signed-off-by: Damien Le Moal 
Signed-off-by: Christoph Hellwig 
---
 arch/riscv/kernel/head.S | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index e5fa5481aa99..a4c170e41a34 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -18,6 +18,14 @@ ENTRY(_start)
csrw CSR_XIE, zero
csrw CSR_XIP, zero
 
+#ifdef CONFIG_M_MODE
+   /*
+* The hartid in a0 is expected later on, and we have no firmware
+* to hand it to us.
+*/
+   csrr a0, mhartid
+#endif
+
/* Load the global pointer */
 .option push
 .option norelax
-- 
2.20.1

[PATCH 11/17] riscv: provide native clint access for M-mode

2019-06-23 Thread Christoph Hellwig

RISC-V has the concept of a cpu level interrupt controller.  Part of it
is exposed as bits in the status registers, and 2 new CSRs per privilege
level in the instruction set, but the mechanisms to trigger IPIs and
timer events, as well as reading the actual timer value, are not
specified in the RISC-V spec but usually delegated to a block of MMIO
registers.  This patch adds support for those MMIO registers in the
timer and IPI code.  For now only the SiFive layout, which is also used
by a few other implementations, is supported, but the code should be
easily extensible to others in the future.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/include/asm/clint.h| 40 +++
 arch/riscv/include/asm/timex.h| 17 
 arch/riscv/kernel/Makefile|  1 +
 arch/riscv/kernel/clint.c | 45 +++
 arch/riscv/kernel/setup.c |  2 ++
 arch/riscv/kernel/smp.c   | 24 +
 arch/riscv/kernel/smpboot.c   |  3 +++
 drivers/clocksource/timer-riscv.c | 16 ---
 8 files changed, 144 insertions(+), 4 deletions(-)
 create mode 100644 arch/riscv/include/asm/clint.h
 create mode 100644 arch/riscv/kernel/clint.c

diff --git a/arch/riscv/include/asm/clint.h b/arch/riscv/include/asm/clint.h
new file mode 100644
index ..46d182d9a4db
--- /dev/null
+++ b/arch/riscv/include/asm/clint.h
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef _ASM_CLINT_H
+#define _ASM_CLINT_H 1
+
+#include 
+
+#ifdef CONFIG_M_MODE
+extern u32 __iomem *clint_ipi_base;
+extern u64 __iomem *clint_time_val;
+extern u64 __iomem *clint_time_cmp;
+
+void clint_init_boot_cpu(void);
+
+static inline void clint_send_ipi(unsigned long hartid)
+{
+   writel(1, clint_ipi_base + hartid);
+}
+
+static inline void clint_clear_ipi(unsigned long hartid)
+{
+   writel(0, clint_ipi_base + hartid);
+}
+
+static inline u64 clint_read_timer(void)
+{
+   return readq_relaxed(clint_time_val);
+}
+
+static inline void clint_set_timer(unsigned long delta)
+{
+   writeq_relaxed(clint_read_timer() + delta,
+   clint_time_cmp + cpuid_to_hartid_map(smp_processor_id()));
+}
+
+#else
+#define clint_init_boot_cpu()  do { } while (0)
+#define clint_clear_ipi(hartid)    do { } while (0)
+#endif /* CONFIG_M_MODE */
+
+#endif /* _ASM_CLINT_H */
diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
index 6a703ec9d796..bf907997f107 100644
--- a/arch/riscv/include/asm/timex.h
+++ b/arch/riscv/include/asm/timex.h
@@ -10,6 +10,22 @@
 
 typedef unsigned long cycles_t;
 
+#ifdef CONFIG_M_MODE
+
+#include 
+#include 
+
+static inline cycles_t get_cycles(void)
+{
+#ifdef CONFIG_64BIT
+   return readq_relaxed(clint_time_val);
+#else
+   return readl_relaxed(clint_time_val);
+#endif
+}
+#define get_cycles get_cycles
+
+#else /* CONFIG_M_MODE */
 static inline cycles_t get_cycles_inline(void)
 {
cycles_t n;
@@ -40,6 +56,7 @@ static inline uint64_t get_cycles64(void)
return ((u64)hi << 32) | lo;
 }
 #endif
+#endif /* CONFIG_M_MODE */
 
 #define ARCH_HAS_READ_CURRENT_TIMER
 
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 2420d37d96de..f933c04f89db 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -29,6 +29,7 @@ obj-y += vdso.o
 obj-y  += cacheinfo.o
 obj-y  += vdso/
 
+obj-$(CONFIG_M_MODE)   += clint.o
 obj-$(CONFIG_FPU)  += fpu.o
 obj-$(CONFIG_SMP)  += smpboot.o
 obj-$(CONFIG_SMP)  += smp.o
diff --git a/arch/riscv/kernel/clint.c b/arch/riscv/kernel/clint.c
new file mode 100644
index ..15b9e7fa5416
--- /dev/null
+++ b/arch/riscv/kernel/clint.c
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2019 Christoph Hellwig.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * This is the layout used by the SiFive clint, which is also shared by the qemu
+ * virt platform, and the Kendryte K210 at least.
+ */
+#define CLINT_IPI_OFF  0
+#define CLINT_TIME_VAL_OFF 0xbff8
+#define CLINT_TIME_CMP_OFF 0x4000
+
+u32 __iomem *clint_ipi_base;
+u64 __iomem *clint_time_val;
+u64 __iomem *clint_time_cmp;
+
+void clint_init_boot_cpu(void)
+{
+   struct device_node *np;
+   void __iomem *base;
+
+   np = of_find_compatible_node(NULL, NULL, "riscv,clint0");
+   if (!np) {
+   panic("clint not found");
+   return;
+   }
+
+   base = of_iomap(np, 0);
+   if (!base)
+   panic("could not map CLINT");
+
+   clint_ipi_base = base + CLINT_IPI_OFF;
+   clint_time_val = base + CLINT_TIME_VAL_OFF;
+   clint_time_cmp = base + CLINT_TIME_CMP_OFF;
+
+   clint_clear_ipi(boot_cpu_hartid);
+}
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index b92e6831d1ec..2892d82f474c 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -17,6 +17,7 @@
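
As a sketch of the register layout the CLINT patch above relies on, the following computes the per-hart register addresses from a base address. The offsets are the ones in the patch; the helper names are hypothetical, and the base 0x2000000 used in the usage note is an assumption matching the qemu virt machine.

```c
#include <stdint.h>

#define CLINT_IPI_OFF		0x0000
#define CLINT_TIME_CMP_OFF	0x4000
#define CLINT_TIME_VAL_OFF	0xbff8

/* One 32-bit software-interrupt register per hart. */
static inline uintptr_t demo_clint_ipi_addr(uintptr_t base, unsigned long hartid)
{
	return base + CLINT_IPI_OFF + 4 * hartid;
}

/* One 64-bit timer compare register per hart. */
static inline uintptr_t demo_clint_time_cmp_addr(uintptr_t base,
						 unsigned long hartid)
{
	return base + CLINT_TIME_CMP_OFF + 8 * hartid;
}

/* A single 64-bit timer value register shared by all harts. */
static inline uintptr_t demo_clint_time_val_addr(uintptr_t base)
{
	return base + CLINT_TIME_VAL_OFF;
}
```

With a base of 0x2000000, hart 1's IPI register lands at 0x2000004 and hart 2's compare register at 0x2004010. The patch gets the same strides implicitly through u32 and u64 pointer arithmetic.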

[PATCH 15/17] riscv: use the correct interrupt levels for M-mode

2019-06-23 Thread Christoph Hellwig

The numerical levels for External/Timer/Software interrupts differ
between S-mode and M-mode.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/kernel/irq.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
index 804ff70bb853..9566aabbe50b 100644
--- a/arch/riscv/kernel/irq.c
+++ b/arch/riscv/kernel/irq.c
@@ -14,9 +14,15 @@
 /*
  * Possible interrupt causes:
  */
-#define INTERRUPT_CAUSE_SOFTWARE   IRQ_S_SOFT
-#define INTERRUPT_CAUSE_TIMER  IRQ_S_TIMER
-#define INTERRUPT_CAUSE_EXTERNAL   IRQ_S_EXT
+#ifdef CONFIG_M_MODE
+# define INTERRUPT_CAUSE_SOFTWARE  IRQ_M_SOFT
+# define INTERRUPT_CAUSE_TIMER IRQ_M_TIMER
+# define INTERRUPT_CAUSE_EXTERNAL  IRQ_M_EXT
+#else
+# define INTERRUPT_CAUSE_SOFTWARE  IRQ_S_SOFT
+# define INTERRUPT_CAUSE_TIMER IRQ_S_TIMER
+# define INTERRUPT_CAUSE_EXTERNAL  IRQ_S_EXT
+#endif /* CONFIG_M_MODE */
 
 int arch_show_interrupts(struct seq_file *p, int prec)
 {
-- 
2.20.1
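
For reference, a small sketch of the numbering difference the patch above addresses. The IRQ_* values follow the RISC-V privileged spec cause codes; the demo_ enum and function are hypothetical and select the mode at run time, whereas the patch selects it at build time via CONFIG_M_MODE.

```c
#include <stdbool.h>

/* Interrupt cause numbers from the RISC-V privileged specification. */
#define IRQ_S_SOFT	1
#define IRQ_M_SOFT	3
#define IRQ_S_TIMER	5
#define IRQ_M_TIMER	7
#define IRQ_S_EXT	9
#define IRQ_M_EXT	11

enum demo_irq_kind { DEMO_IRQ_SOFT, DEMO_IRQ_TIMER, DEMO_IRQ_EXT };

/* Return the cause number a kernel in the given mode has to match on. */
static inline int demo_irq_cause(bool m_mode, enum demo_irq_kind kind)
{
	static const int s_causes[] = { IRQ_S_SOFT, IRQ_S_TIMER, IRQ_S_EXT };
	static const int m_causes[] = { IRQ_M_SOFT, IRQ_M_TIMER, IRQ_M_EXT };

	return m_mode ? m_causes[kind] : s_causes[kind];
}
```

Each machine-level cause is the supervisor-level one plus two, so a handler written against the S-mode numbers would never match in M-mode.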

Re: [PATCH 1/3] include: linux: i2c: more helpers for declaring i2c drivers

2019-06-23 Thread Enrico Weigelt, metux IT consult

On 21.06.19 23:17, Wolfram Sang wrote:
> On Mon, Jun 17, 2019 at 08:39:37PM +0200, Enrico Weigelt, metux IT consult 
> wrote:
>> From: Enrico Weigelt 
>>
>> Add more helper macros for trivial driver init cases, similar to the
>> already existing module_i2c_driver()+friends - now for those which
>> are initialized at other stages (eg. by subsys_initcall()).
>>
>> This helps to further reduce driver init boilerplate.
> 
> Uh, no! Using subsys_initcall is an old fashioned hack to work around
> boot time dependencies. Unless there are very strong arguments, I
> usually do not accept them anymore. So, any simplification of that sends
> out the wrong message.

Okay, what's the correct initialization method then?
Just convert it to already existing module_i2c_driver() ?


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287

[PATCH 03/17] mm/nommu: fix the MAP_UNINITIALIZED flag

2019-06-23 Thread Christoph Hellwig

We can't expose UAPI symbols differently based on CONFIG_ symbols, as
userspace won't have them available.  Instead always define the flag,
but only respect it based on the config option.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Vladimir Murzin 
---
 arch/xtensa/include/uapi/asm/mman.h| 6 +-
 include/uapi/asm-generic/mman-common.h | 8 +++-
 mm/nommu.c | 4 +++-
 3 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index be726062412b..ebbb48842190 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -56,12 +56,8 @@
 #define MAP_STACK  0x4 /* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB 0x8 /* create a huge page mapping */
 #define MAP_FIXED_NOREPLACE 0x10   /* MAP_FIXED which doesn't unmap underlying mapping */
-#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
-# define MAP_UNINITIALIZED 0x400   /* For anonymous mmap, memory could be
+#define MAP_UNINITIALIZED 0x400/* For anonymous mmap, memory could be
 * uninitialized */
-#else
-# define MAP_UNINITIALIZED 0x0 /* Don't support this flag */
-#endif
 
 /*
  * Flags for msync
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index abd238d0f7a4..cb556b430e71 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -19,15 +19,13 @@
 #define MAP_TYPE   0x0f/* Mask for type of mapping */
 #define MAP_FIXED  0x10/* Interpret addr exactly */
 #define MAP_ANONYMOUS  0x20/* don't use a file */
-#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
-# define MAP_UNINITIALIZED 0x400   /* For anonymous mmap, memory could be uninitialized */
-#else
-# define MAP_UNINITIALIZED 0x0 /* Don't support this flag */
-#endif
 
 /* 0x0100 - 0x8 flags are defined in asm-generic/mman.h */
 #define MAP_FIXED_NOREPLACE 0x10/* MAP_FIXED which doesn't unmap underlying mapping */
 
+#define MAP_UNINITIALIZED 0x400/* For anonymous mmap, memory could be
+* uninitialized */
+
 /*
  * Flags for mlock
  */
diff --git a/mm/nommu.c b/mm/nommu.c
index d8c02fbe03b5..ec75a0dffd4f 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1349,7 +1349,9 @@ unsigned long do_mmap(struct file *file,
add_nommu_region(region);
 
/* clear anonymous mappings that don't ask for uninitialized data */
-   if (!vma->vm_file && !(flags & MAP_UNINITIALIZED))
+   if (!vma->vm_file &&
+   (!IS_ENABLED(CONFIG_MMAP_ALLOW_UNINITIALIZED) ||
+!(flags & MAP_UNINITIALIZED)))
memset((void *)region->vm_start, 0,
   region->vm_end - region->vm_start);
 
-- 
2.20.1
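
The decision the mm/nommu.c hunk above implements can be written as a small predicate. This is a sketch only: demo_must_clear() is a hypothetical helper, the allow_uninit parameter stands in for IS_ENABLED(CONFIG_MMAP_ALLOW_UNINITIALIZED), and DEMO_MAP_UNINITIALIZED is a placeholder flag value chosen for the demo.

```c
#include <stdbool.h>

/* Placeholder flag bit for the demo; not taken from the UAPI header. */
#define DEMO_MAP_UNINITIALIZED	0x4000000UL

/*
 * Decide whether a new mapping has to be zeroed.  The flag is always
 * defined, but it is only honoured when the kernel was built to allow
 * uninitialized anonymous mappings.
 */
static inline bool demo_must_clear(bool file_backed, unsigned long flags,
				   bool allow_uninit)
{
	if (file_backed)
		return false;	/* only anonymous mappings are cleared here */
	return !allow_uninit || !(flags & DEMO_MAP_UNINITIALIZED);
}
```

Userspace can therefore pass the flag unconditionally; on a kernel built without the option the request is ignored and the memory is still zeroed.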

[PATCH v2] tpm: Get TCG log from TPM2 ACPI table for tpm2 systems

2019-06-23 Thread Jordan Hand

For TPM2-based systems, retrieve the TCG log from the TPM2 ACPI table.

Signed-off-by: Jordan Hand 
---
v2:
- Apologies, v1 had a silly compile error

 drivers/char/tpm/eventlog/acpi.c | 67 +++-
 1 file changed, 48 insertions(+), 19 deletions(-)

diff --git a/drivers/char/tpm/eventlog/acpi.c b/drivers/char/tpm/eventlog/acpi.c
index 63ada5e53f13..b945c4ff3af6 100644
--- a/drivers/char/tpm/eventlog/acpi.c
+++ b/drivers/char/tpm/eventlog/acpi.c
@@ -41,17 +41,31 @@ struct acpi_tcpa {
};
 };
 
+struct acpi_tpm2 {
+   struct acpi_table_header hdr;
+   u16 platform_class;
+   u16 reserved;
+   u64 control_area_addr;
+   u32 start_method;
+   u8 start_method_params[12];
+   u32 log_max_len;
+   u64 log_start_addr;
+} __packed;
+
 /* read binary bios log */
 int tpm_read_log_acpi(struct tpm_chip *chip)
 {
-   struct acpi_tcpa *buff;
+   struct acpi_table_header *buff;
+   struct acpi_tcpa *tcpa;
+   struct acpi_tpm2 *tpm2;
+
acpi_status status;
void __iomem *virt;
u64 len, start;
+   int log_type;
struct tpm_bios_log *log;
-
-   if (chip->flags & TPM_CHIP_FLAG_TPM2)
-   return -ENODEV;
+   bool is_tpm2 = chip->flags & TPM_CHIP_FLAG_TPM2;
+   acpi_string table_sig;
 
log = &chip->log;
 
@@ -61,26 +75,41 @@ int tpm_read_log_acpi(struct tpm_chip *chip)
if (!chip->acpi_dev_handle)
return -ENODEV;
 
-   /* Find TCPA entry in RSDT (ACPI_LOGICAL_ADDRESSING) */
-   status = acpi_get_table(ACPI_SIG_TCPA, 1,
-   (struct acpi_table_header **)&buff);
+   /* Find TCPA or TPM2 entry in RSDT (ACPI_LOGICAL_ADDRESSING) */
+   table_sig = is_tpm2 ? ACPI_SIG_TPM2 : ACPI_SIG_TCPA;
+   status = acpi_get_table(table_sig, 1, &buff);
 
if (ACPI_FAILURE(status))
return -ENODEV;
 
-   switch(buff->platform_class) {
-   case BIOS_SERVER:
-   len = buff->server.log_max_len;
-   start = buff->server.log_start_addr;
-   break;
-   case BIOS_CLIENT:
-   default:
-   len = buff->client.log_max_len;
-   start = buff->client.log_start_addr;
-   break;
+   /* If log_max_len and log_start_addr are set, start_method_params will
+* be 12 bytes, according to TCG ACPI spec. If start_method_params is
+* fewer than 12 bytes, the TCG log is not available
+*/
+   if (is_tpm2 && (buff->length == sizeof(struct acpi_tpm2))) {
+   tpm2 = (struct acpi_tpm2 *)buff;
+   len = tpm2->log_max_len;
+   start = tpm2->log_start_addr;
+   log_type = EFI_TCG2_EVENT_LOG_FORMAT_TCG_2;
+   } else {
+   tcpa = (struct acpi_tcpa *)buff;
+   switch (tcpa->platform_class) {
+   case BIOS_SERVER:
+   len = tcpa->server.log_max_len;
+   start = tcpa->server.log_start_addr;
+   break;
+   case BIOS_CLIENT:
+   default:
+   len = tcpa->client.log_max_len;
+   start = tcpa->client.log_start_addr;
+   break;
+   }
+   log_type = EFI_TCG2_EVENT_LOG_FORMAT_TCG_1_2;
}
+
if (!len) {
-   dev_warn(&chip->dev, "%s: TCPA log area empty\n", __func__);
+   dev_warn(&chip->dev, "%s: %s log area empty\n",
+   __func__, table_sig);
return -EIO;
}
 
@@ -98,7 +127,7 @@ int tpm_read_log_acpi(struct tpm_chip *chip)
memcpy_fromio(log->bios_event_log, virt, len);
 
acpi_os_unmap_iomem(virt, len);
-   return EFI_TCG2_EVENT_LOG_FORMAT_TCG_1_2;
+   return log_type;
 
 err:
kfree(log->bios_event_log);
-- 
2.20.1
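
The length check in the TPM2 patch above depends on the exact size of the packed table. The sketch below reproduces that layout in userspace to make the arithmetic visible. The demo_ structs are stand-ins: the header is reduced to its 36-byte size with only the length field named, and __attribute__((packed)) assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for struct acpi_table_header: 36 bytes, length at offset 4. */
struct demo_acpi_header {
	char signature[4];
	uint32_t length;
	uint8_t rest[28];
} __attribute__((packed));

/* Same field order as struct acpi_tpm2 in the patch. */
struct demo_acpi_tpm2 {
	struct demo_acpi_header hdr;
	uint16_t platform_class;
	uint16_t reserved;
	uint64_t control_area_addr;
	uint32_t start_method;
	uint8_t start_method_params[12];
	uint32_t log_max_len;
	uint64_t log_start_addr;
} __attribute__((packed));

/* The log fields are only present when the table has the full length. */
static inline bool demo_tpm2_has_log(const struct demo_acpi_header *hdr)
{
	return hdr->length == sizeof(struct demo_acpi_tpm2);
}
```

With a 36-byte header the full table is 76 bytes, so a shorter table carries no log address and the check fails.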

[PATCH 17/17] riscv: add nommu support

2019-06-23 Thread Christoph Hellwig

The kernel runs in M-mode without using page tables, and thus can run
bare metal without help from additional firmware.

Most of the patch is just stubbing out code not needed without page
tables, but there is an interesting detail in the signals implementation:

 - The normal RISC-V syscall ABI only implements rt_sigreturn as VDSO
   entry point, but the ELF VDSO is not supported for nommu Linux.
   We instead copy the code to call the syscall onto the stack.

In addition to enabling the nommu code a new defconfig for a small
kernel image that can run in nommu mode on qemu is also provided.  To run
a kernel in qemu you can use the following command line:

qemu-system-riscv64 -smp 2 -m 64 -machine virt -nographic \
-kernel arch/riscv/boot/loader \
-drive file=rootfs.ext2,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0

Contains contributions from Damien Le Moal .

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/Kconfig  | 24 +---
 arch/riscv/configs/nommu_virt_defconfig | 78 +
 arch/riscv/include/asm/elf.h|  4 +-
 arch/riscv/include/asm/futex.h  |  6 ++
 arch/riscv/include/asm/io.h |  4 ++
 arch/riscv/include/asm/mmu.h|  3 +
 arch/riscv/include/asm/page.h   | 12 +++-
 arch/riscv/include/asm/pgalloc.h|  2 +
 arch/riscv/include/asm/pgtable.h| 38 
 arch/riscv/include/asm/tlbflush.h   |  7 ++-
 arch/riscv/include/asm/uaccess.h|  4 ++
 arch/riscv/kernel/Makefile  |  3 +-
 arch/riscv/kernel/entry.S   | 11 
 arch/riscv/kernel/head.S|  6 ++
 arch/riscv/kernel/signal.c  | 17 +-
 arch/riscv/lib/Makefile |  8 +--
 arch/riscv/mm/Makefile  |  3 +-
 arch/riscv/mm/cacheflush.c  |  2 +
 arch/riscv/mm/context.c |  2 +
 arch/riscv/mm/init.c|  2 +
 20 files changed, 200 insertions(+), 36 deletions(-)
 create mode 100644 arch/riscv/configs/nommu_virt_defconfig

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 2185481d1589..f36f337c7570 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -26,13 +26,13 @@ config RISCV
select GENERIC_IRQ_SHOW
select GENERIC_PCI_IOMAP
select GENERIC_SCHED_CLOCK
-   select GENERIC_STRNCPY_FROM_USER
-   select GENERIC_STRNLEN_USER
+   select GENERIC_STRNCPY_FROM_USER if MMU
+   select GENERIC_STRNLEN_USER if MMU
select GENERIC_SMP_IDLE_THREAD
select GENERIC_ATOMIC64 if !64BIT
select HAVE_ARCH_AUDITSYSCALL
select HAVE_MEMBLOCK_NODE_MAP
-   select HAVE_DMA_CONTIGUOUS
+   select HAVE_DMA_CONTIGUOUS if MMU
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_PERF_EVENTS
select HAVE_SYSCALL_TRACEPOINTS
@@ -47,6 +47,7 @@ config RISCV
select PCI_DOMAINS_GENERIC if PCI
select PCI_MSI if PCI
select RISCV_TIMER
+   select UACCESS_MEMCPY if !MMU
select GENERIC_IRQ_MULTI_HANDLER
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_MMIOWB
@@ -55,9 +56,14 @@ config RISCV
 # set if we run in machine mode, cleared if we run in supervisor mode
 config M_MODE
bool
+   default y if !MMU
 
 config MMU
-   def_bool y
+   bool "MMU-based Paged Memory Management Support"
+   default y
+   help
+ Select if you want MMU-based virtualised addressing space
+ support by paged memory management. If unsure, say 'Y'.
 
 config ZONE_DMA32
bool
@@ -66,6 +72,7 @@ config ZONE_DMA32
 config PAGE_OFFSET
hex
default 0xC000 if 32BIT && MAXPHYSMEM_2GB
+   default 0x8000 if 64BIT && !MMU
default 0x8000 if 64BIT && MAXPHYSMEM_2GB
default 0xffe0 if 64BIT && MAXPHYSMEM_128GB
 
@@ -93,7 +100,7 @@ config GENERIC_HWEIGHT
def_bool y
 
 config FIX_EARLYCON_MEM
-   def_bool y
+   def_bool CONFIG_MMU
 
 config PGTABLE_LEVELS
int
@@ -116,6 +123,7 @@ config ARCH_RV32I
select GENERIC_LIB_ASHRDI3
select GENERIC_LIB_LSHRDI3
select GENERIC_LIB_UCMPDI2
+   select MMU
 
 config ARCH_RV64I
bool "RV64I"
@@ -124,9 +132,9 @@ config ARCH_RV64I
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FTRACE_MCOUNT_RECORD
-   select HAVE_DYNAMIC_FTRACE
-   select HAVE_DYNAMIC_FTRACE_WITH_REGS
-   select SWIOTLB
+   select HAVE_DYNAMIC_FTRACE if MMU
+   select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
+   select SWIOTLB if MMU
 
 endchoice
 
diff --git a/arch/riscv/configs/nommu_virt_defconfig b/arch/riscv/configs/nommu_virt_defconfig
new file mode 100644
index ..cf74e179bf90
--- /dev/null
+++ b/arch/riscv/configs/nommu_virt_defconfig
@@ -0,0 +1,78 @@
+# CONFIG_CPU_ISOLATION is not set
+CONFIG_LOG_BUF_SHIFT=16

[PATCH 05/17] riscv: use CSR_SATP instead of the legacy sptbr name in switch_mm

2019-06-23 Thread Christoph Hellwig

Switch to our own constant for the satp register instead of using
the old name from a legacy version of the privileged spec.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/mm/context.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 89ceb3cbe218..beeb5d7f92ea 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -57,12 +57,7 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
cpumask_clear_cpu(cpu, mm_cpumask(prev));
cpumask_set_cpu(cpu, mm_cpumask(next));
 
-   /*
-* Use the old spbtr name instead of using the current satp
-* name to support binutils 2.29 which doesn't know about the
-* privileged ISA 1.10 yet.
-*/
-   csr_write(sptbr, virt_to_pfn(next->pgd) | SATP_MODE);
+   csr_write(CSR_SATP, virt_to_pfn(next->pgd) | SATP_MODE);
local_flush_tlb_all();
 
flush_icache_deferred(next);
-- 
2.20.1

[PATCH 14/17] riscv: don't allow selecting SBI-based drivers for M-mode

2019-06-23 Thread Christoph Hellwig

From: Damien Le Moal 

Do not allow selecting SBI-related options when the MMU option is not set.

Signed-off-by: Damien Le Moal 
Signed-off-by: Christoph Hellwig 
---
 drivers/tty/hvc/Kconfig| 2 +-
 drivers/tty/serial/Kconfig | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/Kconfig b/drivers/tty/hvc/Kconfig
index 4d22b91f..5a1ab6b536ff 100644
--- a/drivers/tty/hvc/Kconfig
+++ b/drivers/tty/hvc/Kconfig
@@ -89,7 +89,7 @@ config HVC_DCC
 
 config HVC_RISCV_SBI
bool "RISC-V SBI console support"
-   depends on RISCV
+   depends on RISCV && !M_MODE
select HVC_DRIVER
help
  This enables support for console output via RISC-V SBI calls, which
diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
index 0d31251e04cc..59dba9f9e466 100644
--- a/drivers/tty/serial/Kconfig
+++ b/drivers/tty/serial/Kconfig
@@ -88,7 +88,7 @@ config SERIAL_EARLYCON_ARM_SEMIHOST
 
 config SERIAL_EARLYCON_RISCV_SBI
bool "Early console using RISC-V SBI"
-   depends on RISCV
+   depends on RISCV && !M_MODE
select SERIAL_CORE
select SERIAL_CORE_CONSOLE
select SERIAL_EARLYCON
-- 
2.20.1

[PATCH 16/17] riscv: clear the instruction cache and all registers when booting

2019-06-23 Thread Christoph Hellwig

When we get booted we want a clear slate without any leaks from previous
supervisors or the firmware.  Flush the instruction cache and then clear
all registers to known good values.  This is really important for the
upcoming nommu support that runs on M-mode, but can't really harm when
running in S-mode either.  Vaguely based on the concepts from opensbi.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/kernel/head.S | 85 
 1 file changed, 85 insertions(+)

diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index a4c170e41a34..74feb17737b4 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 __INIT
 ENTRY(_start)
@@ -19,6 +20,12 @@ ENTRY(_start)
csrw CSR_XIP, zero
 
 #ifdef CONFIG_M_MODE
+   /* flush the instruction cache */
+   fence.i
+
+   /* Reset all registers except ra, a0, a1 */
+   call reset_regs
+
/*
 * The hartid in a0 is expected later on, and we have no firmware
 * to hand it to us.
@@ -168,6 +175,84 @@ relocate:
j .Lsecondary_park
 END(_start)
 
+#ifdef CONFIG_M_MODE
+ENTRY(reset_regs)
+   li  sp, 0
+   li  gp, 0
+   li  tp, 0
+   li  t0, 0
+   li  t1, 0
+   li  t2, 0
+   li  s0, 0
+   li  s1, 0
+   li  a2, 0
+   li  a3, 0
+   li  a4, 0
+   li  a5, 0
+   li  a6, 0
+   li  a7, 0
+   li  s2, 0
+   li  s3, 0
+   li  s4, 0
+   li  s5, 0
+   li  s6, 0
+   li  s7, 0
+   li  s8, 0
+   li  s9, 0
+   li  s10, 0
+   li  s11, 0
+   li  t3, 0
+   li  t4, 0
+   li  t5, 0
+   li  t6, 0
+   csrw sscratch, 0
+
+#ifdef CONFIG_FPU
+   csrr t0, misa
+   andi t0, t0, (COMPAT_HWCAP_ISA_F | COMPAT_HWCAP_ISA_D)
+   bnez t0, .Lreset_regs_done
+
+   li  t1, SR_FS
+   csrs sstatus, t1
+   fmv.s.x f0, zero
+   fmv.s.x f1, zero
+   fmv.s.x f2, zero
+   fmv.s.x f3, zero
+   fmv.s.x f4, zero
+   fmv.s.x f5, zero
+   fmv.s.x f6, zero
+   fmv.s.x f7, zero
+   fmv.s.x f8, zero
+   fmv.s.x f9, zero
+   fmv.s.x f10, zero
+   fmv.s.x f11, zero
+   fmv.s.x f12, zero
+   fmv.s.x f13, zero
+   fmv.s.x f14, zero
+   fmv.s.x f15, zero
+   fmv.s.x f16, zero
+   fmv.s.x f17, zero
+   fmv.s.x f18, zero
+   fmv.s.x f19, zero
+   fmv.s.x f20, zero
+   fmv.s.x f21, zero
+   fmv.s.x f22, zero
+   fmv.s.x f23, zero
+   fmv.s.x f24, zero
+   fmv.s.x f25, zero
+   fmv.s.x f26, zero
+   fmv.s.x f27, zero
+   fmv.s.x f28, zero
+   fmv.s.x f29, zero
+   fmv.s.x f30, zero
+   fmv.s.x f31, zero
+   csrw fcsr, 0
+#endif /* CONFIG_FPU */
+.Lreset_regs_done:
+   ret
+END(reset_regs)
+#endif /* CONFIG_M_MODE */
+
 __PAGE_ALIGNED_BSS
/* Empty zero page */
.balign PAGE_SIZE
-- 
2.20.1

[PATCH 08/17] riscv: improve the default power off implementation

2019-06-23 Thread Christoph Hellwig

Only call the SBI code if we are not running in M-mode.  If we didn't
make the SBI call, or it didn't succeed, call wfi in a loop to at least
save some power.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/kernel/reset.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
index d0fe623bfb8f..2f5ca379747e 100644
--- a/arch/riscv/kernel/reset.c
+++ b/arch/riscv/kernel/reset.c
@@ -8,8 +8,11 @@
 
 static void default_power_off(void)
 {
+#ifndef CONFIG_M_MODE
sbi_shutdown();
-   while (1);
+#endif
+   while (1)
+   wait_for_interrupt();
 }
 
 void (*pm_power_off)(void) = default_power_off;
-- 
2.20.1

[PATCH 12/17] riscv: implement remote sfence.i natively for M-mode

2019-06-23 Thread Christoph Hellwig

The RISC-V ISA only supports flushing the instruction cache for the local
CPU core.  For normal S-mode Linux remote flushing is offloaded to
machine mode using ecalls, but for M-mode Linux we'll have to do it
ourselves.  Use the same implementation as all the existing open source
SBI implementations by just doing an IPI to all remote cores to execute
the fence.i instruction on every live core.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/mm/cacheflush.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 9ebcff8ba263..10875ea1065e 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -10,10 +10,35 @@
 
 #include 
 
+#ifdef CONFIG_M_MODE
+static void ipi_remote_fence_i(void *info)
+{
+   return local_flush_icache_all();
+}
+
+void flush_icache_all(void)
+{
+   on_each_cpu(ipi_remote_fence_i, NULL, 1);
+}
+
+static void flush_icache_cpumask(const cpumask_t *mask)
+{
+   on_each_cpu_mask(mask, ipi_remote_fence_i, NULL, 1);
+}
+#else /* CONFIG_M_MODE */
 void flush_icache_all(void)
 {
sbi_remote_fence_i(NULL);
 }
+static void flush_icache_cpumask(const cpumask_t *mask)
+{
+   cpumask_t hmask;
+
+   cpumask_clear(&hmask);
+   riscv_cpuid_to_hartid_mask(mask, &hmask);
+   sbi_remote_fence_i(hmask.bits);
+}
+#endif /* CONFIG_M_MODE */
 
 /*
  * Performs an icache flush for the given MM context.  RISC-V has no direct
@@ -28,7 +53,7 @@ void flush_icache_all(void)
 void flush_icache_mm(struct mm_struct *mm, bool local)
 {
unsigned int cpu;
-   cpumask_t others, hmask, *mask;
+   cpumask_t others, *mask;
 
preempt_disable();
 
@@ -47,9 +72,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
cpumask_andnot(&others, mm_cpumask(mm), cpumask_of(cpu));
local |= cpumask_empty(&others);
if (mm != current->active_mm || !local) {
-   cpumask_clear(&hmask);
-   riscv_cpuid_to_hartid_mask(&others, &hmask);
-   sbi_remote_fence_i(hmask.bits);
+   flush_icache_cpumask(&others);
} else {
/*
 * It's assumed that at least one strongly ordered operation is
-- 
2.20.1

[PATCH 07/17] riscv: abstract out CSR names for supervisor vs machine mode

2019-06-23 Thread Christoph Hellwig

Many of the privileged CSRs exist in a supervisor and machine version
that are used very similarly.  Provide a new X-naming layer so that
we don't have to ifdef everywhere for M-mode Linux support.

Contains contributions from Damien Le Moal .

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/Kconfig |  4 ++
 arch/riscv/include/asm/asm.h   |  6 +++
 arch/riscv/include/asm/csr.h   | 58 ++--
 arch/riscv/include/asm/irqflags.h  | 12 +++---
 arch/riscv/include/asm/processor.h |  2 +-
 arch/riscv/include/asm/ptrace.h| 16 
 arch/riscv/include/asm/switch_to.h |  8 ++--
 arch/riscv/kernel/asm-offsets.c|  8 ++--
 arch/riscv/kernel/entry.S  | 62 --
 arch/riscv/kernel/fpu.S|  8 ++--
 arch/riscv/kernel/head.S   | 12 +++---
 arch/riscv/kernel/irq.c|  4 +-
 arch/riscv/kernel/process.c| 15 
 arch/riscv/kernel/signal.c | 21 +-
 arch/riscv/kernel/traps.c  | 16 
 arch/riscv/lib/uaccess.S   | 12 +++---
 arch/riscv/mm/extable.c|  4 +-
 arch/riscv/mm/fault.c  |  6 +--
 drivers/clocksource/timer-riscv.c  |  8 ++--
 drivers/irqchip/irq-sifive-plic.c  |  4 +-
 20 files changed, 177 insertions(+), 109 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 2c19baa8d6c3..2185481d1589 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -52,6 +52,10 @@ config RISCV
select ARCH_HAS_MMIOWB
select HAVE_EBPF_JIT if 64BIT
 
+# set if we run in machine mode, cleared if we run in supervisor mode
+config M_MODE
+   bool
+
 config MMU
def_bool y
 
diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
index 5a02b7d50940..14604f01e9f8 100644
--- a/arch/riscv/include/asm/asm.h
+++ b/arch/riscv/include/asm/asm.h
@@ -65,4 +65,10 @@
 #error "Unexpected __SIZEOF_SHORT__"
 #endif
 
+#ifdef CONFIG_M_MODE
+# define Xret  mret
+#else
+# define Xret  sret
+#endif
+
 #endif /* _ASM_RISCV_ASM_H */
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index a18923fa23c8..026a761835b7 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -11,8 +11,11 @@
 
 /* Status register flags */
 #define SR_SIE _AC(0x0002, UL) /* Supervisor Interrupt Enable */
+#define SR_MIE _AC(0x0008, UL) /* Machine Interrupt Enable */
 #define SR_SPIE_AC(0x0020, UL) /* Previous Supervisor IE */
+#define SR_MPIE_AC(0x0080, UL) /* Previous Machine IE */
 #define SR_SPP _AC(0x0100, UL) /* Previously Supervisor */
+#define SR_MPP _AC(0x1800, UL) /* Previously Machine */
 #define SR_SUM _AC(0x0004, UL) /* Supervisor User Memory Access */
 
 #define SR_FS  _AC(0x6000, UL) /* Floating-point Status */
@@ -44,8 +47,8 @@
 #define SATP_MODE  SATP_MODE_39
 #endif
 
-/* SCAUSE */
-#define SCAUSE_IRQ_FLAG(_AC(1, UL) << (__riscv_xlen - 1))
+/* *CAUSE */
+#define XCAUSE_IRQ_FLAG(_AC(1, UL) << (__riscv_xlen - 1))
 
 #define IRQ_U_SOFT 0
 #define IRQ_S_SOFT 1
@@ -67,11 +70,26 @@
 #define EXC_LOAD_PAGE_FAULT13
 #define EXC_STORE_PAGE_FAULT   15
 
-/* SIE (Interrupt Enable) and SIP (Interrupt Pending) flags */
+/* MIE / MIP flags: */
+#define MIE_MSIE   (_AC(0x1, UL) << IRQ_M_SOFT)
+#define MIE_MTIE   (_AC(0x1, UL) << IRQ_M_TIMER)
+#define MIE_MEIE   (_AC(0x1, UL) << IRQ_M_EXT)
+
+/* SIE / SIP flags: */
 #define SIE_SSIE   (_AC(0x1, UL) << IRQ_S_SOFT)
 #define SIE_STIE   (_AC(0x1, UL) << IRQ_S_TIMER)
 #define SIE_SEIE   (_AC(0x1, UL) << IRQ_S_EXT)
 
+/* symbolic CSR names: */
+#define CSR_MSTATUS0x300
+#define CSR_MIE0x304
+#define CSR_MTVEC  0x305
+#define CSR_MSCRATCH   0x340
+#define CSR_MEPC   0x341
+#define CSR_MCAUSE 0x342
+#define CSR_MTVAL  0x343
+#define CSR_MIP0x344
+
 #define CSR_CYCLE  0xc00
 #define CSR_TIME   0xc01
 #define CSR_INSTRET0xc02
@@ -89,6 +107,40 @@
 #define CSR_TIMEH  0xc81
 #define CSR_INSTRETH   0xc82
 
+#ifdef CONFIG_M_MODE
+# define CSR_XSTATUS   CSR_MSTATUS
+# define CSR_XIE   CSR_MIE
+# define CSR_XTVEC CSR_MTVEC
+# define CSR_XSCRATCH  CSR_MSCRATCH
+# define CSR_XEPC  CSR_MEPC
+# define CSR_XCAUSECSR_MCAUSE
+# define CSR_XTVAL CSR_MTVAL
+# define CSR_XIP   CSR_MIP
+
+# define SR_XIESR_MIE
+# define SR_XPIE   SR_MPIE
+# define SR_XPPSR_MPP
+
+# define XIE_XTIE  MIE_MTIE
+# define XIE_XEIE  MIE_MEIE
+#else /* CONFIG_M_MODE */
+# define CSR_XSTATUS   CSR_SSTATUS
+# define CSR_XIE   CSR_SIE
+# define CSR_XTVEC CSR_STVEC
+# define CSR_XSCRATCH

[PATCH 13/17] riscv: poison SBI calls for M-mode

2019-06-23 Thread Christoph Hellwig

There is no SBI when we run in M-mode, so fail the compile for any code
trying to use SBI calls.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/include/asm/sbi.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 21134b3ef404..1e17f07eadaf 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -8,6 +8,7 @@
 
 #include 
 
+#ifndef CONFIG_M_MODE
 #define SBI_SET_TIMER 0
 #define SBI_CONSOLE_PUTCHAR 1
 #define SBI_CONSOLE_GETCHAR 2
@@ -94,4 +95,5 @@ static inline void sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
SBI_CALL_4(SBI_REMOTE_SFENCE_VMA_ASID, hart_mask, start, size, asid);
 }
 
-#endif
+#endif /* CONFIG_M_MODE */
+#endif /* _ASM_RISCV_SBI_H */
-- 
2.20.1

[PATCH 09/17] riscv: provide a flat entry loader

2019-06-23 Thread Christoph Hellwig

This allows just loading the kernel at a pre-set address without
qemu going bonkers trying to map the ELF file.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/Makefile| 13 +
 arch/riscv/boot/Makefile   |  7 ++-
 arch/riscv/boot/loader.S   |  8 
 arch/riscv/boot/loader.lds | 14 ++
 4 files changed, 37 insertions(+), 5 deletions(-)
 create mode 100644 arch/riscv/boot/loader.S
 create mode 100644 arch/riscv/boot/loader.lds

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 6b0741c9f348..69dbb6cb72f3 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -84,13 +84,18 @@ PHONY += vdso_install
 vdso_install:
$(Q)$(MAKE) $(build)=arch/riscv/kernel/vdso $@
 
-all: Image.gz
+ifeq ($(CONFIG_M_MODE),y)
+KBUILD_IMAGE := $(boot)/loader
+else
+KBUILD_IMAGE := $(boot)/Image.gz
+endif
+BOOT_TARGETS := Image Image.gz loader
 
-Image: vmlinux
-   $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
+all:   $(notdir $(KBUILD_IMAGE))
 
-Image.%: Image
+$(BOOT_TARGETS): vmlinux
$(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
+   @$(kecho) '  Kernel: $(boot)/$@ is ready'
 
 zinstall install:
$(Q)$(MAKE) $(build)=$(boot) $@
diff --git a/arch/riscv/boot/Makefile b/arch/riscv/boot/Makefile
index 0990a9fdbe5d..32d2addeddba 100644
--- a/arch/riscv/boot/Makefile
+++ b/arch/riscv/boot/Makefile
@@ -16,7 +16,7 @@
 
 OBJCOPYFLAGS_Image :=-O binary -R .note -R .note.gnu.build-id -R .comment -S
 
-targets := Image
+targets := Image loader
 
 $(obj)/Image: vmlinux FORCE
$(call if_changed,objcopy)
@@ -24,6 +24,11 @@ $(obj)/Image: vmlinux FORCE
 $(obj)/Image.gz: $(obj)/Image FORCE
$(call if_changed,gzip)
 
+loader.o: $(src)/loader.S $(obj)/Image
+
+$(obj)/loader: $(obj)/loader.o $(obj)/Image FORCE
+   $(Q)$(LD) -T $(src)/loader.lds -o $@ $(obj)/loader.o
+
 install:
$(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
$(obj)/Image System.map "$(INSTALL_PATH)"
diff --git a/arch/riscv/boot/loader.S b/arch/riscv/boot/loader.S
new file mode 100644
index ..5586e2610dbb
--- /dev/null
+++ b/arch/riscv/boot/loader.S
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+
+   .align 4
+   .section .payload, "ax", %progbits
+   .globl _start
+_start:
+   .incbin "arch/riscv/boot/Image"
+
diff --git a/arch/riscv/boot/loader.lds b/arch/riscv/boot/loader.lds
new file mode 100644
index ..da9efd57bf44
--- /dev/null
+++ b/arch/riscv/boot/loader.lds
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+OUTPUT_ARCH(riscv)
+ENTRY(_start)
+
+SECTIONS
+{
+   . = 0x8000;
+
+   .payload : {
+   *(.payload)
+   . = ALIGN(8);
+   }
+}
-- 
2.20.1

[PATCH 06/17] riscv: refactor the IPI code

2019-06-23 Thread Christoph Hellwig

This prepares for adding native non-SBI IPI code.

Signed-off-by: Christoph Hellwig 
---
 arch/riscv/kernel/smp.c | 55 +++--
 1 file changed, 31 insertions(+), 24 deletions(-)

diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 5a9834503a2f..8cd730239613 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -78,13 +78,38 @@ static void ipi_stop(void)
wait_for_interrupt();
 }
 
+static void send_ipi_mask(const struct cpumask *mask, enum ipi_message_type op)
+{
+   int cpuid, hartid;
+   struct cpumask hartid_mask;
+
+   cpumask_clear(_mask);
+   mb();
+   for_each_cpu(cpuid, mask) {
+   set_bit(op, _data[cpuid].bits);
+   hartid = cpuid_to_hartid_map(cpuid);
+   cpumask_set_cpu(hartid, _mask);
+   }
+   mb();
+   sbi_send_ipi(cpumask_bits(_mask));
+}
+
+static void send_ipi_single(int cpu, enum ipi_message_type op)
+{
+   send_ipi_mask(cpumask_of(cpu), op);
+}
+
+static inline void clear_ipi(void)
+{
+   csr_clear(CSR_SIP, SIE_SSIE);
+}
+
 void riscv_software_interrupt(void)
 {
unsigned long *pending_ipis = _data[smp_processor_id()].bits;
unsigned long *stats = ipi_data[smp_processor_id()].stats;
 
-   /* Clear pending IPI */
-   csr_clear(CSR_SIP, SIE_SSIE);
+   clear_ipi();
 
while (true) {
unsigned long ops;
@@ -118,23 +143,6 @@ void riscv_software_interrupt(void)
}
 }
 
-static void
-send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation)
-{
-   int cpuid, hartid;
-   struct cpumask hartid_mask;
-
-   cpumask_clear(_mask);
-   mb();
-   for_each_cpu(cpuid, to_whom) {
-   set_bit(operation, _data[cpuid].bits);
-   hartid = cpuid_to_hartid_map(cpuid);
-   cpumask_set_cpu(hartid, _mask);
-   }
-   mb();
-   sbi_send_ipi(cpumask_bits(_mask));
-}
-
 static const char * const ipi_names[] = {
[IPI_RESCHEDULE]= "Rescheduling interrupts",
[IPI_CALL_FUNC] = "Function call interrupts",
@@ -156,12 +164,12 @@ void show_ipi_stats(struct seq_file *p, int prec)
 
 void arch_send_call_function_ipi_mask(struct cpumask *mask)
 {
-   send_ipi_message(mask, IPI_CALL_FUNC);
+   send_ipi_mask(mask, IPI_CALL_FUNC);
 }
 
 void arch_send_call_function_single_ipi(int cpu)
 {
-   send_ipi_message(cpumask_of(cpu), IPI_CALL_FUNC);
+   send_ipi_single(cpu, IPI_CALL_FUNC);
 }
 
 void smp_send_stop(void)
@@ -176,7 +184,7 @@ void smp_send_stop(void)
 
if (system_state <= SYSTEM_RUNNING)
pr_crit("SMP: stopping secondary CPUs\n");
-   send_ipi_message(, IPI_CPU_STOP);
+   send_ipi_mask(, IPI_CPU_STOP);
}
 
/* Wait up to one second for other CPUs to stop */
@@ -191,6 +199,5 @@ void smp_send_stop(void)
 
 void smp_send_reschedule(int cpu)
 {
-   send_ipi_message(cpumask_of(cpu), IPI_RESCHEDULE);
+   send_ipi_single(cpu, IPI_RESCHEDULE);
 }
-
-- 
2.20.1
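
The shape of the refactored helpers can be sketched in plain user-space C. This is only an illustration under stated assumptions: the sketch_* names, the four-CPU limit and the CPU-to-hart table are hypothetical, the masks are plain integers rather than struct cpumask, and the memory barriers and the final sbi_send_ipi() call of the real code are left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4

enum ipi_message_type { IPI_RESCHEDULE, IPI_CALL_FUNC, IPI_CPU_STOP };

/* Hypothetical logical-CPU to hart-ID table (hart 0 is not used here). */
static const unsigned int sketch_hartid_map[SKETCH_NR_CPUS] = { 1, 2, 3, 4 };

/* Per-CPU pending-operation bits, standing in for ipi_data[cpu].bits. */
unsigned long sketch_pending[SKETCH_NR_CPUS];

/*
 * Mark 'op' pending on every CPU in cpu_mask and return the matching
 * hart mask, which the kernel would hand to the IPI transport.
 */
uint64_t sketch_send_ipi_mask(uint64_t cpu_mask, enum ipi_message_type op)
{
	uint64_t hart_mask = 0;

	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (!(cpu_mask & (UINT64_C(1) << cpu)))
			continue;
		sketch_pending[cpu] |= 1UL << op;
		hart_mask |= UINT64_C(1) << sketch_hartid_map[cpu];
	}
	return hart_mask;
}

/* The single-CPU variant is a thin wrapper around the mask variant. */
uint64_t sketch_send_ipi_single(unsigned int cpu, enum ipi_message_type op)
{
	return sketch_send_ipi_mask(UINT64_C(1) << cpu, op);
}
```

Keeping the transport behind two small helpers means a later non-SBI implementation only has to replace the last step of the mask variant.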

[PATCH 04/17] irqchip/sifive-plic: set max threshold for ignored handlers

2019-06-23 Thread Christoph Hellwig

When running in M-mode we still see the S-mode PLIC handlers in the DT.
Ignore them by setting the maximum threshold.

Signed-off-by: Christoph Hellwig 
---
 drivers/irqchip/irq-sifive-plic.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index cf755964f2f8..c72c036aea76 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -244,6 +244,7 @@ static int __init plic_init(struct device_node *node,
struct plic_handler *handler;
irq_hw_number_t hwirq;
int cpu, hartid;
+   u32 threshold = 0;
 
if (of_irq_parse_one(node, i, )) {
pr_err("failed to parse parent for context %d.\n", i);
@@ -266,10 +267,16 @@ static int __init plic_init(struct device_node *node,
continue;
}
 
+   /*
+* When running in M-mode we need to ignore the S-mode handler.
+* Here we assume it always comes later, but that might be a
+* little fragile.
+*/
handler = per_cpu_ptr(_handlers, cpu);
if (handler->present) {
pr_warn("handler already present for context %d.\n", i);
-   continue;
+   threshold = 0x;
+   goto done;
}
 
handler->present = true;
@@ -279,8 +286,9 @@ static int __init plic_init(struct device_node *node,
handler->enable_base =
plic_regs + ENABLE_BASE + i * ENABLE_PER_HART;
 
+done:
/* priority must be > threshold to trigger an interrupt */
-   writel(0, handler->hart_base + CONTEXT_THRESHOLD);
+   writel(threshold, handler->hart_base + CONTEXT_THRESHOLD);
for (hwirq = 1; hwirq <= nr_irqs; hwirq++)
plic_toggle(handler, hwirq, 0);
nr_handlers++;
-- 
2.20.1
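
The rule the patch relies on ("priority must be > threshold to trigger an interrupt") can be shown with a tiny sketch. The maximum priority of 7 and the sketch_* names are assumptions made for this example only; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: interrupt priorities range from 0 to 7. */
#define SKETCH_MAX_PRIORITY 7u

/* A context only sees an interrupt whose priority exceeds its threshold. */
bool sketch_plic_delivers(uint32_t priority, uint32_t threshold)
{
	return priority > threshold;
}

/* Count how many priority levels can still reach a context. */
unsigned int sketch_deliverable_levels(uint32_t threshold)
{
	unsigned int n = 0;

	for (uint32_t prio = 1; prio <= SKETCH_MAX_PRIORITY; prio++)
		if (sketch_plic_delivers(prio, threshold))
			n++;
	return n;
}
```

With a threshold of 0 every non-zero priority gets through; with the threshold at the maximum nothing does, which is how an unwanted context is silenced.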

[PATCH 02/17] mm: stub out all of swapops.h for !CONFIG_MMU

2019-06-23 Thread Christoph Hellwig

The whole header file deals with swap entries and PTEs, none of which
can exist for nommu builds.

Signed-off-by: Christoph Hellwig 
---
 include/linux/swapops.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 4d961668e5fc..b02922556846 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -6,6 +6,8 @@
 #include 
 #include 
 
+#ifdef CONFIG_MMU
+
 /*
  * swapcache pages are stored in the swapper_space radix tree.  We want to
  * get good packing density in that tree, so the index should be dense in
@@ -50,13 +52,11 @@ static inline pgoff_t swp_offset(swp_entry_t entry)
return entry.val & SWP_OFFSET_MASK;
 }
 
-#ifdef CONFIG_MMU
 /* check whether a pte points to a swap entry */
 static inline int is_swap_pte(pte_t pte)
 {
return !pte_none(pte) && !pte_present(pte);
 }
-#endif
 
 /*
  * Convert the arch-dependent pte representation of a swp_entry_t into an
@@ -375,4 +375,5 @@ static inline int non_swap_entry(swp_entry_t entry)
 }
 #endif
 
+#endif /* CONFIG_MMU */
 #endif /* _LINUX_SWAPOPS_H */
-- 
2.20.1

RISC-V nommu support v2

2019-06-23 Thread Christoph Hellwig

Hi all,

below is a series to support nommu mode on RISC-V.  For now this series
just works under qemu with the qemu-virt platform, but Damien has also
been able to get a kernel based on this tree with additional driver hacks
to work on the Kendryte KD210, but that will take a while to clean up
and upstream.

To be useful this series also requires the RISC-V binfmt_flat support,
which I've sent out separately.

A branch that includes this series and the binfmt_flat support is
available here:

git://git.infradead.org/users/hch/riscv.git riscv-nommu.2

Gitweb:


http://git.infradead.org/users/hch/riscv.git/shortlog/refs/heads/riscv-nommu.2

I've also pushed out a buildroot branch that can build a RISC-V nommu
root filesystem here:

   git://git.infradead.org/users/hch/buildroot.git riscv-nommu.2

Gitweb:

   
http://git.infradead.org/users/hch/buildroot.git/shortlog/refs/heads/riscv-nommu.2

Changes since v1:
 - fixes so that a kernel with this series still works on builds with an
   IOMMU
 - small clint cleanups
 - the binfmt_flat base and buildroot now don't put arguments on the stack

Re: [PATCH] usb: dwc2: use a longer AHB idle timeout in dwc2_core_reset()

2019-06-23 Thread Minas Harutyunyan


On 6/20/2019 9:51 PM, Martin Blumenstingl wrote:

Use a 1us AHB idle timeout in dwc2_core_reset() and make it
consistent with the other "wait for AHB master IDLE state" occurrences.

This fixes a problem for me where dwc2 would not want to initialize when
updating to 4.19 on a MIPS Lantiq VRX200 SoC. dwc2 worked fine with
4.14.
Testing on my board shows that it takes 180us until AHB master IDLE
state is signalled. The very old vendor driver for this SoC (ifxhcd)
used a 1 second timeout.
Use the same timeout that is used everywhere when polling for
GRSTCTL_AHBIDLE instead of using a timeout that "works for one board"
(180us in my case) to have consistent behavior across the dwc2 driver.

Cc: linux-stable  # 4.19+
Signed-off-by: Martin Blumenstingl 
---


Acked-by: Minas Harutyunyan 


  drivers/usb/dwc2/core.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
index 8b499d643461..8e41d70fd298 100644
--- a/drivers/usb/dwc2/core.c
+++ b/drivers/usb/dwc2/core.c
@@ -531,7 +531,7 @@ int dwc2_core_reset(struct dwc2_hsotg *hsotg, bool skip_wait)
}
  
  	/* Wait for AHB master IDLE state */

-   if (dwc2_hsotg_wait_bit_set(hsotg, GRSTCTL, GRSTCTL_AHBIDLE, 50)) {
+   if (dwc2_hsotg_wait_bit_set(hsotg, GRSTCTL, GRSTCTL_AHBIDLE, 1)) {
dev_warn(hsotg->dev, "%s: HANG! AHB Idle timeout GRSTCTL GRSTCTL_AHBIDLE\n",
 __func__);
return -EBUSY;
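
Why a too-short poll budget shows up as a hang can be seen in a small user-space sketch of a wait-for-bit loop. Everything here is an assumption for illustration: the sketch_* names, the fake register that turns idle after a fixed number of polls, and the budgets of 50 and 1000 iterations. The figure of 180 polls mirrors the 180us measured on the board in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the AHB idle status bit. */
#define SKETCH_AHBIDLE (UINT32_C(1) << 31)

/* Fake register: reports idle once it has been polled 'ready_after' times. */
struct sketch_reg {
	unsigned int polls;
	unsigned int ready_after;
};

uint32_t sketch_readl(struct sketch_reg *reg)
{
	reg->polls++;
	return reg->polls >= reg->ready_after ? SKETCH_AHBIDLE : 0;
}

/*
 * Poll until 'mask' is set or the budget runs out. Each iteration stands
 * for roughly one microsecond; a real driver would delay between reads.
 */
int sketch_wait_bit_set(struct sketch_reg *reg, uint32_t mask,
			unsigned int timeout_us)
{
	for (unsigned int i = 0; i < timeout_us; i++) {
		if (sketch_readl(reg) & mask)
			return 0;
	}
	return -ETIMEDOUT;
}
```

A device that needs 180 iterations fails with a budget of 50 and succeeds with any budget of 180 or more, so picking one generous value for all callers avoids tuning per board.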

siox: driver init boilerplate reduction v3

2019-06-23 Thread Enrico Weigelt, metux IT consult

Hi folks,


this is v3 of my siox/gpio series from last week.

v3: fixed subject and formatting pointed out by Uwe,
second patch (gpio-siox.c) already acked by him

v2: fixed the typos pointed out by Uwe.


--mtx

[PATCH 1/2] siox: add helper macro to simplify driver registration

2019-06-23 Thread Enrico Weigelt, metux IT consult

From: Enrico Weigelt 

Add more helper macros for trivial driver init cases, similar to the
already existing module_platform_driver() or module_i2c_driver().

This helps to reduce driver init boilerplate.

Signed-off-by: Enrico Weigelt 
---
 include/linux/siox.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/siox.h b/include/linux/siox.h
index a860cb8..da7225b 100644
--- a/include/linux/siox.h
+++ b/include/linux/siox.h
@@ -72,3 +72,13 @@ static inline void siox_driver_unregister(struct siox_driver *sdriver)
 {
return driver_unregister(>driver);
 }
+
+/*
+ * module_siox_driver() - Helper macro for drivers that don't do
+ * anything special in module init/exit.  This eliminates a lot of
+ * boilerplate.  Each module may only use this macro once, and
+ * calling it replaces module_init() and module_exit()
+ */
+#define module_siox_driver(__siox_driver) \
+   module_driver(__siox_driver, siox_driver_register, \
+   siox_driver_unregister)
-- 
1.9.1
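
What such a helper macro expands to can be sketched outside the kernel. This is a simplified user-space analogue, assuming hypothetical sketch_* names; the kernel's module_driver() additionally marks the functions __init/__exit and wires them up with module_init() and module_exit().

```c
#include <assert.h>

/* Hypothetical miniature driver object for this sketch. */
struct sketch_driver {
	const char *name;
	int registered;
};

int sketch_driver_register(struct sketch_driver *drv)
{
	drv->registered = 1;
	return 0;
}

void sketch_driver_unregister(struct sketch_driver *drv)
{
	drv->registered = 0;
}

/*
 * Generate <drv>_init() and <drv>_exit() that do nothing but register and
 * unregister the driver, which is the boilerplate the macro removes.
 */
#define sketch_module_driver(drv, reg, unreg)	\
int drv##_init(void)				\
{						\
	return reg(&(drv));			\
}						\
void drv##_exit(void)				\
{						\
	unreg(&(drv));				\
}

struct sketch_driver demo_driver = { .name = "demo", .registered = 0 };

/* One line replaces the hand-written init and exit functions. */
sketch_module_driver(demo_driver, sketch_driver_register,
		     sketch_driver_unregister)
```

A bus-specific wrapper such as module_siox_driver() then only has to fill in the register and unregister function names.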

[PATCH 2/2] drivers: gpio: siox: use module_siox_driver()

2019-06-23 Thread Enrico Weigelt, metux IT consult

From: Enrico Weigelt 

Reduce driver init boilerplate by using the new
module_siox_driver() macro.

Signed-off-by: Enrico Weigelt 
---
 drivers/gpio/gpio-siox.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/gpio/gpio-siox.c b/drivers/gpio/gpio-siox.c
index 571b2a8..fb4e318 100644
--- a/drivers/gpio/gpio-siox.c
+++ b/drivers/gpio/gpio-siox.c
@@ -275,18 +275,7 @@ static int gpio_siox_remove(struct siox_device *sdevice)
.name = "gpio-siox",
},
 };
-
-static int __init gpio_siox_init(void)
-{
-   return siox_driver_register(_siox_driver);
-}
-module_init(gpio_siox_init);
-
-static void __exit gpio_siox_exit(void)
-{
-   siox_driver_unregister(_siox_driver);
-}
-module_exit(gpio_siox_exit);
+module_siox_driver(gpio_siox_driver);
 
 MODULE_AUTHOR("Uwe Kleine-Koenig ");
 MODULE_DESCRIPTION("SIOX gpio driver");
-- 
1.9.1

Re: [RFC v3 0/2] clocksource: davinci-timer: new driver

2019-06-23 Thread Daniel Lezcano



Sekhar, Bartosz,

if the sparse warning is not fixed, the driver won't make it into this
kernel version. Please fix it within the next two days, otherwise it won't
make it for v5.4.

Thanks

  -- Daniel


On 14/06/2019 12:39, Sekhar Nori wrote:
> Hi Daniel,
> 
> On 05/06/19 2:03 PM, Bartosz Golaszewski wrote:
>> From: Bartosz Golaszewski 
>>
>> This is another version of the new davinci clocksource driver. After much
>> discussion this contains many changes to simplify and improve the driver.
> 
> Does this look good to you now? If yes, can you please merge and provide
> an immutable branch to me so I can merge dependent mach-davinci patches?


-- 
  Linaro.org │ Open source software for ARM SoCs


Re: [PATCHv2] mm/gup: speed up check_and_migrate_cma_pages() on huge page

2019-06-23 Thread Pingfan Liu

On Mon, Jun 24, 2019 at 1:32 PM Pingfan Liu  wrote:
>
> On Mon, Jun 24, 2019 at 12:43 PM Ira Weiny  wrote:
> >
> > On Mon, Jun 24, 2019 at 12:12:41PM +0800, Pingfan Liu wrote:
> > > Both hugetlb and thp locate on the same migration type of pageblock, since
> > > they are allocated from a free_list[]. Based on this fact, it is enough to
> > > check on a single subpage to decide the migration type of the whole huge
> > > page. By this way, it saves (2M/4K - 1) times loop for pmd_huge on x86,
> > > similar on other archs.
> > >
> > > Furthermore, when executing isolate_huge_page(), it avoids taking the global
> > > hugetlb_lock many times, and the meaningless remove/add to the local linked
> > > list cma_page_list.
> > >
> > > Signed-off-by: Pingfan Liu 
> > > Cc: Andrew Morton 
> > > Cc: Ira Weiny 
> > > Cc: Mike Rapoport 
> > > Cc: "Kirill A. Shutemov" 
> > > Cc: Thomas Gleixner 
> > > Cc: John Hubbard 
> > > Cc: "Aneesh Kumar K.V" 
> > > Cc: Christoph Hellwig 
> > > Cc: Keith Busch 
> > > Cc: Mike Kravetz 
> > > Cc: Linux-kernel@vger.kernel.org
> > > ---
> > >  mm/gup.c | 19 ---
> > >  1 file changed, 12 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/mm/gup.c b/mm/gup.c
> > > index ddde097..544f5de 100644
> > > --- a/mm/gup.c
> > > +++ b/mm/gup.c
> > > @@ -1342,19 +1342,22 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
> > >   LIST_HEAD(cma_page_list);
> > >
> > >  check_again:
> > > - for (i = 0; i < nr_pages; i++) {
> > > + for (i = 0; i < nr_pages;) {
> > > +
> > > + struct page *head = compound_head(pages[i]);
> > > + long step = 1;
> > > +
> > > + if (PageCompound(head))
> > > + step = compound_order(head) - (pages[i] - head);
> >
> > Sorry if I missed this last time.  compound_order() is not correct here.
> For thp, prep_transhuge_page()->prep_compound_page()->set_compound_order().
> For smaller hugetlb,
> prep_new_huge_page()->prep_compound_page()->set_compound_order().
> For gigantic page, prep_compound_gigantic_page()->set_compound_order().
>
> Do I miss anything?
>
Oh, got it. It should be 1<
> Thanks,
>   Pingfan
> [...]

[PATCH] staging: bcm2835-camera: Avoid a potential sleep while holding a spin_lock

2019-06-23 Thread Christophe JAILLET

Do not allocate memory with GFP_KERNEL when holding a spin_lock, it may
sleep. Use GFP_NOWAIT instead.

Fixes: 950fd867c635 ("staging: bcm2835-camera: Replace open-coded idr with a struct idr.")
Signed-off-by: Christophe JAILLET 
---
 drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c b/drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c
index 16af735af5c3..438d548c6e24 100644
--- a/drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c
+++ b/drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c
@@ -186,7 +186,7 @@ get_msg_context(struct vchiq_mmal_instance *instance)
 */
spin_lock(>context_map_lock);
handle = idr_alloc(>context_map, msg_context,
-  0, 0, GFP_KERNEL);
+  0, 0, GFP_NOWAIT);
spin_unlock(>context_map_lock);
 
if (handle < 0) {
-- 
2.20.1

Re: [PATCH v2 1/2] include: linux: siox: more for declaring siox drivers

2019-06-23 Thread Enrico Weigelt, metux IT consult

On 18.06.19 18:17, Uwe Kleine-König wrote:

Hi,

> I like the change. Just noticed that the Subject line is a bit strange
> though. if "more for" is proper English then it's news to me. I'd write:
>
>   siox: add helper macro to simplify driver registration

Good point, seems I must have been totally under-coffeined, and
some words on nasty phone interrupts :o

I'll fix that.



>> diff --git a/include/linux/siox.h b/include/linux/siox.h
>> index d79624e..d53b2b2 100644
>> --- a/include/linux/siox.h
>> +++ b/include/linux/siox.h
>> @@ -75,3 +75,12 @@ static inline void siox_driver_unregister(struct siox_driver *sdriver)
>>  {
>>  return driver_unregister(>driver);
>>  }
>> +
>> +/* module_siox_driver() - Helper macro for drivers that don't do
>
> I'd prefer /* on a separate line as documented in
> Documentation/process/coding-style.rst (for non-net code).

Done.

Do we have a tool to check for that ? checkpatch doesn't seem to care
about it.

>> + * anything special in module init/exit.  This eliminates a lot of
>> + * boilerplate.  Each module may only use this macro once, and
>> + * calling it replaces module_init() and module_exit()
>> + */
>> +#define module_siox_driver(__siox_driver) \
>> + module_driver(__siox_driver, siox_driver_register, \
>> + siox_driver_unregister)
>> --
>
> Sorry I didn't notice these two things in the first round already.

No problem, that's why we have multiple rounds :)


I'll send v3 in a few minutes ...


--mtx

-- 
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287

Re: [PATCHv2] mm/gup: speed up check_and_migrate_cma_pages() on huge page

2019-06-23 Thread Pingfan Liu

On Mon, Jun 24, 2019 at 12:43 PM Ira Weiny  wrote:
>
> On Mon, Jun 24, 2019 at 12:12:41PM +0800, Pingfan Liu wrote:
> > Both hugetlb and thp locate on the same migration type of pageblock, since
> > they are allocated from a free_list[]. Based on this fact, it is enough to
> > check on a single subpage to decide the migration type of the whole huge
> > page. By this way, it saves (2M/4K - 1) times loop for pmd_huge on x86,
> > similar on other archs.
> >
> > Furthermore, when executing isolate_huge_page(), it avoids taking the global
> > hugetlb_lock many times, and the meaningless remove/add to the local linked
> > list cma_page_list.
> >
> > Signed-off-by: Pingfan Liu 
> > Cc: Andrew Morton 
> > Cc: Ira Weiny 
> > Cc: Mike Rapoport 
> > Cc: "Kirill A. Shutemov" 
> > Cc: Thomas Gleixner 
> > Cc: John Hubbard 
> > Cc: "Aneesh Kumar K.V" 
> > Cc: Christoph Hellwig 
> > Cc: Keith Busch 
> > Cc: Mike Kravetz 
> > Cc: Linux-kernel@vger.kernel.org
> > ---
> >  mm/gup.c | 19 ---
> >  1 file changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index ddde097..544f5de 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -1342,19 +1342,22 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
> >   LIST_HEAD(cma_page_list);
> >
> >  check_again:
> > - for (i = 0; i < nr_pages; i++) {
> > + for (i = 0; i < nr_pages;) {
> > +
> > + struct page *head = compound_head(pages[i]);
> > + long step = 1;
> > +
> > + if (PageCompound(head))
> > + step = compound_order(head) - (pages[i] - head);
>
> Sorry if I missed this last time.  compound_order() is not correct here.
For thp, prep_transhuge_page()->prep_compound_page()->set_compound_order().
For smaller hugetlb,
prep_new_huge_page()->prep_compound_page()->set_compound_order().
For gigantic page, prep_compound_gigantic_page()->set_compound_order().

Do I miss anything?

Thanks,
  Pingfan
[...]
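
The review comment above turns on the difference between the order of a compound page and its page count. A minimal sketch of the arithmetic, assuming the intent is to skip to the end of the compound page (sketch_step() and its parameters are hypothetical names, not kernel API):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of subpages left in a compound page of the given order, counted
 * from a subpage that sits 'offset' pages after the head page. An order-k
 * compound page spans 1 << k base pages, so the order must be turned into
 * a count before the offset is subtracted.
 */
long sketch_step(unsigned int order, ptrdiff_t offset)
{
	return (1L << order) - (long)offset;
}
```

For a 2M huge page on 4K base pages the order is 9: starting at the head page the step is 512, while subtracting the offset from the bare order would give 9 and the loop would re-check pages of the same huge page many times.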

[PATCH V8 2/3] PCI: dwc: Cleanup DBI,ATU read and write APIs

2019-06-23 Thread Vidya Sagar

Clean up the DBI read and write APIs by removing the "__" (underscore) prefix
from their names, as there are no non-underscore versions and the underscore
versions already do what non-underscore versions typically do. Also stop
passing the dbi/dbi2 base address as an argument, as it can be derived within
the read and write APIs. Since the dw_pcie_{readl/writel}_dbi() APIs can't be
used for ATU read/write, as the ATU base address could be different from the
DBI base address, implement the ATU read/write APIs using the ATU base
address without using the dw_pcie_{readl/writel}_dbi() APIs.
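
The accessor pattern described here can be sketched in user-space C: each public helper derives its own base address from the controller structure and falls back to a plain access unless a platform override is installed. All sketch_* names and the array-backed "registers" are assumptions for illustration; unlike this sketch, the kernel code does not check the ops pointer itself for NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_pcie;

/* Optional platform hook, loosely modelled on a read_dbi style callback. */
struct sketch_ops {
	uint32_t (*read_dbi)(struct sketch_pcie *pci, uint32_t *base,
			     uint32_t reg);
};

struct sketch_pcie {
	uint32_t *dbi_base;	/* stands in for the DBI register window */
	uint32_t *atu_base;	/* stands in for the ATU register window */
	const struct sketch_ops *ops;
};

/* Shared path: use the override if present, else read base + reg. */
static uint32_t sketch_read(struct sketch_pcie *pci, uint32_t *base,
			    uint32_t reg)
{
	if (pci->ops && pci->ops->read_dbi)
		return pci->ops->read_dbi(pci, base, reg);
	return base[reg / sizeof(uint32_t)];
}

/* Callers no longer pass a base; each accessor picks the right one. */
uint32_t sketch_read_dbi(struct sketch_pcie *pci, uint32_t reg)
{
	return sketch_read(pci, pci->dbi_base, reg);
}

uint32_t sketch_read_atu(struct sketch_pcie *pci, uint32_t reg)
{
	return sketch_read(pci, pci->atu_base, reg);
}
```

Because the base is chosen inside the accessor, a caller cannot accidentally read an ATU register through the DBI window when the two are mapped at different addresses.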

Signed-off-by: Vidya Sagar 
---
Changes from v7:
* Based on suggestion from Jingoo Han, moved implementation of readl, writel
  for ATU region to separate APIs dw_pcie_{read/write}_atu() in
  pcie-designware.c file and calling them from pcie-designware.h file.

Changes from v6:
* Modified ATU read/write APIs to use implementation specific DBI read/write
  APIs if present.

Changes from v5:
* Removed passing base address as one of the arguments as the same can be
  derived within the API itself.
* Modified ATU read/write APIs to call dw_pcie_{write/read}() API

Changes from v4:
* This is a new patch in this series

 drivers/pci/controller/dwc/pcie-designware.c | 28 +--
 drivers/pci/controller/dwc/pcie-designware.h | 51 +---
 2 files changed, 45 insertions(+), 34 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 9d7c51c32b3b..665a76f11318 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -52,68 +52,95 @@ int dw_pcie_write(void __iomem *addr, int size, u32 val)
return PCIBIOS_SUCCESSFUL;
 }
 
-u32 __dw_pcie_read_dbi(struct dw_pcie *pci, void __iomem *base, u32 reg,
-  size_t size)
+u32 dw_pcie_read_dbi(struct dw_pcie *pci, u32 reg, size_t size)
 {
int ret;
u32 val;
 
if (pci->ops->read_dbi)
-   return pci->ops->read_dbi(pci, base, reg, size);
+   return pci->ops->read_dbi(pci, pci->dbi_base, reg, size);
 
-   ret = dw_pcie_read(base + reg, size, );
+   ret = dw_pcie_read(pci->dbi_base + reg, size, );
if (ret)
dev_err(pci->dev, "Read DBI address failed\n");
 
return val;
 }
 
-void __dw_pcie_write_dbi(struct dw_pcie *pci, void __iomem *base, u32 reg,
-size_t size, u32 val)
+void dw_pcie_write_dbi(struct dw_pcie *pci, u32 reg, size_t size, u32 val)
 {
int ret;
 
if (pci->ops->write_dbi) {
-   pci->ops->write_dbi(pci, base, reg, size, val);
+   pci->ops->write_dbi(pci, pci->dbi_base, reg, size, val);
return;
}
 
-   ret = dw_pcie_write(base + reg, size, val);
+   ret = dw_pcie_write(pci->dbi_base + reg, size, val);
if (ret)
dev_err(pci->dev, "Write DBI address failed\n");
 }
 
-u32 __dw_pcie_read_dbi2(struct dw_pcie *pci, void __iomem *base, u32 reg,
-   size_t size)
+u32 dw_pcie_read_dbi2(struct dw_pcie *pci, u32 reg, size_t size)
 {
int ret;
u32 val;
 
if (pci->ops->read_dbi2)
-   return pci->ops->read_dbi2(pci, base, reg, size);
+   return pci->ops->read_dbi2(pci, pci->dbi_base2, reg, size);
 
-   ret = dw_pcie_read(base + reg, size, );
+   ret = dw_pcie_read(pci->dbi_base2 + reg, size, );
if (ret)
dev_err(pci->dev, "read DBI address failed\n");
 
return val;
 }
 
-void __dw_pcie_write_dbi2(struct dw_pcie *pci, void __iomem *base, u32 reg,
- size_t size, u32 val)
+void dw_pcie_write_dbi2(struct dw_pcie *pci, u32 reg, size_t size, u32 val)
 {
int ret;
 
if (pci->ops->write_dbi2) {
-   pci->ops->write_dbi2(pci, base, reg, size, val);
+   pci->ops->write_dbi2(pci, pci->dbi_base2, reg, size, val);
return;
}
 
-   ret = dw_pcie_write(base + reg, size, val);
+   ret = dw_pcie_write(pci->dbi_base2 + reg, size, val);
if (ret)
dev_err(pci->dev, "write DBI address failed\n");
 }
 
+u32 dw_pcie_read_atu(struct dw_pcie *pci, u32 reg, size_t size)
+{
+   int ret;
+   u32 val;
+
+   if (pci->ops->read_dbi)
+   return pci->ops->read_dbi(pci, pci->atu_base, reg, size);
+
+   ret = dw_pcie_read(pci->atu_base + reg, size, );
+   if (ret)
+   dev_err(pci->dev, "Read ATU address failed\n");
+
+   return val;
+}
+EXPORT_SYMBOL_GPL(dw_pcie_read_atu);
+
+void dw_pcie_write_atu(struct dw_pcie *pci, u32 reg, size_t size, u32 val)
+{
+   int ret;
+
+   if (pci->ops->write_dbi) {
+   pci->ops->write_dbi(pci, pci->atu_base, reg, size, val);
+   return;
+   }
+
+   ret = dw_pcie_write(pci->atu_base + reg, size, val);
+   if (ret)
+

[PATCH V8 1/3] PCI: dwc: Add API support to de-initialize host

2019-06-23 Thread Vidya Sagar

Add an API to group all the tasks to be done to de-initialize host which
can then be called by any DesignWare core based driver implementations
while adding .remove() support in their respective drivers.

Signed-off-by: Vidya Sagar 
Acked-by: Gustavo Pimentel 
---
Changes from v7:
* None

Changes from v6:
* None

Changes from v5:
* None

Changes from v4:
* None

Changes from v3:
* Added check if (pci_msi_enabled() && !pp->ops->msi_host_init) before calling
  dw_pcie_free_msi() API to mimic init path

Changes from v2:
* Rebased on top of linux-next top of the tree branch

Changes from v1:
* s/Designware/DesignWare

 drivers/pci/controller/dwc/pcie-designware-host.c | 8 
 drivers/pci/controller/dwc/pcie-designware.h  | 5 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 77db32529319..d069e4290180 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -496,6 +496,14 @@ int dw_pcie_host_init(struct pcie_port *pp)
return ret;
 }
 
+void dw_pcie_host_deinit(struct pcie_port *pp)
+{
+   pci_stop_root_bus(pp->root_bus);
+   pci_remove_root_bus(pp->root_bus);
+   if (pci_msi_enabled() && !pp->ops->msi_host_init)
+   dw_pcie_free_msi(pp);
+}
+
 static int dw_pcie_access_other_conf(struct pcie_port *pp, struct pci_bus *bus,
 u32 devfn, int where, int size, u32 *val,
 bool write)
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index b8993f2b78df..14762e262758 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -351,6 +351,7 @@ void dw_pcie_msi_init(struct pcie_port *pp);
 void dw_pcie_free_msi(struct pcie_port *pp);
 void dw_pcie_setup_rc(struct pcie_port *pp);
 int dw_pcie_host_init(struct pcie_port *pp);
+void dw_pcie_host_deinit(struct pcie_port *pp);
 int dw_pcie_allocate_domains(struct pcie_port *pp);
 #else
 static inline irqreturn_t dw_handle_msi_irq(struct pcie_port *pp)
@@ -375,6 +376,10 @@ static inline int dw_pcie_host_init(struct pcie_port *pp)
return 0;
 }
 
+static inline void dw_pcie_host_deinit(struct pcie_port *pp)
+{
+}
+
 static inline int dw_pcie_allocate_domains(struct pcie_port *pp)
 {
return 0;
-- 
2.17.1

[PATCH V8 3/3] PCI: dwc: Export APIs to support .remove() implementation

2019-06-23 Thread Vidya Sagar

Export all configuration space access APIs and also other APIs to
support host controller drivers of DesignWare core based implementations
while adding support for .remove() hook to build their respective drivers
as modules.

Signed-off-by: Vidya Sagar 
Acked-by: Gustavo Pimentel 
---
Changes from v7:
* None

Changes from v6:
* None

Changes from v5:
* None

Changes from v4:
* Removed __ (underscore) from dw_pcie_{write/read}_dbi API names

Changes from v3:
* Exported only __dw_pcie_{read/write}_dbi() APIs instead of
  dw_pcie_read{l/w/b}_dbi & dw_pcie_write{l/w/b}_dbi APIs.

Changes from v2:
* Rebased on top of linux-next top of the tree branch

Changes from v1:
* s/Designware/DesignWare

 drivers/pci/controller/dwc/pcie-designware-host.c | 4 
 drivers/pci/controller/dwc/pcie-designware.c  | 4 
 2 files changed, 8 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index d069e4290180..f93252d0da5b 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -311,6 +311,7 @@ void dw_pcie_msi_init(struct pcie_port *pp)
dw_pcie_wr_own_conf(pp, PCIE_MSI_ADDR_HI, 4,
upper_32_bits(msi_target));
 }
+EXPORT_SYMBOL_GPL(dw_pcie_msi_init);
 
 int dw_pcie_host_init(struct pcie_port *pp)
 {
@@ -495,6 +496,7 @@ int dw_pcie_host_init(struct pcie_port *pp)
dw_pcie_free_msi(pp);
return ret;
 }
+EXPORT_SYMBOL_GPL(dw_pcie_host_init);
 
 void dw_pcie_host_deinit(struct pcie_port *pp)
 {
@@ -503,6 +505,7 @@ void dw_pcie_host_deinit(struct pcie_port *pp)
if (pci_msi_enabled() && !pp->ops->msi_host_init)
dw_pcie_free_msi(pp);
 }
+EXPORT_SYMBOL_GPL(dw_pcie_host_deinit);
 
 static int dw_pcie_access_other_conf(struct pcie_port *pp, struct pci_bus *bus,
 u32 devfn, int where, int size, u32 *val,
@@ -695,3 +698,4 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
val |= PORT_LOGIC_SPEED_CHANGE;
dw_pcie_wr_own_conf(pp, PCIE_LINK_WIDTH_SPEED_CONTROL, 4, val);
 }
+EXPORT_SYMBOL_GPL(dw_pcie_setup_rc);
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 665a76f11318..b832a49de9c0 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -34,6 +34,7 @@ int dw_pcie_read(void __iomem *addr, int size, u32 *val)
 
return PCIBIOS_SUCCESSFUL;
 }
+EXPORT_SYMBOL_GPL(dw_pcie_read);
 
 int dw_pcie_write(void __iomem *addr, int size, u32 val)
 {
@@ -51,6 +52,7 @@ int dw_pcie_write(void __iomem *addr, int size, u32 val)
 
return PCIBIOS_SUCCESSFUL;
 }
+EXPORT_SYMBOL_GPL(dw_pcie_write);
 
 u32 dw_pcie_read_dbi(struct dw_pcie *pci, u32 reg, size_t size)
 {
@@ -66,6 +68,7 @@ u32 dw_pcie_read_dbi(struct dw_pcie *pci, u32 reg, size_t size)
 
return val;
 }
+EXPORT_SYMBOL_GPL(dw_pcie_read_dbi);
 
 void dw_pcie_write_dbi(struct dw_pcie *pci, u32 reg, size_t size, u32 val)
 {
@@ -80,6 +83,7 @@ void dw_pcie_write_dbi(struct dw_pcie *pci, u32 reg, size_t size, u32 val)
if (ret)
dev_err(pci->dev, "Write DBI address failed\n");
 }
+EXPORT_SYMBOL_GPL(dw_pcie_write_dbi);
 
 u32 dw_pcie_read_dbi2(struct dw_pcie *pci, u32 reg, size_t size)
 {
-- 
2.17.1
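
For readers following the series, here is a minimal userspace sketch of the
teardown ordering that dw_pcie_host_deinit() introduces and that a driver's
.remove() hook would rely on. Every name below (mock_port, mock_host_init,
mock_host_deinit, mock_remove) is a hypothetical stand-in, not a kernel API,
and the pci_msi_enabled() check from the real function is left out for
brevity.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct pcie_port; not the kernel type. */
struct mock_port {
	bool bus_running;     /* root bus scanned, devices active */
	bool bus_present;     /* root bus object exists */
	bool msi_allocated;   /* MSI resources owned by the core code */
	bool custom_msi_init; /* models pp->ops->msi_host_init being set */
};

/* Models host init: the core only allocates MSI when no custom hook exists. */
static int mock_host_init(struct mock_port *pp)
{
	if (!pp->custom_msi_init)
		pp->msi_allocated = true;
	pp->bus_present = true;
	pp->bus_running = true;
	return 0;
}

/* Same order as the patch: stop the bus, remove it, then free core-owned MSI. */
static void mock_host_deinit(struct mock_port *pp)
{
	pp->bus_running = false;
	pp->bus_present = false;
	if (!pp->custom_msi_init)
		pp->msi_allocated = false;
}

/* What a modular driver's .remove() hook might reduce to. */
static int mock_remove(struct mock_port *pp)
{
	mock_host_deinit(pp);
	return 0;
}
```

The point of the sketch is only that deinit undoes what init set up, and that
MSI resources are released only when the core code allocated them.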

[PATCH bpf-next] MAINTAINERS: add reviewer to maintainers entry

2019-06-23 Thread Björn Töpel

From: Björn Töpel 

Jonathan Lemon has volunteered as an official AF_XDP reviewer. Thank
you, Jonathan!

Signed-off-by: Björn Töpel 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0cfe98a6761a..dd875578d53c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17284,6 +17284,7 @@ N:  xdp
 XDP SOCKETS (AF_XDP)
 M: Björn Töpel 
 M: Magnus Karlsson 
+R: Jonathan Lemon 
 L: net...@vger.kernel.org
 L: b...@vger.kernel.org
 S: Maintained
-- 
2.20.1

Reminder: 25 open syzbot bugs in kvm subsystem

2019-06-23 Thread Eric Biggers

[This email was generated by a script.  Let me know if you have any suggestions
to make it better.]

Of the currently open syzbot reports against the upstream kernel, I've manually
marked 25 of them as possibly being bugs in the kvm subsystem.  I've listed
these reports below, sorted by an algorithm that tries to list first the reports
most likely to be still valid, important, and actionable.

Of these 25 bugs, 4 were seen in mainline in the last week.

If you believe a bug is no longer valid, please close the syzbot report by
sending a '#syz fix', '#syz dup', or '#syz invalid' command in reply to the
original thread, as explained at https://goo.gl/tpsmEJ#status

If you believe I misattributed a bug to the kvm subsystem, please let me know,
and if possible forward the report to the correct people or mailing list.

Here are the bugs:


Title:  unexpected kernel reboot (3)
Last occurred:  0 days ago
Reported:   345 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=321861b1588b44d064b779b92293c5d55cfe8430
Original thread:
https://lkml.kernel.org/lkml/eb546f0570e84...@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug received 2 replies; the last was 342 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+cce9ef2dd25246f81...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/eb546f0570e84...@google.com


Title:  WARNING in kvm_arch_vcpu_ioctl_run (3)
Last occurred:  6 days ago
Reported:   452 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=4d7de0e6a195b6a5ffef01d2776e737a52c7de60
Original thread:
https://lkml.kernel.org/lkml/d05a78056873b...@google.com/T/#u

This bug has a C reproducer.

syzbot has bisected this bug, but I think the bisection result is incorrect.

The original thread for this bug received 1 reply, 452 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+760a73552f47a8cd0...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/d05a78056873b...@google.com


Title:  INFO: rcu detected stall in kvm_vcpu_ioctl
Last occurred:  6 days ago
Reported:   285 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=ab7b91f104d7f018e85924d8d109ec7f895d8b61
Original thread:
https://lkml.kernel.org/lkml/e0d7940575921...@google.com/T/#u

This bug has a syzkaller reproducer only.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+e9b1e8f574404b6e4...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/e0d7940575921...@google.com


Title:  BUG: unable to handle kernel paging request in init_srcu_struct_fields
Last occurred:  0 days ago
Reported:   174 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=213ca2ed63e07dd093373791a18f27ad08e91820
Original thread:
https://lkml.kernel.org/lkml/23f74b057e4c0...@google.com/T/#u

Unfortunately, this bug does not have a reproducer.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+010232b93d20ef8ab...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/23f74b057e4c0...@google.com


Title:  KASAN: use-after-free Read in do_general_protection
Last

Re: [PATCH] mm/hugetlb: allow gigantic page allocation to migrate away smaller huge page

2019-06-23 Thread Anshuman Khandual

On 06/24/2019 09:51 AM, Pingfan Liu wrote:
> The current pfn_range_valid_gigantic() rejects the pud huge page allocation
> if there is a pmd huge page inside the candidate range.
> 
> But pud huge pages are a rarer resource, and must be aligned on 1GB on x86.
> It is worth allowing pmd huge pages to be migrated away to make room for a
> pud huge page.
> 
> The same logic is applied to pgd and pud huge pages.

A huge page in the range can be either a THP or a HugeTLB page, and migrating
them has different costs and chances of success. THP migration involves
splitting the page if THP migration is not enabled, plus all the associated TLB
costs. Are you sure that a PUD HugeTLB allocation should really go through all
of this? Is there any guarantee that, after migrating multiple PMD-sized
THP/HugeTLB pages in the given range, the PUD allocation request will succeed?

> 
> Signed-off-by: Pingfan Liu 
> Cc: Mike Kravetz 
> Cc: Oscar Salvador 
> Cc: David Hildenbrand 
> Cc: Andrew Morton 
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/hugetlb.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ac843d3..02d1978 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
>   unsigned long start_pfn, unsigned long nr_pages)
>  {
>   unsigned long i, end_pfn = start_pfn + nr_pages;
> - struct page *page;
> + struct page *page = pfn_to_page(start_pfn);
> +
> + if (PageHuge(page))
> + if (compound_order(compound_head(page)) >= nr_pages)
> + return false;
>  
>   for (i = start_pfn; i < end_pfn; i++) {
>   if (!pfn_valid(i))
> @@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
>   if (page_count(page) > 0)
>   return false;
>  
> - if (PageHuge(page))
> - return false;
>   }
>  
>   return true;
> 

So except in the case where there is a bigger huge page in the range, this will
attempt to migrate everything in the way. As mentioned before, if this is a
good idea at all, it needs to differentiate between HugeTLB and THP, and also
take into account the cost of the migrations and the chance that the subsequent
allocation attempt succeeds.

[PATCH V2] net: ethernet: ti: cpsw: Fix suspend/resume break

2019-06-23 Thread Keerthy

Commit bfe59032bd6127ee190edb30be9381a01765b958 ("net: ethernet:
ti: cpsw: use cpsw as drv data") changes the driver data to
struct cpsw_common *cpsw. This was done only in probe/remove, while
the suspend/resume functions still expect struct net_device *ndev.
Hence fix both suspend and resume to fetch the updated driver data
as well.

Fixes: bfe59032bd6127ee1 ("net: ethernet: ti: cpsw: use cpsw as drv data")
Signed-off-by: Keerthy 
---

Change in v2:

  * Added NULL Checks for cpsw->slaves[i].ndev in suspend/resume functions.

 drivers/net/ethernet/ti/cpsw.c | 30 +-
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 7bdd287074fc..32b7b3b74a6b 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -2590,20 +2590,13 @@ static int cpsw_remove(struct platform_device *pdev)
 #ifdef CONFIG_PM_SLEEP
 static int cpsw_suspend(struct device *dev)
 {
-   struct net_device   *ndev = dev_get_drvdata(dev);
-   struct cpsw_common  *cpsw = ndev_to_cpsw(ndev);
-
-   if (cpsw->data.dual_emac) {
-   int i;
+   struct cpsw_common *cpsw = dev_get_drvdata(dev);
+   int i;
 
-   for (i = 0; i < cpsw->data.slaves; i++) {
+   for (i = 0; i < cpsw->data.slaves; i++)
+   if (cpsw->slaves[i].ndev)
if (netif_running(cpsw->slaves[i].ndev))
cpsw_ndo_stop(cpsw->slaves[i].ndev);
-   }
-   } else {
-   if (netif_running(ndev))
-   cpsw_ndo_stop(ndev);
-   }
 
/* Select sleep pin state */
pinctrl_pm_select_sleep_state(dev);
@@ -2613,25 +2606,20 @@ static int cpsw_suspend(struct device *dev)
 
 static int cpsw_resume(struct device *dev)
 {
-   struct net_device   *ndev = dev_get_drvdata(dev);
-   struct cpsw_common  *cpsw = ndev_to_cpsw(ndev);
+   struct cpsw_common *cpsw = dev_get_drvdata(dev);
+   int i;
 
/* Select default pin state */
pinctrl_pm_select_default_state(dev);
 
/* shut up ASSERT_RTNL() warning in netif_set_real_num_tx/rx_queues */
rtnl_lock();
-   if (cpsw->data.dual_emac) {
-   int i;
 
-   for (i = 0; i < cpsw->data.slaves; i++) {
+   for (i = 0; i < cpsw->data.slaves; i++)
+   if (cpsw->slaves[i].ndev)
if (netif_running(cpsw->slaves[i].ndev))
cpsw_ndo_open(cpsw->slaves[i].ndev);
-   }
-   } else {
-   if (netif_running(ndev))
-   cpsw_ndo_open(ndev);
-   }
+
rtnl_unlock();
 
return 0;
-- 
2.17.1
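
The loop shape the patch settles on (walk every slave, skip slots with no
net_device, act only on running interfaces) can be sketched in userspace C as
follows. The mock_ndev and mock_slave types and the mock_suspend() helper are
hypothetical and only model the control flow, not the cpsw driver itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net_device and a cpsw slave slot. */
struct mock_ndev {
	bool running; /* models netif_running() */
	bool stopped; /* set once the stop callback has run */
};

struct mock_slave {
	struct mock_ndev *ndev; /* may be NULL when the slot is unused */
};

/* Stop every running interface; returns how many were stopped. */
static int mock_suspend(struct mock_slave *slaves, int nslaves)
{
	int i, stopped = 0;

	for (i = 0; i < nslaves; i++) {
		struct mock_ndev *ndev = slaves[i].ndev;

		/* The NULL check added in v2 guards unused slave slots. */
		if (ndev && ndev->running) {
			ndev->stopped = true;
			stopped++;
		}
	}
	return stopped;
}
```

Because the loop no longer branches on dual_emac, the same code path covers
both the single-interface and the dual-interface configurations.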

Reminder: 27 open syzbot bugs in bluetooth subsystem

2019-06-23 Thread Eric Biggers

[This email was generated by a script.  Let me know if you have any suggestions
to make it better.]

Of the currently open syzbot reports against the upstream kernel, I've manually
marked 27 of them as possibly being bugs in the bluetooth subsystem.  I've
listed these reports below, sorted by an algorithm that tries to list first the
reports most likely to be still valid, important, and actionable.

Of these 27 bugs, 12 were seen in mainline in the last week.

Of these 27 bugs, 3 were bisected to commits from the following people:

Loic Poulain 
Ben Young Tae Kim 

If you believe a bug is no longer valid, please close the syzbot report by
sending a '#syz fix', '#syz dup', or '#syz invalid' command in reply to the
original thread, as explained at https://goo.gl/tpsmEJ#status

If you believe I misattributed a bug to the bluetooth subsystem, please let me
know, and if possible forward the report to the correct people or mailing list.

Here are the bugs:


Title:  WARNING in tty_set_termios
Last occurred:  0 days ago
Reported:   162 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=2410d22f1d8e5984217329dd0884b01d99e3e48d
Original thread:
https://lkml.kernel.org/lkml/bcd434057f4eb...@google.com/T/#u

This bug has a C reproducer.

This bug was bisected to:

commit 162f812f23bab583f5d514ca0e4df67797ac9cdf
Author: Loic Poulain 
Date:   Mon Sep 19 14:29:27 2016 +

  Bluetooth: hci_uart: Add Marvell support

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+a950165cbb86bdd02...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/bcd434057f4eb...@google.com


Title:  WARNING: refcount bug in kobject_get
Last occurred:  1 day ago
Reported:   286 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=06c8522152c9325bf0f1a3dc5b33d1b95a47431f
Original thread:
https://lkml.kernel.org/lkml/37743205757f3...@google.com/T/#u

This bug has a C reproducer.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+b74b8b6e712f33454...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/37743205757f3...@google.com


Title:  WARNING in kernfs_get
Last occurred:  0 days ago
Reported:   286 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=b52dec65c1aaaec9b3893458b13a3304303de321
Original thread:
https://lkml.kernel.org/lkml/f921ae05757f5...@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug received 1 reply, 235 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+3dcb532381f98c86a...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/f921ae05757f5...@google.com


Title:  general protection fault in skb_put
Last occurred:  2 days ago
Reported:   139 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=9abc0fdcdea0effb7b27984dbc1f336155cdad3f
Original thread:
https://lkml.kernel.org/lkml/b9e68e0581142...@google.com/T/#u

This bug has a C reproducer.

syzbot has bisected this bug, but I think the bisection result is incorrect.

The original thread for this bug received 4 replies; the last was 103 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+65788f9af9d548443...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see

[PATCH] mm/vmalloc: fix a compile warning in mm

2019-06-23 Thread Weitao Hou

mm/vmalloc.c: In function ‘pcpu_get_vm_areas’:
mm/vmalloc.c:976:4: warning: ‘lva’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
insert_vmap_area_augment(lva, &va->rb_node,

Signed-off-by: Weitao Hou 
---
 mm/vmalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4c9e150e5ad3..78c5617fdf3f 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -913,7 +913,7 @@ adjust_va_to_fit_type(struct vmap_area *va,
unsigned long nva_start_addr, unsigned long size,
enum fit_type type)
 {
-   struct vmap_area *lva;
+   struct vmap_area *lva = NULL;
 
if (type == FL_FIT_TYPE) {
/*
-- 
2.18.0
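
As background, -Wmaybe-uninitialized typically fires when a variable is
assigned under one condition and used under a second, identical condition
that the compiler cannot prove is correlated with the first. The sketch below
reproduces that pattern in userspace C. It is an assumption about the general
shape of the problem, not a copy of mm/vmalloc.c, and the mock_* names are
hypothetical.

```c
#include <stddef.h>

/* Hypothetical fit types, loosely modelled on the enum in the patch context. */
enum mock_fit_type { MOCK_FULL_FIT, MOCK_EDGE_FIT, MOCK_SPLIT_FIT };

/*
 * Returns the number of extra nodes consumed (0 or 1). The spare node is only
 * needed when the range is split in the middle.
 */
static int mock_adjust(enum mock_fit_type type, int *spare_node)
{
	int *extra = NULL; /* defined on every path, which silences the warning */

	if (type == MOCK_SPLIT_FIT)
		extra = spare_node;

	/* ... work common to all fit types would go here ... */

	if (type == MOCK_SPLIT_FIT && extra) {
		*extra = 1; /* mark the spare node as used */
		return 1;
	}
	return 0;
}
```

Initialising to NULL does not change behaviour on the paths that already
assigned the pointer; it only gives the remaining paths a defined value.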

Reminder: 9 open syzbot bugs in sound subsystem

2019-06-23 Thread Eric Biggers

[This email was generated by a script.  Let me know if you have any suggestions
to make it better.]

Of the currently open syzbot reports against the upstream kernel, I've manually
marked 9 of them as possibly being bugs in the sound subsystem.  I've listed
these reports below, sorted by an algorithm that tries to list first the reports
most likely to be still valid, important, and actionable.

Of these 9 bugs, 1 was bisected to a commit from the following person:

Takashi Iwai 

If you believe a bug is no longer valid, please close the syzbot report by
sending a '#syz fix', '#syz dup', or '#syz invalid' command in reply to the
original thread, as explained at https://goo.gl/tpsmEJ#status

If you believe I misattributed a bug to the sound subsystem, please let me know,
and if possible forward the report to the correct people or mailing list.

Here are the bugs:


Title:  KASAN: slab-out-of-bounds Write in default_read_copy_kernel
Last occurred:  119 days ago
Reported:   195 days ago
Branches:   Mainline
Dashboard link: 
https://syzkaller.appspot.com/bug?id=04933ddeeb1b542edf54b88ceccdac34de747a40
Original thread:
https://lkml.kernel.org/lkml/4a6256057ca3b...@google.com/T/#u

This bug has a C reproducer.

This bug was bisected to:

commit 65766ee0bf7fe8b3be80e2e1c3ef54ad59b29476
Author: Takashi Iwai 
Date:   Fri Nov 9 10:59:45 2018 +

  ALSA: oss: Use kvzalloc() for local buffer allocations

The original thread for this bug received 1 reply, 96 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+12f17c177de05efea...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/4a6256057ca3b...@google.com


Title:  WARNING: proc registration bug in snd_info_card_register
Last occurred:  27 days ago
Reported:   72 days ago
Branches:   Mainline (with usb-fuzzer patches)
Dashboard link: 
https://syzkaller.appspot.com/bug?id=0cf36d8457554bf03c3cacc44d31ff145a0c1a11
Original thread:
https://lkml.kernel.org/lkml/7f693a058653d...@google.com/T/#u

This bug has a C reproducer.

No one has replied to the original thread for this bug yet.

This looks like a bug in a sound USB driver.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+2e782bf6a60d0fcb9...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/7f693a058653d...@google.com


Title:  WARNING in snd_usb_motu_microbookii_communicate/usb_submit_urb
Last occurred:  15 days ago
Reported:   12 days ago
Branches:   Mainline (with usb-fuzzer patches)
Dashboard link: 
https://syzkaller.appspot.com/bug?id=125081d1f7eba4b9b25f53aaae53176cd4abb2b7
Original thread:
https://lkml.kernel.org/lkml/acb99a058b0d5...@google.com/T/#u

This bug has a syzkaller reproducer only.

No one has replied to the original thread for this bug yet.

This looks like a bug in a sound USB driver.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+d952e5e28f5fb7718...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread.  For the git send-email command to use, or tips on how to reply if the
thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/acb99a058b0d5...@google.com


Title:  INFO: rcu detected stall in snd_seq_write
Last occurred:  57 days ago
Reported:   300 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=33501520944e11adedf1c454eec4cb818bee16c8
Original thread:
https://lkml.kernel.org/lkml/e5050205746dc...@google.com/T/#u

This bug has a syzkaller reproducer only.

The original thread for this bug received 1 reply, 300 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+97aae04ce27e39cbf...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use,

Re: [PATCH] mm/hugetlb: allow gigantic page allocation to migrate away smaller huge page

2019-06-23 Thread Ira Weiny

On Mon, Jun 24, 2019 at 12:21:08PM +0800, Pingfan Liu wrote:
> The current pfn_range_valid_gigantic() rejects the pud huge page allocation
> if there is a pmd huge page inside the candidate range.
> 
> But pud huge pages are a rarer resource, and must be aligned on 1GB on x86.
> It is worth allowing pmd huge pages to be migrated away to make room for a
> pud huge page.
> 
> The same logic is applied to pgd and pud huge pages.

I'm sorry but I don't quite understand why we should do this.  Is this a bug or
an optimization?  It sounds like an optimization.

> 
> Signed-off-by: Pingfan Liu 
> Cc: Mike Kravetz 
> Cc: Oscar Salvador 
> Cc: David Hildenbrand 
> Cc: Andrew Morton 
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/hugetlb.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ac843d3..02d1978 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
>   unsigned long start_pfn, unsigned long nr_pages)
>  {
>   unsigned long i, end_pfn = start_pfn + nr_pages;
> - struct page *page;
> + struct page *page = pfn_to_page(start_pfn);
> +
> + if (PageHuge(page))
> + if (compound_order(compound_head(page)) >= nr_pages)

I don't think you want compound_order() here.

Ira

> + return false;
>  
>   for (i = start_pfn; i < end_pfn; i++) {
>   if (!pfn_valid(i))
> @@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
>   if (page_count(page) > 0)
>   return false;
>  
> - if (PageHuge(page))
> - return false;
>   }
>  
>   return true;
> -- 
> 2.7.5
>
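
One possible reading of the compound_order() remark above: compound_order()
returns a log2 order, while nr_pages is a page count, so the posted check
compares values in different units. The userspace sketch below illustrates the
mismatch. The helper names are hypothetical, and the numbers assume 4K base
pages, where a 1GB page has order 18.

```c
#include <stdbool.h>

/* Hypothetical helper: pages covered by a compound page of the given order. */
static unsigned long pages_in_order(unsigned int order)
{
	return 1UL << order;
}

/* Comparison as written in the quoted hunk: an order against a page count. */
static bool rejects_as_posted(unsigned int order, unsigned long nr_pages)
{
	return order >= nr_pages;
}

/* Comparison with both sides expressed in pages. */
static bool rejects_in_pages(unsigned int order, unsigned long nr_pages)
{
	return pages_in_order(order) >= nr_pages;
}
```

With order 18 and nr_pages = 262144, the first form does not reject the range
and the second does, which is presumably the behaviour the patch intended.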

Reminder: 30 open syzbot bugs in "net/bpf" subsystem

2019-06-23 Thread Eric Biggers

[This email was generated by a script.  Let me know if you have any suggestions
to make it better.]

Of the currently open syzbot reports against the upstream kernel, I've manually
marked 30 of them as possibly being bugs in the "net/bpf" subsystem.  I've
listed these reports below, sorted by an algorithm that tries to list first the
reports most likely to be still valid, important, and actionable.

Of these 30 bugs, 14 were seen in mainline in the last week.

Of these 30 bugs, 8 were bisected to commits from the following people:

John Fastabend 
Daniel Borkmann 
Alexei Starovoitov 

If you believe a bug is no longer valid, please close the syzbot report by
sending a '#syz fix', '#syz dup', or '#syz invalid' command in reply to the
original thread, as explained at https://goo.gl/tpsmEJ#status

If you believe I misattributed a bug to the "net/bpf" subsystem, please let me
know, and if possible forward the report to the correct people or mailing list.

Here are the bugs:


Title:  WARNING in bpf_jit_free
Last occurred:  0 days ago
Reported:   342 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=d04f9c2ec11ab2678f7427795ff5170cb9eb2220
Original thread:
https://lkml.kernel.org/lkml/e92d1805711f5...@google.com/T/#u

This bug has a C reproducer.

syzbot has bisected this bug, but I think the bisection result is incorrect.

The original thread for this bug received 5 replies; the last was 12 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+2ff1e7cb738fd3c41...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread, which had activity only 12 days ago.  For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at 
https://lkml.kernel.org/r/e92d1805711f5...@google.com


Title:  BUG: unable to handle kernel paging request in bpf_prog_kallsyms_add
Last occurred:  0 days ago
Reported:   286 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=97f89d84d528e4f5150dcfbdeb97347bc8471e96
Original thread:
https://lkml.kernel.org/lkml/9417ef0575802...@google.com/T/#u

This bug has a syzkaller reproducer only.

The original thread for this bug received 2 replies; the last was 111 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+c827a78260579449a...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/9417ef0575802...@google.com


Title:  KASAN: use-after-free Read in sk_psock_unlink
Last occurred:  0 days ago
Reported:   240 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=d691981726208716cc7aec231fb915e27763d662
Original thread:
https://lkml.kernel.org/lkml/fd342e05791cc...@google.com/T/#u

This bug has a syzkaller reproducer only.

syzbot has bisected this bug, but I think the bisection result is incorrect.

The original thread for this bug received 1 reply, 31 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+3acd9f67a6a157666...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lkml.kernel.org/r/fd342e05791cc...@google.com


Title:  WARNING: kernel stack frame pointer has bad value (2)
Last occurred:  5 days ago
Reported:   342 days ago
Branches:   Mainline and others
Dashboard link: 
https://syzkaller.appspot.com/bug?id=02a32f98a4e3b5a2ed6929aabdd28dd1618b9c03
Original thread:
https://lkml.kernel.org/lkml/0956640571197...@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug received 1 reply, 342 days ago.

If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+903cdd6bce9a6eb83...@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on

Re: [RESEND,v2] cpufreq: s5pv210: Don't flood kernel log after cpufreq change

2019-06-23 Thread Viresh Kumar

On 21-06-19, 12:10, Paweł Chmiel wrote:
> This commit replaces printk with pr_debug, so we don't flood kernel log.
> 
> Signed-off-by: Paweł Chmiel 
> Acked-by: Krzysztof Kozlowski 
> ---
> Changes from v1:
>   - Added Acked-by
> ---
>  drivers/cpufreq/s5pv210-cpufreq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied. Thanks.

-- 
viresh

Re: [PATCH] Bluetooth: btrtl: HCI reset on close for RTL8822BE

2019-06-23 Thread Daniel Drake

Hi Jian-Hong,

On Fri, Jun 21, 2019 at 4:59 PM Jian-Hong Pan  wrote:
> The Realtek RTL8822BE BT chip on the ASUS X420FA cannot be turned on
> correctly after being toggled on and off several times.  The Bluetooth
> daemon fails to set the BT mode when this issue happens.
>
> bluetoothd[1576]: Failed to set mode: Failed (0x03)
>
> If BT is turned off, then turned on again, it works correctly again.
> This patch makes RTL8822BE BT reset on close to fix this issue.

I know we've been trying to understand why Realtek's own bluetooth
driver avoids this bug, but is this solution based upon code in the
vendor driver?
At a glance I can't see the flag (or equivalent) being set there.

Daniel

Re: [PATCH -mm] mm, swap: Fix THP swap out

2019-06-23 Thread Huang, Ying

Ming Lei  writes:

> Hi Huang Ying,
>
> On Mon, Jun 24, 2019 at 10:23:36AM +0800, Huang, Ying wrote:
>> From: Huang Ying 
>> 
>> 0-Day test system reported some OOM regressions for several
>> THP (Transparent Huge Page) swap test cases.  These regressions are
>> bisected to 6861428921b5 ("block: always define BIO_MAX_PAGES as
>> 256").  In the commit, BIO_MAX_PAGES is set to 256 even when THP swap
>> is enabled.  So the bio_alloc(gfp_flags, 512) in get_swap_bio() may
>> fail when swapping out THP.  That causes the OOM.
>> 
>> As in the patch description of 6861428921b5 ("block: always define
>> BIO_MAX_PAGES as 256"), THP swap should use multi-page bvec to write
>> THP to swap space.  So the issue is fixed via doing that in
>> get_swap_bio().
>> 
>> BTW: I remember I have checked the THP swap code when
>> 6861428921b5 ("block: always define BIO_MAX_PAGES as 256") was merged,
>> and thought the THP swap code needn't be changed.  But apparently,
>> I was wrong.  I should have done this at that time.
>> 
>> Fixes: 6861428921b5 ("block: always define BIO_MAX_PAGES as 256")
>> Signed-off-by: "Huang, Ying" 
>> Cc: Ming Lei 
>> Cc: Michal Hocko 
>> Cc: Johannes Weiner 
>> Cc: Hugh Dickins 
>> Cc: Minchan Kim 
>> Cc: Rik van Riel 
>> Cc: Daniel Jordan 
>> ---
>>  mm/page_io.c | 7 ++-
>>  1 file changed, 2 insertions(+), 5 deletions(-)
>> 
>> diff --git a/mm/page_io.c b/mm/page_io.c
>> index 2e8019d0e048..4ab997f84061 100644
>> --- a/mm/page_io.c
>> +++ b/mm/page_io.c
>> @@ -29,10 +29,9 @@
>>  static struct bio *get_swap_bio(gfp_t gfp_flags,
>>  struct page *page, bio_end_io_t end_io)
>>  {
>> -int i, nr = hpage_nr_pages(page);
>>  struct bio *bio;
>>  
>> -bio = bio_alloc(gfp_flags, nr);
>> +bio = bio_alloc(gfp_flags, 1);
>>  if (bio) {
>>  struct block_device *bdev;
>>  
>> @@ -41,9 +40,7 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,
>>  bio->bi_iter.bi_sector <<= PAGE_SHIFT - 9;
>>  bio->bi_end_io = end_io;
>>  
>> -for (i = 0; i < nr; i++)
>> -bio_add_page(bio, page + i, PAGE_SIZE, 0);
>
> bio_add_page() is supposed to work; I'm just wondering why it recently doesn't.

Yes.  Just checked and bio_add_page() works too.  I should have used
that.  The problem isn't bio_add_page() but bio_alloc(): because nr ==
512 > 256, the mempool cannot be used during swapout, so swapout will fail.

Best Regards,
Huang, Ying

> Could you share a test case that reproduces it?
>
>> -VM_BUG_ON(bio->bi_iter.bi_size != PAGE_SIZE * nr);
>> +__bio_add_page(bio, page, PAGE_SIZE * hpage_nr_pages(page), 0);
>>  }
>>  return bio;
>
> Actually the above code can be simplified as:
>
> diff --git a/mm/page_io.c b/mm/page_io.c
> index 2e8019d0e048..c20b4189d0a1 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -29,7 +29,7 @@
>  static struct bio *get_swap_bio(gfp_t gfp_flags,
>   struct page *page, bio_end_io_t end_io)
>  {
> - int i, nr = hpage_nr_pages(page);
> + int nr = hpage_nr_pages(page);
>   struct bio *bio;
>  
>   bio = bio_alloc(gfp_flags, nr);
> @@ -41,8 +41,7 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,
>   bio->bi_iter.bi_sector <<= PAGE_SHIFT - 9;
>   bio->bi_end_io = end_io;
>  
> - for (i = 0; i < nr; i++)
> - bio_add_page(bio, page + i, PAGE_SIZE, 0);
> + bio_add_page(bio, page, PAGE_SIZE * nr, 0);
>   VM_BUG_ON(bio->bi_iter.bi_size != PAGE_SIZE * nr);
>   }
>   return bio;
>
>
> Thanks,
> Ming

Re: [PATCHv2] mm/gup: speed up check_and_migrate_cma_pages() on huge page

2019-06-23 Thread Ira Weiny

On Mon, Jun 24, 2019 at 12:12:41PM +0800, Pingfan Liu wrote:
> Both hugetlb and THP pages sit in pageblocks of a single migration type, since
> they are allocated from a free_list[]. Given this, it is enough to
> check a single subpage to decide the migration type of the whole huge
> page. This saves (2M/4K - 1) loop iterations for pmd_huge on x86,
> and similarly on other archs.
> 
> Furthermore, when executing isolate_huge_page(), it avoids taking the global
> hugetlb_lock many times, and the meaningless remove/add on the local linked
> list cma_page_list.
> 
> Signed-off-by: Pingfan Liu 
> Cc: Andrew Morton 
> Cc: Ira Weiny 
> Cc: Mike Rapoport 
> Cc: "Kirill A. Shutemov" 
> Cc: Thomas Gleixner 
> Cc: John Hubbard 
> Cc: "Aneesh Kumar K.V" 
> Cc: Christoph Hellwig 
> Cc: Keith Busch 
> Cc: Mike Kravetz 
> Cc: Linux-kernel@vger.kernel.org
> ---
>  mm/gup.c | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index ddde097..544f5de 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1342,19 +1342,22 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
>   LIST_HEAD(cma_page_list);
>  
>  check_again:
> - for (i = 0; i < nr_pages; i++) {
> + for (i = 0; i < nr_pages;) {
> +
> + struct page *head = compound_head(pages[i]);
> + long step = 1;
> +
> + if (PageCompound(head))
> + step = compound_order(head) - (pages[i] - head);

Sorry if I missed this last time.  compound_order() is not correct here.
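
For reference, compound_order() returns the allocation order (log2 of the page count), so the number of subpages is 1 << order. Below is a minimal userspace sketch of what the step computation presumably wants; it is not kernel code, and the helper name and its parameters are hypothetical stand-ins.

```c
/* Userspace model, not kernel code.  'order' stands in for
 * compound_order(head); 'offset' for (pages[i] - head). */
static long model_step_to_next(unsigned int order, long offset)
{
	long nr_subpages = 1L << order;	/* page count, not the order itself */

	return nr_subpages - offset;
}
```

For a 2M huge page on x86 (order 9) entered at its head, this steps over 512 entries; using the order directly would step over only 9.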

Ira

>   /*
>* If we get a page from the CMA zone, since we are going to
>* be pinning these entries, we might as well move them out
>* of the CMA zone if possible.
>*/
> - if (is_migrate_cma_page(pages[i])) {
> -
> - struct page *head = compound_head(pages[i]);
> -
> - if (PageHuge(head)) {
> + if (is_migrate_cma_page(head)) {
> + if (PageHuge(head))
>   isolate_huge_page(head, &cma_page_list);
> - } else {
> + else {
>   if (!PageLRU(head) && drain_allow) {
>   lru_add_drain_all();
>   drain_allow = false;
> @@ -1369,6 +1372,8 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
>   }
>   }
>   }
> +
> + i += step;
>   }
>  
>   if (!list_empty(&cma_page_list)) {
> -- 
> 2.7.5
>

Re: KASAN: user-memory-access Read in ip6_hold_safe (3)

2019-06-23 Thread Xin Long

On Mon, Jun 3, 2019 at 2:57 PM Dmitry Vyukov  wrote:
>
> On Sat, Jun 1, 2019 at 7:15 PM David Ahern  wrote:
> >
> > On 6/1/19 12:05 AM, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:dfb569f2 net: ll_temac: Fix compile error
> > > git tree:   net-next
> > syzbot team:
> >
> > Is there any way to know the history of syzbot runs to determine that
> > crash X did not happen at commit Y but does happen at commit Z? That
> > narrows the window when trying to find where a regression occurs.
>
> Hi David,
>
> All info is available on the dashboard:
>
> > dashboard link: https://syzkaller.appspot.com/bug?extid=a5b6e01ec8116d046842
>
> We don't keep any private info on top of that.
>
> This crash happened 129 times in the past 9 days. This suggests it
> is not fallout from an earlier memory corruption; those usually happen
> at most a few times.
> The first one was:
>
> 2019/05/24 15:33 net-next dfb569f2
>
> Then it was joined by bpf-next:
>
> ci-upstream-bpf-next-kasan-gce 2019/06/01 15:51 bpf-next 0462eaac
>
> Since it happens about a dozen times per day, most likely it was
> introduced into net-next around dfb569f2 (syzbot should do new builds
> every ~12h, minus broken trees).

I think all these pcpu memory corruptions can be marked as Fixed-by:

commit c3bcde026684c62d7a2b6f626dc7cf763833875c
Author: Xin Long 
Date:   Mon Jun 17 21:34:15 2019 +0800

tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb

Re: [PATCH 3/3] tools: memory-model: Improve data-race detection

2019-06-23 Thread Paul E. McKenney

On Sun, Jun 23, 2019 at 11:15:06AM -0400, Alan Stern wrote:
> On Sun, 23 Jun 2019, Akira Yokosawa wrote:
> 
> > Hi Paul and Alan,
> > 
> > On 2019/06/22 8:54, Paul E. McKenney wrote:
> > > On Fri, Jun 21, 2019 at 10:25:23AM -0400, Alan Stern wrote:
> > >> On Fri, 21 Jun 2019, Andrea Parri wrote:
> > >>
> > >>> On Thu, Jun 20, 2019 at 11:55:58AM -0400, Alan Stern wrote:
> >  Herbert Xu recently reported a problem concerning RCU and compiler
> >  barriers.  In the course of discussing the problem, he put forth a
> >  litmus test which illustrated a serious defect in the Linux Kernel
> >  Memory Model's data-race-detection code.
> > 
> > I was not involved in the mail thread and wondering what the litmus test
> > looked like. Some searching of the archive has suggested that Alan presented
> > a properly formatted test based on Herbert's idea in [1].
> > 
> > [1]: 
> > https://lore.kernel.org/lkml/pine.lnx.4.44l0.1906041026570.1731-100...@iolanthe.rowland.org/
> 
> Yes, that's it.  The test is also available at:
> 
> https://github.com/paulmckrcu/litmus/blob/master/manual/plain/C-S-rcunoderef-2.litmus
> 
> Alan
> 
> > If this is the case, adding the link (or message id) in the change
> > log would help people see the circumstances, I suppose.
> > Paul, can you amend the change log?
> > 
> > I ran herd7 on said litmus test at both "lkmm" and "dev" of -rcu and
> > confirmed that this patch fixes the result.
> > 
> > So,
> > 
> > Tested-by: Akira Yokosawa 

Thank you both!  I will apply these changes tomorrow morning, Pacific Time.

Thanx, Paul

Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS

2019-06-23 Thread Song Liu

Hi Hillf,

> On Jun 23, 2019, at 8:16 PM, Hillf Danton  wrote:
> 
> 
> Hello
> 
> On Sun, 23 Jun 2019 13:48:47 +0800 Song Liu wrote:
>> This patch is (hopefully) the first step to enable THP for non-shmem
>> filesystems.
>> 
>> This patch enables an application to put part of its text sections to THP
>> via madvise, for example:
>> 
>>madvise((void *)0x60, 0x20, MADV_HUGEPAGE);
>> 
>> We tried to reuse the logic for THP on tmpfs.
>> 
>> Currently, write is not supported for non-shmem THP. khugepaged will only
>> process vma with VM_DENYWRITE. The next patch will handle writes, which
>> would only happen when the vma with VM_DENYWRITE is unmapped.
>> 
>> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
>> feature.
>> 
>> Acked-by: Rik van Riel 
>> Signed-off-by: Song Liu 
>> ---
>> mm/Kconfig  | 11 ++
>> mm/filemap.c|  4 +--
>> mm/khugepaged.c | 90 -
>> mm/rmap.c   | 12 ---
>> 4 files changed, 96 insertions(+), 21 deletions(-)
>> 
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index f0c76ba47695..0a8fd589406d 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -762,6 +762,17 @@ config GUP_BENCHMARK
>> 
>>See tools/testing/selftests/vm/gup_benchmark.c
>> 
>> +config READ_ONLY_THP_FOR_FS
>> +bool "Read-only THP for filesystems (EXPERIMENTAL)"
>> +depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
>> +
> The ext4 mentioned in the cover letter, along with the subject line of
> this patch, suggests the scissoring of SHMEM.

We reuse khugepaged code for SHMEM, so the dependency does exist. 

Thanks,
Song

Re:Re: [PATCH v2] kexec: fix warnig of crash_zero_bytes in crash.c

2019-06-23 Thread Tiezhu Yang

At 2019-06-24 09:53:59, "Dave Young"  wrote:
>On 06/24/19 at 09:35am, Dave Young wrote:
>> On 06/23/19 at 06:24am, Tiezhu Yang wrote:
>> > Fix the following sparse warning:
>> > 
>> > arch/x86/kernel/crash.c:59:15:
>> > warning: symbol 'crash_zero_bytes' was not declared. Should it be static?
>> > 
>> > First, make crash_zero_bytes static. In addition, crash_zero_bytes
>> > is used when CONFIG_KEXEC_FILE is set, so make it only available
>> > under CONFIG_KEXEC_FILE. Otherwise, if CONFIG_KEXEC_FILE is not set,
>> > the following warning will appear when make crash_zero_bytes static:
>> > 
>> > arch/x86/kernel/crash.c:59:22:
>> > warning: ‘crash_zero_bytes’ defined but not used [-Wunused-variable]
>> > 
>> > Fixes: dd5f726076cc ("kexec: support for kexec on panic using new system call")
>> > Signed-off-by: Tiezhu Yang 
>> > Cc: Vivek Goyal 
>> > ---
>> >  arch/x86/kernel/crash.c | 4 +++-
>> >  1 file changed, 3 insertions(+), 1 deletion(-)
>> > 
>> > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>> > index 576b2e1..f13480e 100644
>> > --- a/arch/x86/kernel/crash.c
>> > +++ b/arch/x86/kernel/crash.c
>> > @@ -56,7 +56,9 @@ struct crash_memmap_data {
>> >   */
>> >  crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss = NULL;
>> >  EXPORT_SYMBOL_GPL(crash_vmclear_loaded_vmcss);
>> > -unsigned long crash_zero_bytes;
>> > +#ifdef CONFIG_KEXEC_FILE
>> > +static unsigned long crash_zero_bytes;
>> > +#endif
>> >  
>> >  static inline void cpu_crash_vmclear_loaded_vmcss(void)
>> >  {
>> > -- 
>> > 1.8.3.1
>> 
>> Acked-by: Dave Young 
>
>BTW, a gentle reminder: for kexec patches, it is better to Cc the kexec
>mailing list.

Thank you for reminding me of that, I will resend it with a Cc to 
ke...@lists.infradead.org.

Thanks,

>
>> 
>> Thanks
>> Dave
>>

linux-next: build failure after merge of the amdgpu tree

2019-06-23 Thread Stephen Rothwell

Hi Alex,

After merging the amdgpu tree, today's linux-next build (x86_64
allmodconfig) failed like this:

In file included from include/linux/kernel.h:15,
 from include/asm-generic/bug.h:18,
 from arch/x86/include/asm/bug.h:83,
 from include/linux/bug.h:5,
 from include/linux/mmdebug.h:5,
 from include/linux/gfp.h:5,
 from include/linux/firmware.h:7,
 from drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:23:
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c: In function 
'smu_v11_0_irq_process':
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1542:5: error: implicit 
declaration of function 'PCI_BUS_NUM' [-Werror=implicit-function-declaration]
 PCI_BUS_NUM(adev->pdev->devfn),
 ^~~
include/linux/printk.h:306:37: note: in definition of macro 'pr_warning'
  printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
 ^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1541:4: note: in expansion 
of macro 'pr_warn'
pr_warn("GPU over temperature range detected on PCIe %d:%d.%d!\n",
^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1542:27: error: 
dereferencing pointer to incomplete type 'struct pci_dev'
 PCI_BUS_NUM(adev->pdev->devfn),
   ^~
include/linux/printk.h:306:37: note: in definition of macro 'pr_warning'
  printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
 ^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1541:4: note: in expansion 
of macro 'pr_warn'
pr_warn("GPU over temperature range detected on PCIe %d:%d.%d!\n",
^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1543:5: error: implicit 
declaration of function 'PCI_SLOT'; did you mean 'CC_SET'? 
[-Werror=implicit-function-declaration]
 PCI_SLOT(adev->pdev->devfn),
 ^~~~
include/linux/printk.h:306:37: note: in definition of macro 'pr_warning'
  printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
 ^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1541:4: note: in expansion 
of macro 'pr_warn'
pr_warn("GPU over temperature range detected on PCIe %d:%d.%d!\n",
^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1544:5: error: implicit 
declaration of function 'PCI_FUNC'; did you mean 'STT_FUNC'? 
[-Werror=implicit-function-declaration]
 PCI_FUNC(adev->pdev->devfn));
 ^~~~
include/linux/printk.h:306:37: note: in definition of macro 'pr_warning'
  printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
 ^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1541:4: note: in expansion 
of macro 'pr_warn'
pr_warn("GPU over temperature range detected on PCIe %d:%d.%d!\n",
^~~
cc1: some warnings being treated as errors

Caused by commit

  5e6d266573db ("drm/amd/powerplay: add thermal ctf support for navi10")

I have used the amdgpu tree from next-20190621 for today.
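
For context, PCI_BUS_NUM(), PCI_SLOT() and PCI_FUNC() normally come in via <linux/pci.h>, so the implicit declarations above presumably mean that header is not being pulled in. The following userspace copies (renamed with a MODEL_ prefix, for illustration only) show what the helpers compute:

```c
/* Userspace copies of the kernel's devfn helpers, for illustration only. */
#define MODEL_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define MODEL_PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define MODEL_PCI_FUNC(devfn)       ((devfn) & 0x07)
#define MODEL_PCI_BUS_NUM(x)        (((x) >> 8) & 0xff)
```

A devfn packs a 5-bit slot and a 3-bit function into one byte; the bus number lives in the next byte up of a 16-bit bus/devfn value.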

-- 
Cheers,
Stephen Rothwell



[PATCH] mm/hugetlb: allow gigantic page allocation to migrate away smaller huge page

2019-06-23 Thread Pingfan Liu

The current pfn_range_valid_gigantic() rejects a pud huge page allocation
if there is a pmd huge page inside the candidate range.

But pud huge pages are a scarcer resource, since they must be aligned to 1GB
on x86. It is worth allowing pmd huge pages to be migrated away to make room
for a pud huge page.

The same logic is applied to pgd and pud huge pages.

Signed-off-by: Pingfan Liu 
Cc: Mike Kravetz 
Cc: Oscar Salvador 
Cc: David Hildenbrand 
Cc: Andrew Morton 
Cc: linux-kernel@vger.kernel.org
---
 mm/hugetlb.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ac843d3..02d1978 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
unsigned long start_pfn, unsigned long nr_pages)
 {
unsigned long i, end_pfn = start_pfn + nr_pages;
-   struct page *page;
+   struct page *page = pfn_to_page(start_pfn);
+
+   if (PageHuge(page))
+   if (compound_order(compound_head(page)) >= nr_pages)
+   return false;
 
for (i = start_pfn; i < end_pfn; i++) {
if (!pfn_valid(i))
@@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
if (page_count(page) > 0)
return false;
 
-   if (PageHuge(page))
-   return false;
}
 
return true;
-- 
2.7.5

[PATCHv2] mm/gup: speed up check_and_migrate_cma_pages() on huge page

2019-06-23 Thread Pingfan Liu

Both hugetlb and THP pages sit in pageblocks of a single migration type, since
they are allocated from a free_list[]. Given this, it is enough to
check a single subpage to decide the migration type of the whole huge
page. This saves (2M/4K - 1) loop iterations for pmd_huge on x86,
and similarly on other archs.

Furthermore, when executing isolate_huge_page(), it avoids taking the global
hugetlb_lock many times, and the meaningless remove/add on the local linked
list cma_page_list.

Signed-off-by: Pingfan Liu 
Cc: Andrew Morton 
Cc: Ira Weiny 
Cc: Mike Rapoport 
Cc: "Kirill A. Shutemov" 
Cc: Thomas Gleixner 
Cc: John Hubbard 
Cc: "Aneesh Kumar K.V" 
Cc: Christoph Hellwig 
Cc: Keith Busch 
Cc: Mike Kravetz 
Cc: Linux-kernel@vger.kernel.org
---
 mm/gup.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index ddde097..544f5de 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1342,19 +1342,22 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
LIST_HEAD(cma_page_list);
 
 check_again:
-   for (i = 0; i < nr_pages; i++) {
+   for (i = 0; i < nr_pages;) {
+
+   struct page *head = compound_head(pages[i]);
+   long step = 1;
+
+   if (PageCompound(head))
+   step = compound_order(head) - (pages[i] - head);
/*
 * If we get a page from the CMA zone, since we are going to
 * be pinning these entries, we might as well move them out
 * of the CMA zone if possible.
 */
-   if (is_migrate_cma_page(pages[i])) {
-
-   struct page *head = compound_head(pages[i]);
-
-   if (PageHuge(head)) {
+   if (is_migrate_cma_page(head)) {
+   if (PageHuge(head))
 isolate_huge_page(head, &cma_page_list);
-   } else {
+   else {
if (!PageLRU(head) && drain_allow) {
lru_add_drain_all();
drain_allow = false;
@@ -1369,6 +1372,8 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
}
}
}
+
+   i += step;
}
 
 if (!list_empty(&cma_page_list)) {
-- 
2.7.5

Re: Kirkwood PCI Express and bridges

2019-06-23 Thread Chris Packham

Hi Thomas,

On 21/06/19 6:17 PM, Thomas Petazzoni wrote:
> Hello Chris,
> 
> On Fri, 21 Jun 2019 04:03:27 +
> Chris Packham  wrote:
> 
>> I'm in the process of updating the kernel version used on our products
>> from 4.4 -> 5.1.
>>
>> We have one product that uses a Kirkwood CPU, IDT PCI bridge and Marvell
>> Switch ASIC. The Switch ASIC presents as multiple PCI devices.
>>
>> The hardware setup looks like this
>>__
>> [ Kirkwood ] --- [ IDT 5T5 ] ---+---  |  |
>> +---  |  Switch  |
>> +---  |  |
>> +---  |__|
>>
>> On the 4.4 based kernel things are fine
>>
>> [root@awplus flash]# lspci -t
>> -[:00]---01.0-[01-06]00.0-[02-06]--+-02.0-[03]00.0
>>  +-03.0-[04]00.0
>>  +-04.0-[05]00.0
>>  \-05.0-[06]00.0
>>
>> But on the 5.1 based kernel things get a little weird
>>
>> [root@awplus flash]# lspci -t
>> -[:00]---01.0-[01-06]--+-00.0-[02-06]--
>>  +-01.0
>>  +-02.0-[02-06]--
>>  +-03.0-[02-06]--
>>  +-04.0-[02-06]--
>>  +-05.0-[02-06]--
>>  +-06.0-[02-06]--
>>  +-07.0-[02-06]--
>>  +-08.0-[02-06]--
>>  +-09.0-[02-06]--
>>  +-0a.0-[02-06]--
>>  +-0b.0-[02-06]--
>>  +-0c.0-[02-06]--
>>  +-0d.0-[02-06]--
>>  +-0e.0-[02-06]--
>>  +-0f.0-[02-06]--
>>  +-10.0-[02-06]--
>>  +-11.0-[02-06]--
>>  +-12.0-[02-06]--
>>  +-13.0-[02-06]--
>>  +-14.0-[02-06]--
>>  +-15.0-[02-06]--
>>  +-16.0-[02-06]--
>>  +-17.0-[02-06]--
>>  +-18.0-[02-06]--
>>  +-19.0-[02-06]--
>>  +-1a.0-[02-06]--
>>  +-1b.0-[02-06]--
>>  +-1c.0-[02-06]--
>>  +-1d.0-[02-06]--
>>  +-1e.0-[02-06]--
>>  \-1f.0-[02-06]--+-02.0-[03]00.0
>>  +-03.0-[04]00.0
>>  +-04.0-[05]00.0
>>  \-05.0-[06]00.0
>>
>>
>> I'll start bisecting to see where things started going wrong. I just
>> wondered if this rings any bells for anyone.
> 
> I am almost sure that the culprit is
> 1f08673eef1236f7d02d93fcf596bb8531ef0d12 ("PCI: mvebu: Convert to PCI
> emulated bridge config space").

The problem seems to pre-date this commit. I've gone back as far as 4.18 
and the problem still exists (in fact there are more duplicate devices). 
I'll keep going back (unfortunately, due to our platform being out of 
tree, it's not a simple bisect).

> I still think it makes sense to share the bridge emulation code between
> the mvebu and aardvark drivers, but this sharing has required making
> the code very different, with lots of subtle differences in behavior in
> how registers are emulated.

Agreed. Bugs love to hide in duplicated code.

I will admit to being ignorant about the need for an emulated bridge. I 
know it has something to do with the type of transaction used for the 
downstream devices. I also know that these systems won't work without an 
emulated bridge.

> Unfortunately, I don't have access to one of these complicated PCI
> setups with a HW switch in the path, so I couldn't test this kind of
> setup.
> 
> Do you mind helping with figuring out what the issues are ? That would
> be really nice.

No problem. As I said I'll keep going to find a point where behaviour 
turns bad for me. I suspect we might find other problems along the way.

RE: [EXT] Re: [PATCH v7 4/5] usb: host: Stops USB controller init if PLL fails to lock

2019-06-23 Thread Yinbo Zhu



> -Original Message-
> From: Greg Kroah-Hartman [mailto:gre...@linuxfoundation.org]
> Sent: June 20, 2019 20:10
> To: Yinbo Zhu 
> Cc: Alan Stern ; Xiaobo Xie ;
> Jiafei Pan ; Ramneek Mehresh
> ; Nikhil Badola
> ; Ran Wang ;
> linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [EXT] Re: [PATCH v7 4/5] usb: host: Stops USB controller init if PLL 
> fails to
> lock
> 
> On Fri, Jun 14, 2019 at 04:54:32PM +0800, Yinbo Zhu wrote:
> > From: Ramneek Mehresh 
> >
> > USB erratum-A006918 workaround tries to start internal PHY inside
> > uboot (when PLL fails to lock). However, if the workaround also fails,
> > then USB initialization is also stopped inside Linux.
> > Erratum-A006918 workaround failure creates "fsl,erratum_a006918"
> > node in device-tree. Presence of this node in device-tree is used to
> > stop USB controller initialization in Linux
> >
> > Signed-off-by: Ramneek Mehresh 
> > Signed-off-by: Suresh Gupta 
> > Signed-off-by: Yinbo Zhu 
> > ---
> > Change in v7:
> >   keep v5 version "fall through"
> >
> >  drivers/usb/host/ehci-fsl.c  | 9 +
> >  drivers/usb/host/fsl-mph-dr-of.c | 3 ++-
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c
> > index 8f3bf3efb038..ef3dfd33a62e 100644
> > --- a/drivers/usb/host/ehci-fsl.c
> > +++ b/drivers/usb/host/ehci-fsl.c
> > @@ -236,6 +236,15 @@ static int ehci_fsl_setup_phy(struct usb_hcd *hcd,
> >   portsc |= PORT_PTS_PTW;
> >   /* fall through */
> >   case FSL_USB2_PHY_UTMI:
> > + /* Presence of this node "has_fsl_erratum_a006918"
> > +  * in device-tree is used to stop USB controller
> > +  * initialization in Linux
> > +  */
> > + if (pdata->has_fsl_erratum_a006918) {
> > + dev_warn(dev, "USB PHY clock invalid\n");
> > + return -EINVAL;
> > + }
> > +
> 
> You need a /* fall through */ comment here, right?
> 
Hi Greg,

Thanks for your feedback!

Yes, it is needed, because this case has no break. I will add a
"/* fall through */" comment in case FSL_USB2_PHY_UTMI.
Thanks

Best Regards,
Yinbo Zhu.

> thanks,
> 
> greg k-h

Re: [PATCH bpf-next] bpf: fix cgroup bpf release synchronization

2019-06-23 Thread Roman Gushchin

On Sun, Jun 23, 2019 at 08:29:21PM -0700, Alexei Starovoitov wrote:
> On 6/23/19 7:30 PM, Roman Gushchin wrote:
> > Since commit 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf
> > from cgroup itself"), cgroup_bpf release occurs asynchronously
> > (from a worker context), and before the release of the cgroup itself.
> > 
> > This introduced a previously non-existing race between the release
> > and update paths, e.g. if a leaf's cgroup_bpf is released and a new
> > bpf program is attached to one of the ancestor cgroups at the same
> > time. The race may result in double-free and other memory corruptions.
> > 
> > To fix the problem, let's protect the body of cgroup_bpf_release()
> > with cgroup_mutex, as it was effectively previously, when all this
> > code was called from the cgroup release path with cgroup mutex held.
> > 
> > Also make sure, that we don't leave already freed pointers to the
> > effective prog arrays. Otherwise, they can be released again by
> > the update path. It wasn't necessary before, because previously
> > the update path couldn't see such a cgroup, as cgroup_bpf and cgroup
> > itself were released together.
> 
> I thought dying cgroup won't have any children cgroups ?

It's not completely true, a dying cgroup can't have living children.

> It should have been empty with no tasks inside it?

Right.

> Only some resources are still held?

Right.

> mutex and zero init are highly suspicious.
> It feels that cgroup_bpf_release is called too early.

An alternative solution is to bump the refcounter on
every update path, and explicitly skip de-bpf'ed cgroups.

> 
> Thinking from another angle... if child cgroups can still attach then
> this bpf_release is broken.

Hm, what do you mean by attach? It's not possible to attach
a new prog, but if a prog is attached to a parent cgroup,
a pointer can spill through the "effective" array.

But I agree, it's broken. The update path should ignore such
cgroups (cgroups whose cgroup_bpf was released). I'll take a look.

> The code should be
> calling __cgroup_bpf_detach() one by one to make sure
> update_effective_progs() is called, since descendant are still
> sort-of alive and can attach?

Not sure I get you. Dying cgroup is a leaf cgroup.

> 
> My money is on 'too early'.
> May be cgroup is not dying ?
> Just cgroup_sk_free() is called on the last socket and
> this auto-detach logic got triggered incorrectly?

So, once again, here is my picture:

A
A/B
A/B/C

cpu1:                         cpu2:
rmdir C                       attach new prog to A
C got dying                   update A, update B, update C...
C's cgroup_bpf is released    C's effective progs is replaced with new one
                              old is double freed

It looks like it can be reproduced without any sockets.
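
To illustrate the two ideas discussed here (clear the freed pointer on release, and have the update path ignore cgroups whose cgroup_bpf was released), a minimal userspace sketch follows. It is not kernel code: the struct, the flag and both functions are hypothetical, and locking is left out entirely, whereas the real paths would need to be serialized (e.g. under cgroup_mutex).

```c
#include <stdbool.h>
#include <stdlib.h>

/* Userspace model, not kernel code: one "effective" array per cgroup. */
struct model_cgroup {
	int *effective;		/* stands in for the effective prog array */
	bool bpf_released;	/* set once the cgroup_bpf has been released */
};

/* Release path: free the array and clear the pointer so that nobody
 * can free it a second time. */
static void model_release(struct model_cgroup *cg)
{
	free(cg->effective);
	cg->effective = NULL;
	cg->bpf_released = true;
}

/* Update path: skip cgroups whose bpf state is already released.
 * Returns true if the array was replaced. */
static bool model_update(struct model_cgroup *cg, int *new_effective)
{
	if (cg->bpf_released)
		return false;	/* caller keeps ownership of new_effective */

	free(cg->effective);
	cg->effective = new_effective;
	return true;
}
```

In this model, an update that arrives after the release leaves C untouched, so the old array is freed exactly once.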

Thanks!

Re: [PATCH next] softirq: enable MAX_SOFTIRQ_TIME tuning with sysctl max_softirq_time_usecs

2019-06-23 Thread Zhiqiang Liu



On 2019/6/24 0:38, Thomas Gleixner wrote:
> Zhiqiang,
>> controlled by sysadmins to cope with hardware changes over time.
> 
> So much for the theory. See below.

Thanks for your reply.
> 
>> Correspondingly, MAX_SOFTIRQ_TIME should be tunable by sysadmins,
>> who know best about hardware performance, for the expected tradeoff between
>> latency and fairness.
>>
>> Here, we add the sysctl variable max_softirq_time_usecs to replace
>> MAX_SOFTIRQ_TIME, with a 2ms default value.
> 
> ...
> 
>>   */
>> -#define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
>> +unsigned int __read_mostly max_softirq_time_usecs = 2000;
>>  #define MAX_SOFTIRQ_RESTART 10
>>
>>  #ifdef CONFIG_TRACE_IRQFLAGS
>> @@ -248,7 +249,8 @@ static inline void lockdep_softirq_end(bool in_hardirq) { }
>>
>>  asmlinkage __visible void __softirq_entry __do_softirq(void)
>>  {
>> -unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
>> +unsigned long end = jiffies +
>> +usecs_to_jiffies(max_softirq_time_usecs);
> 
> That's still jiffies based and therefore depends on CONFIG_HZ. Any budget
> value will be rounded up to the next jiffy. So in case of HZ=100 and
> time=1000us this will still result in 10ms of allowed loop time.
> 
> I'm not saying that we must use a more fine grained time source, but both
> the changelog and the sysctl documentation are misleading.
> 
> If we keep it jiffies based, then microseconds do not make any sense. They
> just give a false sense of controllability.
> 
> Keep also in mind that with jiffies the accuracy depends also on the
> distance to the next tick when 'end' is evaluated. The next tick might be
> imminent.
> 
> That's all information which needs to be in the documentation.
> 

Thanks again for your detailed advice.
As you said, setting max_softirq_time_usecs without explaining its
relationship with CONFIG_HZ gives a false sense of controllability. And
the time accuracy of jiffies means the actual value can differ from the
max_softirq_time_usecs setting by up to one jiffy.

I will add this information to the sysctl documentation and changelog in the
v2 patch.
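
To make the rounding concrete, here is a minimal userspace sketch. It is a model only, not the kernel's usecs_to_jiffies(); both helper names are made up, and it assumes HZ divides 1000000 evenly.

```c
/* Userspace model of the rounding, not the kernel's usecs_to_jiffies().
 * Assumes hz divides 1000000 evenly (true for 100, 250 and 1000). */
static unsigned long model_usecs_to_jiffies(unsigned long usecs, unsigned long hz)
{
	unsigned long usecs_per_jiffy = 1000000UL / hz;

	/* Round up to a whole number of ticks. */
	return (usecs + usecs_per_jiffy - 1) / usecs_per_jiffy;
}

/* Budget after rounding, expressed back in microseconds. */
static unsigned long model_effective_usecs(unsigned long usecs, unsigned long hz)
{
	return model_usecs_to_jiffies(usecs, hz) * (1000000UL / hz);
}
```

With HZ=100, a 1000us setting becomes one 10ms jiffy; with HZ=1000, the 2000us default stays at 2ms. The time actually spent can still be shorter than the rounded budget, depending on how close the next tick is when 'end' is computed.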

>> +{
>> +.procname   = "max_softirq_time_usecs",
>> +.data   = &max_softirq_time_usecs,
>> +.maxlen = sizeof(unsigned int),
>> +.mode   = 0644,
>> +.proc_handler   = proc_dointvec_minmax,
>> +.extra1 = ,
>> +},
> 
> Zero as the lower limit? That means it allows a single loop. Fine, but
> needs to be documented as well.
> 
> Thanks,
> 
>   tglx
> 
> .
>

[PATCH v2] flow_dissector: Fix vlan header offset in __skb_flow_dissect

2019-06-23 Thread YueHaibing

We build a vlan on top of a bonding interface whose vlan offload
is off; the bond mode is 802.3ad (LACP) and xmit_hash_policy is
BOND_XMIT_POLICY_ENCAP34.

__skb_flow_dissect() fails to get information from protocol headers
encapsulated within the vlan, because 'nhoff' points to the IP header,
so bond hashing is based on layer 2 info, which fails to distribute
packets across slaves.

Fixes: d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
Signed-off-by: YueHaibing 
---
v2: remove redundant spaces
---
 net/core/flow_dissector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 01ad60b..ff85934 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -998,6 +998,9 @@ bool __skb_flow_dissect(const struct net *net,
skb && skb_vlan_tag_present(skb)) {
proto = skb->protocol;
} else {
+   if (dissector_vlan == FLOW_DISSECTOR_KEY_MAX)
+   nhoff -= sizeof(*vlan);
+
vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan),
data, hlen, &_vlan);
if (!vlan) {
-- 
2.7.4

Re: linux-next: build failure after merge of the net-next tree

2019-06-23 Thread Palmer Dabbelt


On Sun, 23 Jun 2019 20:12:45 PDT (-0700), Stephen Rothwell wrote:

Hi all,

On Thu, 20 Jun 2019 19:13:48 +1000 Stephen Rothwell  
wrote:


After merging the net-next tree, today's linux-next build (powerpc
allyesconfig) failed like this:

drivers/net/ethernet/cadence/macb_main.c:48:16: error: field 'hw' has 
incomplete type
  struct clk_hw hw;
^~
drivers/net/ethernet/cadence/macb_main.c:4003:21: error: variable 
'fu540_c000_ops' has initializer but incomplete type
 static const struct clk_ops fu540_c000_ops = {
 ^~~
drivers/net/ethernet/cadence/macb_main.c:4004:3: error: 'const struct clk_ops' 
has no member named 'recalc_rate'
  .recalc_rate = fu540_macb_tx_recalc_rate,
   ^~~
drivers/net/ethernet/cadence/macb_main.c:4004:17: warning: excess elements in 
struct initializer
  .recalc_rate = fu540_macb_tx_recalc_rate,
 ^
drivers/net/ethernet/cadence/macb_main.c:4004:17: note: (near initialization 
for 'fu540_c000_ops')
drivers/net/ethernet/cadence/macb_main.c:4005:3: error: 'const struct clk_ops' 
has no member named 'round_rate'
  .round_rate = fu540_macb_tx_round_rate,
   ^~
drivers/net/ethernet/cadence/macb_main.c:4005:16: warning: excess elements in 
struct initializer
  .round_rate = fu540_macb_tx_round_rate,
^~~~
drivers/net/ethernet/cadence/macb_main.c:4005:16: note: (near initialization 
for 'fu540_c000_ops')
drivers/net/ethernet/cadence/macb_main.c:4006:3: error: 'const struct clk_ops' 
has no member named 'set_rate'
  .set_rate = fu540_macb_tx_set_rate,
   ^~~~
drivers/net/ethernet/cadence/macb_main.c:4006:14: warning: excess elements in 
struct initializer
  .set_rate = fu540_macb_tx_set_rate,
  ^~
drivers/net/ethernet/cadence/macb_main.c:4006:14: note: (near initialization 
for 'fu540_c000_ops')
drivers/net/ethernet/cadence/macb_main.c: In function 'fu540_c000_clk_init':
drivers/net/ethernet/cadence/macb_main.c:4013:23: error: storage size of 'init' 
isn't known
  struct clk_init_data init;
   ^~~~
drivers/net/ethernet/cadence/macb_main.c:4032:12: error: implicit declaration 
of function 'clk_register'; did you mean 'sock_register'? 
[-Werror=implicit-function-declaration]
  *tx_clk = clk_register(NULL, >hw);
^~~~
sock_register
drivers/net/ethernet/cadence/macb_main.c:4013:23: warning: unused variable 
'init' [-Wunused-variable]
  struct clk_init_data init;
   ^~~~
drivers/net/ethernet/cadence/macb_main.c: In function 'macb_probe':
drivers/net/ethernet/cadence/macb_main.c:4366:2: error: implicit declaration of 
function 'clk_unregister'; did you mean 'sock_unregister'? 
[-Werror=implicit-function-declaration]
  clk_unregister(tx_clk);
  ^~
  sock_unregister
drivers/net/ethernet/cadence/macb_main.c: At top level:
drivers/net/ethernet/cadence/macb_main.c:4003:29: error: storage size of 
'fu540_c000_ops' isn't known
 static const struct clk_ops fu540_c000_ops = {
 ^~

Caused by commit

  c218ad559020 ("macb: Add support for SiFive FU540-C000")

CONFIG_COMMON_CLK is not set for this build.

I have reverted that commit for today.


I am still reverting that commit.  Has this problem been fixed in some
subtle way?


I don't think so.  I'm assuming something like this is necessary

diff --git a/drivers/net/ethernet/cadence/Kconfig 
b/drivers/net/ethernet/cadence/Kconfig
index 1766697c9c5a..d13db9e9c818 100644
--- a/drivers/net/ethernet/cadence/Kconfig
+++ b/drivers/net/ethernet/cadence/Kconfig
@@ -23,6 +23,7 @@ config MACB
   tristate "Cadence MACB/GEM support"
   depends on HAS_DMA
   select PHYLIB
+   depends on COMMON_CLK
   ---help---
 The Cadence MACB ethernet interface is found on many Atmel AT32 and
 AT91 parts.  This driver also supports the Cadence GEM (Gigabit
@@ -42,7 +43,7 @@ config MACB_USE_HWSTAMP

config MACB_PCI
   tristate "Cadence PCI MACB/GEM support"
-   depends on MACB && PCI && COMMON_CLK
+   depends on MACB && PCI
   ---help---
 This is PCI wrapper for MACB driver.

at a minimum, though it may be saner to #ifdef support for the SiFive clock
driver as that's only useful on some systems.  Assuming I can reproduce the
build failure (which shouldn't be too hard), I'll send out a patch that adds a
Kconfig for the FU540 clock driver to avoid adding a COMMON_CLK dependency for
all MACB systems.
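
The #ifdef alternative mentioned above can be sketched in plain userspace C. This is only an illustration: SKETCH_HAVE_COMMON_CLK is a hypothetical stand-in for CONFIG_COMMON_CLK, and the function names are invented for the example; it is not the actual macb change.

```c
#include <errno.h>

/* Hypothetical stand-in for CONFIG_COMMON_CLK; 0 models the failing build. */
#ifndef SKETCH_HAVE_COMMON_CLK
#define SKETCH_HAVE_COMMON_CLK 0
#endif

#if SKETCH_HAVE_COMMON_CLK
/* Real clock registration would live here when the framework is built in. */
static int sketch_fu540_clk_init(void)
{
	return 0;
}
#else
/* Stub: the driver still compiles; the FU540 path reports "no device". */
static int sketch_fu540_clk_init(void)
{
	return -ENODEV;
}
#endif

/* Probe path shared by every variant; only FU540 needs the clock code. */
static int sketch_macb_probe(int is_fu540)
{
	if (is_fu540)
		return sketch_fu540_clk_init();
	return 0;
}
```

With a guard like this, systems without the common clock framework would still build the driver, and only the SiFive-specific path would be unavailable.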

[PATCH] node: Fix warning while make xmldocs

2019-06-23 Thread Masanari Iida

This patch fixes the following warnings from make xmldocs.
./drivers/base/node.c:690: warning: Excess function parameter
 'mem_node' description in 'register_memory_node_under_compute_node'
./drivers/base/node.c:690: warning: Excess function parameter
 'cpu_node' description in 'register_memory_node_under_compute_node'

Signed-off-by: Masanari Iida 
---
 drivers/base/node.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index d8c02e65df68..944ee45d122f 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -673,8 +673,8 @@ int register_cpu_under_node(unsigned int cpu, unsigned int 
nid)
 /**
  * register_memory_node_under_compute_node - link memory node to its compute
  *  node for a given access class.
- * @mem_node:  Memory node number
- * @cpu_node:  Cpu  node number
+ * @mem_nid:   Memory node identifier
+ * @cpu_nid:   Cpu  node identifier
  * @access:Access class to register
  *
  * Description:
-- 
2.22.0.190.ga6a95cd1b46e

[PATCH] tpm: Get TCG log from TPM2 ACPI table for tpm2 systems

2019-06-23 Thread Jordan Hand

For TPM2-based systems, retrieve the TCG log from the TPM2 ACPI table.

Signed-off-by: Jordan Hand 
---
 drivers/char/tpm/eventlog/acpi.c | 67 +++-
 1 file changed, 48 insertions(+), 19 deletions(-)

diff --git a/drivers/char/tpm/eventlog/acpi.c b/drivers/char/tpm/eventlog/acpi.c
index 63ada5e53f13..942d282e2738 100644
--- a/drivers/char/tpm/eventlog/acpi.c
+++ b/drivers/char/tpm/eventlog/acpi.c
@@ -41,17 +41,31 @@ struct acpi_tcpa {
};
 };
 
+struct acpi_tpm2 {
+   struct acpi_table_header hdr;
+   u16 platform_class;
+   u16 reserved;
+   u64 control_area_addr;
+   u32 start_method;
+   u8 start_method_params[12];
+   u32 log_max_len;
+   u64 log_start_addr;
+} __packed;
+
 /* read binary bios log */
 int tpm_read_log_acpi(struct tpm_chip *chip)
 {
-   struct acpi_tcpa *buff;
+   struct acpi_table_header *buff;
+   struct acpi_tcpa *tcpa;
+   struct acpi_tpm2 *tpm2;
+
acpi_status status;
void __iomem *virt;
u64 len, start;
+   int log_type;
struct tpm_bios_log *log;
-
-   if (chip->flags & TPM_CHIP_FLAG_TPM2)
-   return -ENODEV;
+   bool is_tpm2 = chip->flags & TPM_CHIP_FLAG_TPM2;
+   acpi_string table_sig;
 
log = &chip->log;
 
@@ -61,26 +75,41 @@ int tpm_read_log_acpi(struct tpm_chip *chip)
if (!chip->acpi_dev_handle)
return -ENODEV;
 
-   /* Find TCPA entry in RSDT (ACPI_LOGICAL_ADDRESSING) */
-   status = acpi_get_table(ACPI_SIG_TCPA, 1,
-   (struct acpi_table_header **)&buff);
+   /* Find TCPA or TPM2 entry in RSDT (ACPI_LOGICAL_ADDRESSING) */
+   table_sig = is_tpm2 ? ACPI_SIG_TPM2 : ACPI_SIG_TCPA;
+   status = acpi_get_table(table_sig, 1, &buff);
 
if (ACPI_FAILURE(status))
return -ENODEV;
 
-   switch(buff->platform_class) {
-   case BIOS_SERVER:
-   len = buff->server.log_max_len;
-   start = buff->server.log_start_addr;
-   break;
-   case BIOS_CLIENT:
-   default:
-   len = buff->client.log_max_len;
-   start = buff->client.log_start_addr;
-   break;
+   /* If log_max_len and log_start_addr are set, start_method_params will
+* be 12 bytes, according to TCG ACPI spec. If start_method_params is
+* fewer than 12 bytes, the TCG log is not available
+*/
+   if (is_tpm2 && (buff->length == sizeof(acpi_tpm2))) {
+   tpm2 = (struct acpi_tpm2 *)buff;
+   len = tpm2->log_max_len;
+   start = tpm2->log_start_addr;
+   log_type = EFI_TCG2_EVENT_LOG_FORMAT_TCG_2;
+   } else {
+   tcpa = (struct acpi_tcpa *)buff;
+   switch (tcpa->platform_class) {
+   case BIOS_SERVER:
+   len = tcpa->server.log_max_len;
+   start = tcpa->server.log_start_addr;
+   break;
+   case BIOS_CLIENT:
+   default:
+   len = tcpa->client.log_max_len;
+   start = tcpa->client.log_start_addr;
+   break;
+   }
+   log_type = EFI_TCG2_EVENT_LOG_FORMAT_TCG_1_2;
}
+
if (!len) {
-   dev_warn(&chip->dev, "%s: TCPA log area empty\n", __func__);
+   dev_warn(&chip->dev, "%s: %s log area empty\n",
+   table_sig, __func__);
return -EIO;
}
 
@@ -98,7 +127,7 @@ int tpm_read_log_acpi(struct tpm_chip *chip)
memcpy_fromio(log->bios_event_log, virt, len);
 
acpi_os_unmap_iomem(virt, len);
-   return EFI_TCG2_EVENT_LOG_FORMAT_TCG_1_2;
+   return log_type;
 
 err:
kfree(log->bios_event_log);
-- 
2.20.1
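
The table-length check in the patch above can be sketched in userspace C as follows. The struct names are hypothetical; the header layout follows the standard 36-byte ACPI table header, and the TPM2 fields mirror the struct added by the patch. The sketch spells the operand as sizeof(struct ...), which is what C requires for a struct tag, and relies on the GCC/Clang packed attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 36-byte stand-in for struct acpi_table_header. */
struct sketch_acpi_header {
	char     signature[4];
	uint32_t length;              /* total table length in bytes */
	uint8_t  revision;
	uint8_t  checksum;
	char     oem_id[6];
	char     oem_table_id[8];
	uint32_t oem_revision;
	char     asl_compiler_id[4];
	uint32_t asl_compiler_revision;
} __attribute__((packed));

/* Mirrors the field list of struct acpi_tpm2 in the patch. */
struct sketch_acpi_tpm2 {
	struct sketch_acpi_header hdr;
	uint16_t platform_class;
	uint16_t reserved;
	uint64_t control_area_addr;
	uint32_t start_method;
	uint8_t  start_method_params[12];
	uint32_t log_max_len;
	uint64_t log_start_addr;
} __attribute__((packed));

/*
 * Returns 1 and fills len/start only when the table is exactly as long as
 * the full structure, i.e. when the log fields are present.
 */
static int sketch_tpm2_log_area(const struct sketch_acpi_header *tbl,
				uint64_t *len, uint64_t *start)
{
	const struct sketch_acpi_tpm2 *tpm2;

	if (tbl->length != sizeof(struct sketch_acpi_tpm2))
		return 0;

	tpm2 = (const struct sketch_acpi_tpm2 *)tbl;
	*len = tpm2->log_max_len;
	*start = tpm2->log_start_addr;
	return 1;
}
```

A shorter table (fewer than 12 bytes of start method parameters) fails the length check, so the caller never reads log fields that are not there.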

[PATCHv3 1/1] coresight: Do not default to CPU0 for missing CPU phandle

2019-06-23 Thread Sai Prakash Ranjan

Coresight platform support assumes that a missing "cpu" phandle
defaults to CPU0. This could be problematic and unnecessarily binds
components to CPU0, where they may not be. Let us make the DT binding
rules a bit stricter by not defaulting to CPU0 for missing "cpu"
affinity information.

Also in coresight etm and cpu-debug drivers, abort the probe
for such cases.

Signed-off-by: Sai Prakash Ranjan 
---
 .../bindings/arm/coresight-cpu-debug.txt |  4 ++--
 .../devicetree/bindings/arm/coresight.txt|  8 +---
 .../hwtracing/coresight/coresight-cpu-debug.c|  3 +++
 drivers/hwtracing/coresight/coresight-etm3x.c|  3 +++
 drivers/hwtracing/coresight/coresight-etm4x.c|  3 +++
 drivers/hwtracing/coresight/coresight-platform.c | 16 
 6 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/Documentation/devicetree/bindings/arm/coresight-cpu-debug.txt 
b/Documentation/devicetree/bindings/arm/coresight-cpu-debug.txt
index 298291211ea4..f1de3247c1b7 100644
--- a/Documentation/devicetree/bindings/arm/coresight-cpu-debug.txt
+++ b/Documentation/devicetree/bindings/arm/coresight-cpu-debug.txt
@@ -26,8 +26,8 @@ Required properties:
processor core is clocked by the internal CPU clock, so it
is enabled with CPU clock by default.
 
-- cpu : the CPU phandle the debug module is affined to. When omitted
-   the module is considered to belong to CPU0.
+- cpu : the CPU phandle the debug module is affined to. Do not assume it
+to default to CPU0 if omitted.
 
 Optional properties:
 
diff --git a/Documentation/devicetree/bindings/arm/coresight.txt 
b/Documentation/devicetree/bindings/arm/coresight.txt
index 8a88ddebc1a2..fcc3bacfd8bc 100644
--- a/Documentation/devicetree/bindings/arm/coresight.txt
+++ b/Documentation/devicetree/bindings/arm/coresight.txt
@@ -59,6 +59,11 @@ its hardware characteristcs.
 
* port or ports: see "Graph bindings for Coresight" below.
 
+* Additional required property for Embedded Trace Macrocell (version 3.x and
+  version 4.x):
+   * cpu: the cpu phandle this ETM/PTM is affined to. Do not
+ assume it to default to CPU0 if omitted.
+
 * Additional required properties for System Trace Macrocells (STM):
* reg: along with the physical base address and length of the register
  set as described above, another entry is required to describe the
@@ -87,9 +92,6 @@ its hardware characteristcs.
* arm,cp14: must be present if the system accesses ETM/PTM management
  registers via co-processor 14.
 
-   * cpu: the cpu phandle this ETM/PTM is affined to. When omitted the
- source is considered to belong to CPU0.
-
 * Optional property for TMC:
 
* arm,buffer-size: size of contiguous buffer space for TMC ETR
diff --git a/drivers/hwtracing/coresight/coresight-cpu-debug.c 
b/drivers/hwtracing/coresight/coresight-cpu-debug.c
index 07a1367c733f..58bfd6319f65 100644
--- a/drivers/hwtracing/coresight/coresight-cpu-debug.c
+++ b/drivers/hwtracing/coresight/coresight-cpu-debug.c
@@ -579,6 +579,9 @@ static int debug_probe(struct amba_device *adev, const 
struct amba_id *id)
return -ENOMEM;
 
drvdata->cpu = coresight_get_cpu(dev);
+   if (drvdata->cpu < 0)
+   return drvdata->cpu;
+
if (per_cpu(debug_drvdata, drvdata->cpu)) {
dev_err(dev, "CPU%d drvdata has already been initialized\n",
drvdata->cpu);
diff --git a/drivers/hwtracing/coresight/coresight-etm3x.c 
b/drivers/hwtracing/coresight/coresight-etm3x.c
index 225c2982e4fe..e2cb6873c3f2 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x.c
@@ -816,6 +816,9 @@ static int etm_probe(struct amba_device *adev, const struct 
amba_id *id)
}
 
drvdata->cpu = coresight_get_cpu(dev);
+   if (drvdata->cpu < 0)
+   return drvdata->cpu;
+
desc.name  = devm_kasprintf(dev, GFP_KERNEL, "etm%d", drvdata->cpu);
if (!desc.name)
return -ENOMEM;
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c 
b/drivers/hwtracing/coresight/coresight-etm4x.c
index 7fe266194ab5..7bcac8896fc1 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -1101,6 +1101,9 @@ static int etm4_probe(struct amba_device *adev, const 
struct amba_id *id)
spin_lock_init(&drvdata->spinlock);
 
drvdata->cpu = coresight_get_cpu(dev);
+   if (drvdata->cpu < 0)
+   return drvdata->cpu;
+
desc.name = devm_kasprintf(dev, GFP_KERNEL, "etm%d", drvdata->cpu);
if (!desc.name)
return -ENOMEM;
diff --git a/drivers/hwtracing/coresight/coresight-platform.c 
b/drivers/hwtracing/coresight/coresight-platform.c
index 3c5ceda8db24..4990da2c13e9 100644
--- a/drivers/hwtracing/coresight/coresight-platform.c
+++

[PATCHv3 0/1] coresight: Do not default to CPU0 for missing CPU phandle

2019-06-23 Thread Sai Prakash Ranjan

When the CPU phandle is missing, the affinity defaults to CPU0, which
is not a correct assumption. Fix this in the coresight platform code
by setting the affinity to invalid, and abort the probe in the drivers.
Also update the dt-bindings accordingly.

v3:
 * Addressed review comments from Suzuki and updated
   acpi_coresight_get_cpu.
 * Removed patch 2 which had invalid check for online
   cpus.

v2:
 * Addressed review comments from Suzuki and Mathieu.
 * Allows the probe of etm and cpu-debug to abort earlier
   in case of unavailability of respective cpus.

Sai Prakash Ranjan (1):
  coresight: Do not default to CPU0 for missing CPU phandle

 .../bindings/arm/coresight-cpu-debug.txt |  4 ++--
 .../devicetree/bindings/arm/coresight.txt|  8 +---
 .../hwtracing/coresight/coresight-cpu-debug.c|  3 +++
 drivers/hwtracing/coresight/coresight-etm3x.c|  3 +++
 drivers/hwtracing/coresight/coresight-etm4x.c|  3 +++
 drivers/hwtracing/coresight/coresight-platform.c | 16 
 6 files changed, 24 insertions(+), 13 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
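
The behaviour change described in this cover letter can be sketched in userspace C. The types and function names are hypothetical, and the -ENODEV value is an assumption, since the coresight-platform.c hunk is truncated in this archive.

```c
#include <errno.h>

/* Hypothetical device description: has_cpu_phandle is 0 when "cpu" is missing. */
struct sketch_dev {
	int has_cpu_phandle;
	int cpu;
};

/* Old behaviour: silently fall back to CPU0. */
static int sketch_get_cpu_old(const struct sketch_dev *dev)
{
	return dev->has_cpu_phandle ? dev->cpu : 0;
}

/* New behaviour: report the missing affinity as an error (assumed -ENODEV). */
static int sketch_get_cpu_new(const struct sketch_dev *dev)
{
	return dev->has_cpu_phandle ? dev->cpu : -ENODEV;
}

/* Probe aborts early on a negative CPU, as the etm and cpu-debug hunks do. */
static int sketch_probe(const struct sketch_dev *dev)
{
	int cpu = sketch_get_cpu_new(dev);

	if (cpu < 0)
		return cpu;
	return 0;
}
```

The point of the stricter lookup is that a component with no affinity information is no longer bound to CPU0 by accident.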

Re: [PATCH -mm] mm, swap: Fix THP swap out

2019-06-23 Thread Ming Lei

Hi Huang Ying,

On Mon, Jun 24, 2019 at 10:23:36AM +0800, Huang, Ying wrote:
> From: Huang Ying 
> 
> 0-Day test system reported some OOM regressions for several
> THP (Transparent Huge Page) swap test cases.  These regressions are
> bisected to 6861428921b5 ("block: always define BIO_MAX_PAGES as
> 256").  In the commit, BIO_MAX_PAGES is set to 256 even when THP swap
> is enabled.  So the bio_alloc(gfp_flags, 512) in get_swap_bio() may
> fail when swapping out THP.  That causes the OOM.
> 
> As in the patch description of 6861428921b5 ("block: always define
> BIO_MAX_PAGES as 256"), THP swap should use multi-page bvec to write
> THP to swap space.  So the issue is fixed via doing that in
> get_swap_bio().
> 
> BTW: I remember I have checked the THP swap code when
> 6861428921b5 ("block: always define BIO_MAX_PAGES as 256") was merged,
> and thought the THP swap code needn't to be changed.  But apparently,
> I was wrong.  I should have done this at that time.
> 
> Fixes: 6861428921b5 ("block: always define BIO_MAX_PAGES as 256")
> Signed-off-by: "Huang, Ying" 
> Cc: Ming Lei 
> Cc: Michal Hocko 
> Cc: Johannes Weiner 
> Cc: Hugh Dickins 
> Cc: Minchan Kim 
> Cc: Rik van Riel 
> Cc: Daniel Jordan 
> ---
>  mm/page_io.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/page_io.c b/mm/page_io.c
> index 2e8019d0e048..4ab997f84061 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -29,10 +29,9 @@
>  static struct bio *get_swap_bio(gfp_t gfp_flags,
>   struct page *page, bio_end_io_t end_io)
>  {
> - int i, nr = hpage_nr_pages(page);
>   struct bio *bio;
>  
> - bio = bio_alloc(gfp_flags, nr);
> + bio = bio_alloc(gfp_flags, 1);
>   if (bio) {
>   struct block_device *bdev;
>  
> @@ -41,9 +40,7 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,
>   bio->bi_iter.bi_sector <<= PAGE_SHIFT - 9;
>   bio->bi_end_io = end_io;
>  
> - for (i = 0; i < nr; i++)
> - bio_add_page(bio, page + i, PAGE_SIZE, 0);

bio_add_page() is supposed to work here; I'm just wondering why it doesn't any more.

Could you share a test case that reproduces it?

> - VM_BUG_ON(bio->bi_iter.bi_size != PAGE_SIZE * nr);
> + __bio_add_page(bio, page, PAGE_SIZE * hpage_nr_pages(page), 0);
>   }
>   return bio;

Actually the above code can be simplified as:

diff --git a/mm/page_io.c b/mm/page_io.c
index 2e8019d0e048..c20b4189d0a1 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -29,7 +29,7 @@
 static struct bio *get_swap_bio(gfp_t gfp_flags,
struct page *page, bio_end_io_t end_io)
 {
-   int i, nr = hpage_nr_pages(page);
+   int nr = hpage_nr_pages(page);
struct bio *bio;
 
bio = bio_alloc(gfp_flags, nr);
@@ -41,8 +41,7 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,
bio->bi_iter.bi_sector <<= PAGE_SHIFT - 9;
bio->bi_end_io = end_io;
 
-   for (i = 0; i < nr; i++)
-   bio_add_page(bio, page + i, PAGE_SIZE, 0);
+   bio_add_page(bio, page, PAGE_SIZE * nr, 0);
VM_BUG_ON(bio->bi_iter.bi_size != PAGE_SIZE * nr);
}
return bio;


Thanks,
Ming
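
The arithmetic behind the quoted commit message can be sketched as follows. This does not use the real bio API; the constants assume 4 KiB base pages and a 512-page THP, and every name is hypothetical.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE     4096u  /* assumed 4 KiB base pages */
#define SKETCH_BIO_MAX_PAGES 256u   /* limit named in the commit message */

/* One bvec per base page: models the old loop calling bio_add_page() nr times. */
static unsigned int sketch_bvecs_per_page(unsigned int nr_pages)
{
	return nr_pages;
}

/* One multi-page bvec covering the physically contiguous huge page. */
static unsigned int sketch_bvecs_multipage(unsigned int nr_pages)
{
	(void)nr_pages;
	return 1;
}

/* Total bytes described by the bio; identical in both schemes. */
static size_t sketch_bio_bytes(unsigned int nr_pages)
{
	return (size_t)SKETCH_PAGE_SIZE * nr_pages;
}

/* Models an allocation that fails when more vecs than the limit are requested. */
static int sketch_alloc_fits(unsigned int nr_vecs)
{
	return nr_vecs <= SKETCH_BIO_MAX_PAGES;
}
```

Under these assumptions a 512-page THP needs 512 vecs in the per-page scheme, which exceeds the limit of 256, while a single multi-page bvec describes the same 2 MiB.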

[PATCH v2] x86/speculation/mds: Eliminate leaks by trace_hardirqs_on()

2019-06-23 Thread Zhenzhong Duan

Move mds_idle_clear_cpu_buffers() after trace_hardirqs_on() to ensure
all store buffer entries are flushed.

Signed-off-by: Zhenzhong Duan 
---
-v2: remove pointless changes which made a double flush

 arch/x86/include/asm/mwait.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index eb0f80c..e28f8b7 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -86,9 +86,9 @@ static inline void __mwaitx(unsigned long eax, unsigned long 
ebx,
 
 static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
 {
-   mds_idle_clear_cpu_buffers();
-
trace_hardirqs_on();
+
+   mds_idle_clear_cpu_buffers();
/* "mwait %eax, %ecx;" */
asm volatile("sti; .byte 0x0f, 0x01, 0xc9;"
 :: "a" (eax), "c" (ecx));
-- 
1.8.3.1

[PATCH] x86/speculation/mds: Avoid clearing CPU buffers in native machine with old microcode

2019-06-23 Thread Zhenzhong Duan

Commit 22dd8365088b ("x86/speculation/mds: Add mitigation mode VMWERV") added
an internal mitigation mode, VMWERV, which enables the invocation of the CPU
buffer clearing even if X86_FEATURE_MD_CLEAR is not set.

This unnecessarily wastes a few CPU cycles on a native machine with old
microcode. Avoid it by checking whether the kernel is running on a native machine.

Signed-off-by: Zhenzhong Duan 
---
 arch/x86/kernel/cpu/bugs.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 03b4cc0..03f5a77 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -233,7 +233,9 @@ static void x86_amd_ssb_disable(void)
 
 static void __init mds_select_mitigation(void)
 {
-   if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) {
+   if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off() ||
+   (hypervisor_is_type(X86_HYPER_NATIVE) &&
+   !boot_cpu_has(X86_FEATURE_MD_CLEAR))) {
mds_mitigation = MDS_MITIGATION_OFF;
return;
}
-- 
1.8.3.1
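
The patched condition in mds_select_mitigation() can be read as a pure predicate. The sketch below models it in userspace C with a hypothetical struct whose booleans stand in for the kernel helpers named in the comments.

```c
#include <stdbool.h>

struct sketch_cpu {
	bool has_mds_bug;      /* models boot_cpu_has_bug(X86_BUG_MDS) */
	bool mitigations_off;  /* models cpu_mitigations_off() */
	bool native;           /* models hypervisor_is_type(X86_HYPER_NATIVE) */
	bool has_md_clear;     /* models boot_cpu_has(X86_FEATURE_MD_CLEAR) */
};

/* True when the patched condition would select MDS_MITIGATION_OFF. */
static bool sketch_mds_off(const struct sketch_cpu *c)
{
	return !c->has_mds_bug || c->mitigations_off ||
	       (c->native && !c->has_md_clear);
}
```

In this model, a bare-metal machine with old microcode gets the mitigation switched off, while a guest with the same CPU flags keeps it, since the hypervisor may be running on updated microcode.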

Re: [PATCH bpf-next] bpf: fix cgroup bpf release synchronization

2019-06-23 Thread Alexei Starovoitov

On 6/23/19 7:30 PM, Roman Gushchin wrote:
> Since commit 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf
> from cgroup itself"), cgroup_bpf release occurs asynchronously
> (from a worker context), and before the release of the cgroup itself.
> 
> This introduced a previously non-existing race between the release
> and update paths. E.g. if a leaf's cgroup_bpf is released and a new
> bpf program is attached to the one of ancestor cgroups at the same
> time. The race may result in double-free and other memory corruptions.
> 
> To fix the problem, let's protect the body of cgroup_bpf_release()
> with cgroup_mutex, as it was effectively previously, when all this
> code was called from the cgroup release path with cgroup mutex held.
> 
> Also make sure, that we don't leave already freed pointers to the
> effective prog arrays. Otherwise, they can be released again by
> the update path. It wasn't necessary before, because previously
> the update path couldn't see such a cgroup, as cgroup_bpf and cgroup
> itself were released together.

I thought a dying cgroup won't have any child cgroups?
It should have been empty with no tasks inside it?
Only some resources are still held?
The mutex and zero init are highly suspicious.
It feels like cgroup_bpf_release is called too early.

Thinking from another angle... if child cgroups can still attach then
this bpf_release is broken. The code should be
calling __cgroup_bpf_detach() one by one to make sure
update_effective_progs() is called, since descendants are still
sort-of alive and can attach?

My money is on 'too early'.
Maybe the cgroup is not dying?
Just cgroup_sk_free() is called on the last socket and
this auto-detach logic got triggered incorrectly?

linux-next: manual merge of the spi-nor tree with Linus' tree

2019-06-23 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the spi-nor tree got a conflict in:

  drivers/mtd/spi-nor/stm32-quadspi.c

between commit:

  caab277b1de0 ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 
234")

from Linus' tree and commit:

  df6bd6c002a4 ("mtd: spi-nor: stm32: remove the driver as it was replaced by 
spi-stm32-qspi.c")

from the spi-nor tree.

I fixed it up (I removed the file) and can carry the fix as necessary.
This is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell



[PATCH] netfilter: Fix remainder of pseudo-header protocol 0

2019-06-23 Thread zhe.he

From: He Zhe 

Since v5.1-rc1, some types of packets do not get an unreachable reply with the
following iptables setting. For example,

$ iptables -A INPUT -p icmp --icmp-type 8 -j REJECT
$ ping 127.0.0.1 -c 1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
--- 127.0.0.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

We should have got the following reply on the command line, but we did not.
From 127.0.0.1 icmp_seq=1 Destination Port Unreachable

Yi Zhao reported it and narrowed it down to:
7fc38225363d ("netfilter: reject: skip csum verification for protocols that
don't support it").

This is because nf_ip_checksum still expects pseudo-header protocol type 0 for
packets that are neither TCP nor UDP, and thus ICMP packets are mistakenly
treated as TCP/UDP.

This patch corrects the conditions in nf_ip_checksum and all other places that
still call it with protocol 0.

Fixes: 7fc38225363d ("netfilter: reject: skip csum verification for protocols 
that don't support it")
Reported-by: Yi Zhao 
Signed-off-by: He Zhe 
---
 net/netfilter/nf_conntrack_proto_icmp.c | 2 +-
 net/netfilter/nf_nat_proto.c| 2 +-
 net/netfilter/utils.c   | 5 +++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto_icmp.c 
b/net/netfilter/nf_conntrack_proto_icmp.c
index a824367..dd53e2b 100644
--- a/net/netfilter/nf_conntrack_proto_icmp.c
+++ b/net/netfilter/nf_conntrack_proto_icmp.c
@@ -218,7 +218,7 @@ int nf_conntrack_icmpv4_error(struct nf_conn *tmpl,
/* See ip_conntrack_proto_tcp.c */
if (state->net->ct.sysctl_checksum &&
state->hook == NF_INET_PRE_ROUTING &&
-   nf_ip_checksum(skb, state->hook, dataoff, 0)) {
+   nf_ip_checksum(skb, state->hook, dataoff, IPPROTO_ICMP)) {
icmp_error_log(skb, state, "bad hw icmp checksum");
return -NF_ACCEPT;
}
diff --git a/net/netfilter/nf_nat_proto.c b/net/netfilter/nf_nat_proto.c
index 07da077..83a24cc 100644
--- a/net/netfilter/nf_nat_proto.c
+++ b/net/netfilter/nf_nat_proto.c
@@ -564,7 +564,7 @@ int nf_nat_icmp_reply_translation(struct sk_buff *skb,
 
if (!skb_make_writable(skb, hdrlen + sizeof(*inside)))
return 0;
-   if (nf_ip_checksum(skb, hooknum, hdrlen, 0))
+   if (nf_ip_checksum(skb, hooknum, hdrlen, IPPROTO_ICMP))
return 0;
 
inside = (void *)skb->data + hdrlen;
diff --git a/net/netfilter/utils.c b/net/netfilter/utils.c
index 06dc555..51b454d 100644
--- a/net/netfilter/utils.c
+++ b/net/netfilter/utils.c
@@ -17,7 +17,8 @@ __sum16 nf_ip_checksum(struct sk_buff *skb, unsigned int hook,
case CHECKSUM_COMPLETE:
if (hook != NF_INET_PRE_ROUTING && hook != NF_INET_LOCAL_IN)
break;
-   if ((protocol == 0 && !csum_fold(skb->csum)) ||
+   if ((protocol != IPPROTO_TCP && protocol != IPPROTO_UDP &&
+   !csum_fold(skb->csum)) ||
!csum_tcpudp_magic(iph->saddr, iph->daddr,
   skb->len - dataoff, protocol,
   skb->csum)) {
@@ -26,7 +27,7 @@ __sum16 nf_ip_checksum(struct sk_buff *skb, unsigned int hook,
}
/* fall through */
case CHECKSUM_NONE:
-   if (protocol == 0)
+   if (protocol != IPPROTO_TCP && protocol != IPPROTO_UDP)
skb->csum = 0;
else
skb->csum = csum_tcpudp_nofold(iph->saddr, iph->daddr,
-- 
2.7.4
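
The condition change in nf_ip_checksum() can be sketched as two predicates. The names are hypothetical; the protocol numbers are the IANA values for ICMP, TCP and UDP, defined locally so the example does not depend on system headers.

```c
#include <stdbool.h>

enum {
	SKETCH_IPPROTO_ICMP = 1,
	SKETCH_IPPROTO_TCP  = 6,
	SKETCH_IPPROTO_UDP  = 17,
};

/* Old test: only protocol 0 skipped the TCP/UDP pseudo-header. */
static bool sketch_skip_pseudo_old(int protocol)
{
	return protocol == 0;
}

/* Patched test: everything that is not TCP or UDP skips it. */
static bool sketch_skip_pseudo_new(int protocol)
{
	return protocol != SKETCH_IPPROTO_TCP &&
	       protocol != SKETCH_IPPROTO_UDP;
}
```

With the old predicate, a caller passing the ICMP protocol number would be checksummed with a TCP/UDP pseudo-header; the new predicate treats it like protocol 0.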

[PATCH v1 07/11] PM / devfreq: tegra30: Reset boosting if clock rate changed

2019-06-23 Thread Dmitry Osipenko

When memory activity goes up, boosting starts and the governor then ramps
the memory clock rate up. In this case consecutive events may stop if the
new "COUNT" is within the watermark range, while the old boosting value
remains. That is plainly wrong and results in unneeded "go down" events
after ramping up. As a result of this change, unnecessary interrupt activity
goes even lower.

Signed-off-by: Dmitry Osipenko 
---
 drivers/devfreq/tegra30-devfreq.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/drivers/devfreq/tegra30-devfreq.c 
b/drivers/devfreq/tegra30-devfreq.c
index fc278f2f1b62..6fb3ca125438 100644
--- a/drivers/devfreq/tegra30-devfreq.c
+++ b/drivers/devfreq/tegra30-devfreq.c
@@ -631,6 +631,24 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
tegra_actmon_stop_device(&tegra->devices[i]);
 }
 
+static void tegra_actmon_stop_boosting(struct tegra_devfreq *tegra)
+{
+   struct tegra_devfreq_device *dev = tegra->devices;
+   unsigned int i;
+   u32 dev_ctrl;
+
+   for (i = 0; i < ARRAY_SIZE(tegra->devices); i++, dev++) {
+   if (!dev->boost_freq)
+   continue;
+
+   dev_ctrl = device_readl(dev, ACTMON_DEV_CTRL);
+   dev_ctrl &= ~ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN;
+   device_writel(dev, dev_ctrl, ACTMON_DEV_CTRL);
+
+   dev->boost_freq = 0;
+   }
+}
+
 static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
u32 flags)
 {
@@ -656,6 +674,16 @@ static int tegra_devfreq_target(struct device *dev, 
unsigned long *freq,
if (err)
goto restore_min_rate;
 
+   /*
+* Hence boosting-up could be active at the moment of the rate-change
+* and in this case boosting should be reset because it doesn't relate
+* to the new state. If average won't follow shortly in a case of going
+* UP, then clock rate will drop back on next update due to the missed
+* boosting.
+*/
+   if (rate != devfreq->previous_freq)
+   tegra_actmon_stop_boosting(tegra);
+
return 0;
 
 restore_min_rate:
-- 
2.22.0
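
The reset logic added by this patch can be modelled without hardware access. In the sketch below the struct and function names are hypothetical, and a plain flag stands in for the ACTMON_DEV_CTRL_CONSECUTIVE_BELOW_WMARK_EN bit.

```c
#include <stddef.h>

struct sketch_actmon_dev {
	unsigned long boost_freq;   /* pending boost, 0 when none */
	int below_wmark_irq_en;     /* models the CONSECUTIVE_BELOW_WMARK_EN bit */
};

/* Drop any pending boost and stop waiting for "go down" events. */
static void sketch_stop_boosting(struct sketch_actmon_dev *devs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!devs[i].boost_freq)
			continue;
		devs[i].below_wmark_irq_en = 0;
		devs[i].boost_freq = 0;
	}
}

/* Called after a successful rate change; returns 1 if boosting was reset. */
static int sketch_target(struct sketch_actmon_dev *devs, size_t n,
			 unsigned long new_rate, unsigned long prev_rate)
{
	if (new_rate == prev_rate)
		return 0;
	sketch_stop_boosting(devs, n);
	return 1;
}
```

Devices that were not boosting are left untouched, which matches the early continue in the patch.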

Re: [PATCH 06/15] ARM: imx: cleanup cppcheck shifting errors

2019-06-23 Thread Shawn Guo

On Sun, Jun 23, 2019 at 10:13:04PM +0700, Phong Tran wrote:
> [arch/arm/mach-imx/iomux-mx3.h:93]: (error) Shifting signed 32-bit value
> by 31 bits is undefined behaviour
> 
> Signed-off-by: Phong Tran 

Applied, thanks.
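
For context, the class of warning quoted above usually comes from shifting the signed constant 1 into bit 31. A common way to avoid it is an unsigned constant, sketched below with a hypothetical macro name; the actual change made to iomux-mx3.h is not shown in this thread.

```c
#include <stdint.h>

/* Unsigned constant: shifting into bit 31 is well defined. */
#define SKETCH_BIT(n) (UINT32_C(1) << (n))

/* By contrast, (1 << 31) shifts a signed int into its sign bit. */
static uint32_t sketch_top_bit(void)
{
	return SKETCH_BIT(31);
}
```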

Re: [PATCH -next v2] drm/amdgpu: return 'ret' in amdgpu_pmu_init

2019-06-23 Thread maowenan




On 2019/6/22 22:00, Julia Lawall wrote:
> 
> 
> On Sat, 22 Jun 2019, maowenan wrote:
> 
>>
>>
>> On 2019/6/22 21:06, Julia Lawall wrote:
>>>
>>>
>>> On Sat, 22 Jun 2019, Mao Wenan wrote:
>>>
>>>> There is one warning:
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c: In function ‘amdgpu_pmu_init’:
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:249:6: warning: variable ‘ret’ set
>>>> but not used [-Wunused-but-set-variable]
>>>>   int ret = 0;
>>>>   ^
>>>> amdgpu_pmu_init() is called by amdgpu_device_init() in
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c,
>>>> which will use the return value. So it returns 'ret' for caller.
>>>> amdgpu_device_init()
>>>>     r = amdgpu_pmu_init(adev);
>>>>
>>>> Fixes: 9c7c85f7ea1f ("drm/amdgpu: add pmu counters")
>>>>
>>>> Signed-off-by: Mao Wenan
>>>> ---
>>>>  v1->v2: change the subject for this patch; change the indenting when it
>>>>  calls init_pmu_by_type; use the value 'ret' in
>>>>  amdgpu_pmu_init().
>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c | 6 +++---
>>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
>>>> index 0e6dba9..145e720 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
>>>> @@ -252,8 +252,8 @@ int amdgpu_pmu_init(struct amdgpu_device *adev)
>>>>    case CHIP_VEGA20:
>>>>    /* init df */
>>>>    ret = init_pmu_by_type(adev, df_v3_6_attr_groups,
>>>> - "DF", "amdgpu_df", PERF_TYPE_AMDGPU_DF,
>>>> - DF_V3_6_MAX_COUNTERS);
>>>> + "DF", "amdgpu_df", PERF_TYPE_AMDGPU_DF,
>>>> + DF_V3_6_MAX_COUNTERS);
>>>>
>>>>    /* other pmu types go here*/
>>>
>>> I don't know what is the impact of the other pmu types that are planned
>>> for the future.  Perhaps it would be better to abort the function
>>> immediately in the case of a failure.
>>

OK, v3 will be sent.

>> I guess it would be better to use new function or new switch case clause to 
>> process different PMU types
>> in future.
> 
> I don't know.  But normally when an error may occur it is checked for
> immediately, rather than just letting it go until the end of the function.
> But maybe the developer know what is planned for the future for this
> function.
> 
> julia
> 
>>
>>>
>>> julia
>>>
>>>>    break;
>>>> @@ -261,7 +261,7 @@ int amdgpu_pmu_init(struct amdgpu_device *adev)
>>>>    return 0;
>>>>    }
>>>>
>>>> -  return 0;
>>>> +  return ret;
>>>>  }
>>>>
>>>>
>>>> --
>>>> 2.7.4


>>>
>>
>>
>
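
The suggestion in this thread, checking each initializer's result immediately instead of carrying 'ret' to the end of the function, can be sketched as follows. All names are hypothetical, and the stub fails for one type only to exercise the error path.

```c
#include <errno.h>

/* Hypothetical per-type initializer; fails for type 1 to exercise the error path. */
static int sketch_init_pmu_by_type(int type)
{
	return type == 1 ? -ENOMEM : 0;
}

/* Check each initializer right away so a later success cannot mask a failure. */
static int sketch_pmu_init(const int *types, int n)
{
	int i, ret;

	for (i = 0; i < n; i++) {
		ret = sketch_init_pmu_by_type(types[i]);
		if (ret)
			return ret;
	}
	return 0;
}
```

If further PMU types are added later, this shape keeps the first error from being overwritten by a subsequent successful call.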

Re: linux-next: build failure after merge of the net-next tree

2019-06-23 Thread Stephen Rothwell

Hi all,

On Thu, 20 Jun 2019 19:13:48 +1000 Stephen Rothwell  
wrote:
>
> After merging the net-next tree, today's linux-next build (powerpc
> allyesconfig) failed like this:
> 
> drivers/net/ethernet/cadence/macb_main.c:48:16: error: field 'hw' has 
> incomplete type
>   struct clk_hw hw;
> ^~
> drivers/net/ethernet/cadence/macb_main.c:4003:21: error: variable 
> 'fu540_c000_ops' has initializer but incomplete type
>  static const struct clk_ops fu540_c000_ops = {
>  ^~~
> drivers/net/ethernet/cadence/macb_main.c:4004:3: error: 'const struct 
> clk_ops' has no member named 'recalc_rate'
>   .recalc_rate = fu540_macb_tx_recalc_rate,
>^~~
> drivers/net/ethernet/cadence/macb_main.c:4004:17: warning: excess elements in 
> struct initializer
>   .recalc_rate = fu540_macb_tx_recalc_rate,
>  ^
> drivers/net/ethernet/cadence/macb_main.c:4004:17: note: (near initialization 
> for 'fu540_c000_ops')
> drivers/net/ethernet/cadence/macb_main.c:4005:3: error: 'const struct 
> clk_ops' has no member named 'round_rate'
>   .round_rate = fu540_macb_tx_round_rate,
>^~
> drivers/net/ethernet/cadence/macb_main.c:4005:16: warning: excess elements in 
> struct initializer
>   .round_rate = fu540_macb_tx_round_rate,
> ^~~~
> drivers/net/ethernet/cadence/macb_main.c:4005:16: note: (near initialization 
> for 'fu540_c000_ops')
> drivers/net/ethernet/cadence/macb_main.c:4006:3: error: 'const struct 
> clk_ops' has no member named 'set_rate'
>   .set_rate = fu540_macb_tx_set_rate,
>^~~~
> drivers/net/ethernet/cadence/macb_main.c:4006:14: warning: excess elements in 
> struct initializer
>   .set_rate = fu540_macb_tx_set_rate,
>   ^~
> drivers/net/ethernet/cadence/macb_main.c:4006:14: note: (near initialization 
> for 'fu540_c000_ops')
> drivers/net/ethernet/cadence/macb_main.c: In function 'fu540_c000_clk_init':
> drivers/net/ethernet/cadence/macb_main.c:4013:23: error: storage size of 
> 'init' isn't known
>   struct clk_init_data init;
>^~~~
> drivers/net/ethernet/cadence/macb_main.c:4032:12: error: implicit declaration 
> of function 'clk_register'; did you mean 'sock_register'? 
> [-Werror=implicit-function-declaration]
>   *tx_clk = clk_register(NULL, >hw);
> ^~~~
> sock_register
> drivers/net/ethernet/cadence/macb_main.c:4013:23: warning: unused variable 
> 'init' [-Wunused-variable]
>   struct clk_init_data init;
>^~~~
> drivers/net/ethernet/cadence/macb_main.c: In function 'macb_probe':
> drivers/net/ethernet/cadence/macb_main.c:4366:2: error: implicit declaration 
> of function 'clk_unregister'; did you mean 'sock_unregister'? 
> [-Werror=implicit-function-declaration]
>   clk_unregister(tx_clk);
>   ^~
>   sock_unregister
> drivers/net/ethernet/cadence/macb_main.c: At top level:
> drivers/net/ethernet/cadence/macb_main.c:4003:29: error: storage size of 
> 'fu540_c000_ops' isn't known
>  static const struct clk_ops fu540_c000_ops = {
>  ^~
> 
> Caused by commit
> 
>   c218ad559020 ("macb: Add support for SiFive FU540-C000")
> 
> CONFIG_COMMON_CLK is not set for this build.
> 
> I have reverted that commit for today.

I am still reverting that commit.  Has this problem been fixed in some
subtle way?
-- 
Cheers,
Stephen Rothwell



Re: [PATCH V6 3/3] arm64/mm: Enable memory hot remove

2019-06-23 Thread Anshuman Khandual




On 06/21/2019 08:05 PM, Steve Capper wrote:
> Hi Anshuman,
> 
> On Wed, Jun 19, 2019 at 09:47:40AM +0530, Anshuman Khandual wrote:
>> The arch code for hot-remove must tear down portions of the linear map and
>> vmemmap corresponding to memory being removed. In both cases the page
>> tables mapping these regions must be freed, and when sparse vmemmap is in
>> use the memory backing the vmemmap must also be freed.
>>
>> This patch adds a new remove_pagetable() helper which can be used to tear
>> down either region, and calls it from vmemmap_free() and
>> ___remove_pgd_mapping(). The sparse_vmap argument determines whether the
>> backing memory will be freed.
>>
>> remove_pagetable() makes two distinct passes over the kernel page table.
>> In the first pass it unmaps, invalidates applicable TLB cache and frees
>> backing memory if required (vmemmap) for each mapped leaf entry. In the
>> second pass it looks for empty page table sections whose page table page
>> can be unmapped, TLB invalidated and freed.
>>
>> While freeing intermediate level page table pages bail out if any of its
>> entries are still valid. This can happen for partially filled kernel page
>> table either from a previously attempted failed memory hot add or while
>> removing an address range which does not span the entire page table page
>> range.
>>
>> The vmemmap region may share levels of table with the vmalloc region.
>> There can be conflicts between hot remove freeing page table pages with
>> a concurrent vmalloc() walking the kernel page table. This conflict can
>> not just be solved by taking the init_mm ptl because of existing locking
>> scheme in vmalloc(). Hence unlike linear mapping, skip freeing page table
>> pages while tearing down vmemmap mapping.
>>
>> While here update arch_add_memory() to handle __add_pages() failures by
>> just unmapping recently added kernel linear mapping. Now enable memory hot
>> remove on arm64 platforms by default with ARCH_ENABLE_MEMORY_HOTREMOVE.
>>
>> This implementation is overall inspired from kernel page table tear down
>> procedure on X86 architecture.
>>
>> Acked-by: David Hildenbrand 
>> Signed-off-by: Anshuman Khandual 
>> ---
> 
> FWIW:
> Acked-by: Steve Capper 

Thanks Steve.

> 
> One minor comment below though.
> 
>>  arch/arm64/Kconfig  |   3 +
>>  arch/arm64/mm/mmu.c | 290 ++--
>>  2 files changed, 284 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 6426f48..9375f26 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -270,6 +270,9 @@ config HAVE_GENERIC_GUP
>>  config ARCH_ENABLE_MEMORY_HOTPLUG
>>  def_bool y
>>  
>> +config ARCH_ENABLE_MEMORY_HOTREMOVE
>> +def_bool y
>> +
>>  config SMP
>>  def_bool y
>>  
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 93ed0df..9e80a94 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -733,6 +733,250 @@ int kern_addr_valid(unsigned long addr)
>>  
>>  return pfn_valid(pte_pfn(pte));
>>  }
>> +
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +static void free_hotplug_page_range(struct page *page, size_t size)
>> +{
>> +	WARN_ON(!page || PageReserved(page));
>> +	free_pages((unsigned long)page_address(page), get_order(size));
>> +}
> 
> We are dealing with power of 2 number of pages, it makes a lot more
> sense (to me) to replace the size parameter with order.
> 
> Also, all the callers are for known compile-time sizes, so we could just
> translate the size parameter as follows to remove any usage of get_order?
> PAGE_SIZE -> 0
> PMD_SIZE -> PMD_SHIFT - PAGE_SHIFT
> PUD_SIZE -> PUD_SHIFT - PAGE_SHIFT

Sure, this can be changed, but I remember Mark wanted this to take a size
rather than the order, which is what I had proposed initially.
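
For reference, the size-to-order mapping Steve lists can be checked with a
small user-space sketch. The shift values below assume a 4K-granule
configuration, and size_to_order() is a hypothetical stand-in for the
kernel's get_order(), not the kernel implementation itself.

```c
#include <assert.h>

/* Assumed values for a 4K-granule arm64 configuration; illustrative only. */
#define PAGE_SHIFT 12
#define PMD_SHIFT  21
#define PUD_SHIFT  30
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PMD_SIZE   (1UL << PMD_SHIFT)
#define PUD_SIZE   (1UL << PUD_SHIFT)

/*
 * Hypothetical stand-in for get_order(): the smallest order such that
 * (PAGE_SIZE << order) covers the requested size.
 */
static unsigned int size_to_order(unsigned long size)
{
	unsigned int order = 0;

	assert(size != 0);
	while ((PAGE_SIZE << order) < size)
		order++;
	return order;
}
```

With these assumed shifts, PAGE_SIZE maps to order 0, PMD_SIZE to
PMD_SHIFT - PAGE_SHIFT and PUD_SIZE to PUD_SHIFT - PAGE_SHIFT, so either
interface would describe the same allocations.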

Re: [PATCH RFC] kvm: x86: Expose AVX512_BF16 feature to guest

2019-06-23 Thread Jing Liu


Hi Paolo,

After thinking about this more, I found a simple way to satisfy all cases.
How about things like this?

@@ -507,12 +510,26 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 fu

 			 * if the host doesn't support it.
 			 */
 			entry->edx |= F(ARCH_CAPABILITIES);
+		} else if (index == 1) {
+			entry->eax &= kvm_cpuid_7_1_eax_x86_features;
+			entry->ebx = 0;
+			entry->ecx = 0;
+			entry->edx = 0;
 		} else {
+			entry->eax = 0;
 			entry->ebx = 0;
 			entry->ecx = 0;
 			entry->edx = 0;
 		}
-		entry->eax = 0;
+
+		if (index == 0 && entry->eax >= 1) {
+			entry[1].eax &= kvm_cpuid_7_1_eax_x86_features;
+			entry[1].ebx = 0;
+			entry[1].ecx = 0;
+			entry[1].edx = 0;
+			entry[1].flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+			++*nent;
+		}
 		break;
 	}


Or would you prefer that I send this as a new version later?

Thanks!
Jing
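
As a rough illustration of the masking step for subleaf 1, here is a
user-space sketch. struct cpuid_entry, cpuid_7_1_eax_features and
mask_leaf7_subleaf1() are hypothetical simplifications for this example and
are not the KVM definitions; the only architectural fact relied on is that
AVX512_BF16 is enumerated in CPUID.(EAX=7,ECX=1):EAX bit 5.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=1):EAX bit 5 enumerates AVX512_BF16. */
#define CPUID_7_1_EAX_AVX512_BF16 (1u << 5)

/* Simplified, hypothetical stand-in for struct kvm_cpuid_entry2. */
struct cpuid_entry {
	uint32_t index;
	uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical mask of subleaf-1 EAX bits that may be shown to a guest. */
static const uint32_t cpuid_7_1_eax_features = CPUID_7_1_EAX_AVX512_BF16;

/* Keep only the allowed EAX bits of subleaf 1 and clear the other registers. */
static void mask_leaf7_subleaf1(struct cpuid_entry *e)
{
	assert(e->index == 1);
	e->eax &= cpuid_7_1_eax_features;
	e->ebx = 0;
	e->ecx = 0;
	e->edx = 0;
}
```

Starting from host values, only the AVX512_BF16 bit can survive in EAX, and
EBX, ECX and EDX are always reported as zero for this subleaf.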

On 6/20/2019 11:09 PM, Liu, Jing2 wrote:

Hi Paolo,

On 6/20/2019 8:16 PM, Paolo Bonzini wrote:

On 20/06/19 13:21, Jing Liu wrote:

+    for (i = 1; i <= times; i++) {
+    if (*nent >= maxnent)
+    goto out;
+    do_cpuid_1_ent(&entry[i], function, i);
+    entry[i].eax &= F(AVX512_BF16);
+    entry[i].ebx = 0;
+    entry[i].ecx = 0;
+    entry[i].edx = 0;
+    entry[i].flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+    ++*nent;


This would be wrong for i > 1, so instead make this

if (entry->eax >= 1)



I am confused about the @index parameter. @index does not seem to be used
in any case except 0x07. Since the caller only ever passes @index=0, all
cases other than 0x07 fill in cpuid info from subleaf 0 up to the max subleaf.

What do you think @index should mean in the current function? Does it mean
we need to put cpuid info from @index up to the max subleaf into @entry[i]?
If so, the logic would be as follows:

if (index == 0) {
     // Put subleaf 0 into @entry
     // Put subleaf 1 into @entry[1]
} else if (index < entry->eax) {
     // Put subleaf 1 into @entry
} else {
     // Put all zero into @entry
}

But this would not be consistent with how the other cases behave for the
current caller. Or can we simply ignore @index for 0x07 and just return all
possible subleaf info?
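
One possible reading of the "entry->eax >= 1" suggestion, sketched in
user-space C: when starting from subleaf 0, report subleaf 0 and, if the
maximum subleaf is at least 1, subleaf 1 as well, never going beyond
subleaf 1. leaf7_subleaves() and its parameters are hypothetical names
chosen for this example, not KVM functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Writes the leaf 7 subleaf numbers to report into out[] and returns how
 * many were written. max_subleaf models CPUID.(EAX=7,ECX=0):EAX; maxnent
 * is the capacity of out[].
 */
static int leaf7_subleaves(uint32_t max_subleaf, uint32_t *out, int maxnent)
{
	int n = 0;

	assert(out != NULL);
	if (n < maxnent)
		out[n++] = 0;	/* subleaf 0 is always reported */
	if (max_subleaf >= 1 && n < maxnent)
		out[n++] = 1;	/* only subleaf 1 is known to this sketch */
	return n;
}
```

Capping at subleaf 1 means a host that reports a larger maximum subleaf
does not cause unknown subleaves to be filtered with the subleaf 1 mask.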


and define F(AVX512_BF16) as a new constant kvm_cpuid_7_1_eax_features.


Got it.


Thanks,
Jing


Paolo
