Re: Regression: commit c9712e333809 breaks xilinx_uartps

2019-09-11 Thread Michal Simek
On 11. 09. 19 19:04, Paul Thomas wrote:
> Hello,
> 
> As I was working with a recent 5.2 kernel with a Zynq Ultrascale+
> board I found that the serial console wasn't working with a message
> like:
>  xuartps: probe of ff00.serial failed with error -16
> 
> I did a git bisect and found that this came from:
>  commit: c9712e3338098359a82c3f5d198c92688fa6cd26
>  serial: uartps: Use the same dynamic major number for all ports
> 
> One difference I might have is in the device-tree, I'm using a proper
> clock config (zynqmp-clk-ccf.dtsi) using the firmware clock interface.
> This is absolutely necessary, for instance, with the Ethernet
> negotiation where the macb driver needs to change the clock rate. In
> any case I believe this pushes my case to the -EPROBE_DEFER path when
> devm_clk_get() fails the first time; this might not have been tested
> with the original submission. I'm not sure this makes everything
> completely correct, but the patch below does fix the issue for me.
> 
> thanks,
> Paul
> 
> ---
>  drivers/tty/serial/xilinx_uartps.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/tty/serial/xilinx_uartps.c
> b/drivers/tty/serial/xilinx_uartps.c
> index 9dcc4d855ddd..ece7f6caa994 100644
> --- a/drivers/tty/serial/xilinx_uartps.c
> +++ b/drivers/tty/serial/xilinx_uartps.c
> @@ -1565,6 +1565,8 @@ static int cdns_uart_probe(struct platform_device *pdev)
> 
> cdns_uart_data->pclk = devm_clk_get(&pdev->dev, "pclk");
> if (PTR_ERR(cdns_uart_data->pclk) == -EPROBE_DEFER) {
> +   /* If we end up deferring then set uartps_major back to 0 */
> +   uartps_major = 0;
> rc = PTR_ERR(cdns_uart_data->pclk);
> goto err_out_unregister_driver;
> }
> 

I expect that this can be problematic for all failures in probe.
What about this instead? Just set up the global major number once the
first instance passes.
Cleanup should likely be done in the remove function too (sketched below,
after the diff).


 diff --git a/drivers/tty/serial/xilinx_uartps.c
b/drivers/tty/serial/xilinx_uartps.c
 index f145946f659b..c1550b45d59b 100644
 --- a/drivers/tty/serial/xilinx_uartps.c
 +++ b/drivers/tty/serial/xilinx_uartps.c
 @@ -1550,7 +1550,6 @@ static int cdns_uart_probe(struct platform_device
*pdev)
 goto err_out_id;
 }

 -   uartps_major = cdns_uart_uart_driver->tty_driver->major;
 cdns_uart_data->cdns_uart_driver = cdns_uart_uart_driver;

 /*
 @@ -1680,6 +1679,7 @@ static int cdns_uart_probe(struct platform_device
*pdev)
 console_port = NULL;
  #endif

 +   uartps_major = cdns_uart_uart_driver->tty_driver->major;
 cdns_uart_data->cts_override =
of_property_read_bool(pdev->dev.of_node,

"cts-override");
 return 0;
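
A minimal sketch of that remove-path cleanup, assuming it runs when the last
port goes away (the condition and placement are hypothetical; only
uartps_major comes from the driver):

 static int cdns_uart_remove(struct platform_device *pdev)
 {
	/* ... existing teardown ... */

	/*
	 * Hypothetical: once the last registered port is gone, forget the
	 * cached major number so a later probe asks for a fresh dynamic one.
	 */
	if (last_port_removed)		/* hypothetical condition */
		uartps_major = 0;

	return 0;
 }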

Thanks,
Michal


Re: [PATCH] clk: sprd: add missing kfree

2019-09-11 Thread Chunyan Zhang
gentle ping

On Thu, 5 Sep 2019 at 18:30, Chunyan Zhang  wrote:
>
> From: Chunyan Zhang 
>
> The number of config registers probably differs between pll clocks, so we
> have to allocate the array dynamically, and should free the memory before
> returning.
>
> Fixes: 3e37b005580b ("clk: sprd: add adjustable pll support")
> Signed-off-by: Chunyan Zhang 
> Signed-off-by: Chunyan Zhang 
> ---
>  drivers/clk/sprd/pll.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/clk/sprd/pll.c b/drivers/clk/sprd/pll.c
> index 36b4402bf09e..640270f51aa5 100644
> --- a/drivers/clk/sprd/pll.c
> +++ b/drivers/clk/sprd/pll.c
> @@ -136,6 +136,7 @@ static unsigned long _sprd_pll_recalc_rate(const struct 
> sprd_pll *pll,
>  k2 + refin * nint * CLK_PLL_1M;
> }
>
> +   kfree(cfg);
> return rate;
>  }
>
> @@ -222,6 +223,7 @@ static int _sprd_pll_set_rate(const struct sprd_pll *pll,
> if (!ret)
> udelay(pll->udelay);
>
> +   kfree(cfg);
> return ret;
>  }
>
> --
> 2.20.1
>


[PATCH] dmaengine: sprd: Fix the link-list pointer register configuration issue

2019-09-11 Thread Baolin Wang
From: Zhenfang Wang 

We set the link-list pointer register to point to the next link-list
configuration's physical address, so that the DMA configuration can be
loaded from the link-list node automatically.

But the link-list node's physical address can be larger than 32 bits,
and the Spreadtrum DMA driver currently only supports 32-bit physical
addresses, which may cause an incorrect DMA configuration to be loaded
when starting the link-list transfer mode. According to the DMA
datasheet, we can use the SRC_BLK_STEP register (bit28 - bit31) to save
the high bits of the link-list node's physical address to fix this
issue.

Fixes: 4ac695464763 ("dmaengine: sprd: Support DMA link-list mode")
Signed-off-by: Zhenfang Wang 
Signed-off-by: Baolin Wang 
---
 drivers/dma/sprd-dma.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/sprd-dma.c b/drivers/dma/sprd-dma.c
index 525dc73..a4a91f2 100644
--- a/drivers/dma/sprd-dma.c
+++ b/drivers/dma/sprd-dma.c
@@ -134,6 +134,10 @@
 #define SPRD_DMA_SRC_TRSF_STEP_OFFSET  0
 #define SPRD_DMA_TRSF_STEP_MASKGENMASK(15, 0)
 
+/* SPRD DMA_SRC_BLK_STEP register definition */
+#define SPRD_DMA_LLIST_HIGH_MASK   GENMASK(31, 28)
+#define SPRD_DMA_LLIST_HIGH_SHIFT  28
+
 /* define DMA channel mode & trigger mode mask */
 #define SPRD_DMA_CHN_MODE_MASK GENMASK(7, 0)
 #define SPRD_DMA_TRG_MODE_MASK GENMASK(7, 0)
@@ -717,6 +721,7 @@ static int sprd_dma_fill_desc(struct dma_chan *chan,
u32 int_mode = flags & SPRD_DMA_INT_MASK;
int src_datawidth, dst_datawidth, src_step, dst_step;
u32 temp, fix_mode = 0, fix_en = 0;
+   phys_addr_t llist_ptr;
 
if (dir == DMA_MEM_TO_DEV) {
src_step = sprd_dma_get_step(slave_cfg->src_addr_width);
@@ -814,13 +819,16 @@ static int sprd_dma_fill_desc(struct dma_chan *chan,
 * Set the link-list pointer point to next link-list
 * configuration's physical address.
 */
-   hw->llist_ptr = schan->linklist.phy_addr + temp;
+   llist_ptr = schan->linklist.phy_addr + temp;
+   hw->llist_ptr = lower_32_bits(llist_ptr);
+   hw->src_blk_step = (upper_32_bits(llist_ptr) << SPRD_DMA_LLIST_HIGH_SHIFT) &
+  SPRD_DMA_LLIST_HIGH_MASK;
} else {
hw->llist_ptr = 0;
+   hw->src_blk_step = 0;
}
 
hw->frg_step = 0;
-   hw->src_blk_step = 0;
hw->des_blk_step = 0;
return 0;
 }
-- 
1.7.9.5



Re: [PATCH v3 3/3] powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp

2019-09-11 Thread Nathan Chancellor
On Wed, Sep 11, 2019 at 02:01:59PM -0700, Nick Desaulniers wrote:
> On Wed, Sep 11, 2019 at 11:21 AM Nathan Chancellor
>  wrote:
> >
> > r370454 gives LLVM the ability to convert certain loops into a reference
> > to bcmp as an optimization; this breaks prom_init_check.sh:
> >
> >   CALLarch/powerpc/kernel/prom_init_check.sh
> > Error: External symbol 'bcmp' referenced from prom_init.c
> > make[2]: *** [arch/powerpc/kernel/Makefile:196: prom_init_check] Error 1
> >
> > bcmp is defined in lib/string.c as a wrapper for memcmp so this could be
> > added to the whitelist. However, commit 450e7dd4001f ("powerpc/prom_init:
> > don't use string functions from lib/") copied memcmp as prom_memcmp to
> > avoid KASAN instrumentation so having bcmp be resolved to regular memcmp
> > would break that assumption. Furthermore, because the compiler is the
> > one that inserted bcmp, we cannot provide something like prom_bcmp.
> >
> > To prevent LLVM from being clever with optimizations like this, use
> > -ffreestanding to tell LLVM we are not hosted so it is not free to make
> > transformations like this.
> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/647
> > Link: 
> > https://github.com/llvm/llvm-project/commit/5c9f3cfec78f9e9ae013de9a0d092a68e3e79e002
> 
> The above link doesn't work for me (HTTP 404).  PEBKAC?
> https://github.com/llvm/llvm-project/commit/5c9f3cfec78f9e9ae013de9a0d092a68e3e79e002

Not really sure how an extra 2 got added on the end of that... Must have
screwed up in vim somehow.

Link: 
https://github.com/llvm/llvm-project/commit/5c9f3cfec78f9e9ae013de9a0d092a68e3e79e00

I can resend unless the maintainer is able to fix that up when it gets
applied.

Cheers,
Nathan
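
For reference, the kind of fixed-length, equality-only comparison loop that
LLVM r370454 can lower into a bcmp() call looks roughly like this
(illustrative sketch; the function name is made up):

 static int bytes_equal(const unsigned char *a, const unsigned char *b,
			unsigned long n)
 {
	unsigned long i;

	for (i = 0; i < n; i++)
		if (a[i] != b[i])
			return 0;	/* only equality is tested, so bcmp fits */
	return 1;
 }

With -ffreestanding the compiler may no longer assume a hosted C library is
present, so it stops emitting such library calls behind the code's back.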


[PATCH] s390: remove pointless drivers-y in drivers/s390/Makefile

2019-09-11 Thread Masahiro Yamada
This is unused.

Signed-off-by: Masahiro Yamada 
---

 drivers/s390/Makefile | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/s390/Makefile b/drivers/s390/Makefile
index a863b0462b43..cde73b6a9afb 100644
--- a/drivers/s390/Makefile
+++ b/drivers/s390/Makefile
@@ -4,6 +4,3 @@
 #
 
 obj-y += cio/ block/ char/ crypto/ net/ scsi/ virtio/
-
-drivers-y += drivers/s390/built-in.a
-
-- 
2.17.1



[PATCH V2] ovl: Fix dereferencing possible ERR_PTR()

2019-09-11 Thread Ding Xiang
If ovl_encode_real_fh() fails, no memory was allocated, so the error in
the error-valued pointer should be returned directly.

V1->V2: fix SHA1 length problem

Fixes: 9b6faee07470 ("ovl: check ERR_PTR() return value from ovl_encode_fh()")
Signed-off-by: Ding Xiang 
---
 fs/overlayfs/export.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index cb8ec1f..50ade19 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -229,7 +229,7 @@ static int ovl_d_to_fh(struct dentry *dentry, char *buf, 
int buflen)
ovl_dentry_upper(dentry), !enc_lower);
err = PTR_ERR(fh);
if (IS_ERR(fh))
-   goto fail;
+   return err;
 
err = -EOVERFLOW;
if (fh->len > buflen)
-- 
1.9.1





Re: [PATCH 5/5] hugetlbfs: Limit wait time when trying to share huge PMD

2019-09-11 Thread Davidlohr Bueso

On Wed, 11 Sep 2019, Matthew Wilcox wrote:


On Wed, Sep 11, 2019 at 08:26:52PM -0700, Mike Kravetz wrote:

All this got me wondering if we really need to take i_mmap_rwsem in write
mode here.  We are not changing the tree, only traversing it looking for
a suitable vma.

Unless I am missing something, the hugetlb code only ever takes the semaphore
in write mode; never read.  Could this have been the result of changing the
tree semaphore to read/write?  Instead of analyzing all the code, the easiest
and safest thing would have been to take all accesses in write mode.


I was wondering the same thing.  It was changed here:

commit 83cde9e8ba95d180eaefefe834958fbf7008cf39
Author: Davidlohr Bueso 
Date:   Fri Dec 12 16:54:21 2014 -0800

   mm: use new helper functions around the i_mmap_mutex

   Convert all open coded mutex_lock/unlock calls to the
   i_mmap_[lock/unlock]_write() helpers.

and a subsequent patch said:

   This conversion is straightforward.  For now, all users take the write
   lock.

There were subsequent patches which changed a few places
c8475d144abb1e62958cc5ec281d2a9e161c1946
1acf2e040721564d579297646862b8ea3dd4511b
d28eb9c861f41aa2af4cfcc5eeeddff42b13d31e
874bfcaf79e39135cd31e1cfc9265cf5222d1ec3
3dec0ba0be6a532cac949e02b853021bf6d57dad

but I don't know why this one wasn't changed.


I cannot recall why huge_pmd_share() was not changed along with the other
callers that don't modify the interval tree. Looking at the function,
I agree that this lock could be taken shared; in fact this lock is much
less involved than its anon_vma counterpart, last I checked (perhaps with
the exception of take_rmap_locks()).



(I was also wondering about caching a potentially sharable page table
in the address_space to avoid having to walk the VMA tree at all if that
one happened to be sharable).


I also think that the right solution is within the mm instead of adding
a new api to rwsem and the extra complexity/overhead to osq _just_ for this
case. We've managed to not need timeout extensions in our locking primitives
thus far, which is a good thing imo.

Thanks,
Davidlohr



Re: [PATCH V7 3/3] arm64/mm: Enable memory hot remove

2019-09-11 Thread Anshuman Khandual



On 09/10/2019 09:47 PM, Catalin Marinas wrote:
> On Tue, Sep 03, 2019 at 03:15:58PM +0530, Anshuman Khandual wrote:
>> @@ -770,6 +1022,28 @@ int __meminit vmemmap_populate(unsigned long start, 
>> unsigned long end, int node,
>>  void vmemmap_free(unsigned long start, unsigned long end,
>>  struct vmem_altmap *altmap)
>>  {
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +/*
>> + * FIXME: We should have called remove_pagetable(start, end, true).
>> + * vmemmap and vmalloc virtual range might share intermediate kernel
>> + * page table entries. Removing vmemmap range page table pages here
>> + * can potentially conflict with a concurrent vmalloc() allocation.
>> + *
>> + * This is primarily because vmalloc() does not take init_mm ptl for
>> + * the entire page table walk and it's modification. Instead it just
>> + * takes the lock while allocating and installing page table pages
>> + * via [p4d|pud|pmd|pte]_alloc(). A concurrently vanishing page table
>> + * entry via memory hot remove can cause vmalloc() kernel page table
>> + * walk pointers to be invalid on the fly which can cause corruption
>> + * or worst, a crash.
>> + *
>> + * So free_empty_tables() gets called where vmalloc and vmemmap range
>> + * do not overlap at any intermediate level kernel page table entry.
>> + */
>> +unmap_hotplug_range(start, end, true);
>> +if (!vmalloc_vmemmap_overlap)
>> +free_empty_tables(start, end);
>> +#endif
>>  }
>>  #endif  /* CONFIG_SPARSEMEM_VMEMMAP */

Hello Catalin,

> 
> I wonder whether we could simply ignore the vmemmap freeing altogether,
> just leave it around and not unmap it. This way, we could call

This would have been an option (even if we just ignore for a moment that
it might not be the cleanest possible method) if present memory hot remove
scenarios involved just system RAM of comparable sizes.

But persistent memory, which will be plugged in as ZONE_DEVICE, might ask
for a vmem_altmap based vmemmap mapping where the backing memory comes
from the persistent memory range itself, not from existing system RAM.
IIRC altmap support was originally added because the amount of persistent
memory on a system might be an order of magnitude higher than that of
regular system RAM. A normal memory hot add (without altmap) would have
caused a great deal of system RAM consumption just for the persistent
memory range's vmemmap mapping. In order to avoid such a scenario, altmap
was created to allocate the vmemmap mapping's backing memory from the
device memory range itself.

In such cases the vmemmap must be unmapped and its backing memory freed
up for the complete removal of the persistent memory which originally
requested the altmap based vmemmap backing.

Just as a reference, the upcoming series which enables altmap support on
arm64 tries to allocate the vmemmap mapping's backing memory from the
device range itself during memory hot add, and frees it up during memory
hot remove. Those methods will not be possible if memory hot remove does
not really free up the vmemmap backing storage.

https://patchwork.kernel.org/project/linux-mm/list/?series=139299

> unmap_kernel_range() for removing the linear map and we save some code.
> 
> For the linear map, I think we use just above 2MB of tables for 1GB of
> memory mapped (worst case with 4KB pages we need 512 pte pages). For
> vmemmap we'd use slightly above 2MB for a 64GB hotplugged memory. Do we

You are right, the amount of memory required for kernel page table pages
depends on the mapping page size and the size of the range to be mapped.
But as explained below there might be hot remove situations where these
ranges will remain unused forever after hot remove. There is a chance
that some of these pages (even empty) might remain unused for good.

> expect such memory to be re-plugged again in the same range? If we do,
> then I shouldn't even bother with removing the vmmemmap.
> 
> I don't fully understand the use-case for memory hotremove, so any
> additional info would be useful to make a decision here.

Sure, these are some of the scenarios I could recollect.

Physical Environment:

A. Physical DIMM replacement

Platform detects memory errors and initiates a DIMM replacement.

- Hot remove selected DIMM with errors
- Hot add a new DIMM in it's place on the same slot

In normal circumstances, the new DIMM will require the same linear
and vmemmap mapping. In such cases hot remove could just unmap the
linear mapping, leave everything else, and be done with it. Though
I am not sure whether it's a good idea to leave behind accessible
struct pages which correspond to non-present pfns.

B. Physical DIMM movement

The platform can detect errors on a DIMM slot itself and initiate a
DIMM movement into a different empty slot.

- Hot remove selected memory DIMM from defective slot
- Hot add same memory DIMM into a different available empty slot

Physical address range for the DIMM has now changed, it will require



[PATCH] KVM: x86: work around leak of uninitialized stack contents

2019-09-11 Thread Fuqian Huang
Emulation of VMPTRST can incorrectly inject a page fault
when passed an operand that points to an MMIO address.
The page fault will use uninitialized kernel stack memory
as the CR2 and error code.

The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
exit to userspace; however, it is not an easy fix, so for now just ensure
that the error code and CR2 are zero.

Signed-off-by: Fuqian Huang 
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 290c3c3efb87..7f442d710858 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5312,6 +5312,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, 
gva_t addr, void *val,
/* kvm_write_guest_virt_system can pull in tons of pages. */
vcpu->arch.l1tf_flush_l1d = true;
 
+   memset(exception, 0, sizeof(*exception));
return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
   PFERR_WRITE_MASK, exception);
 }
-- 
2.11.0



Re: [PATCH 11/13] nvdimm: Use more common logic testing styles and bare ; positions

2019-09-11 Thread Verma, Vishal L
On Wed, 2019-09-11 at 19:54 -0700, Joe Perches wrote:
> Avoid using uncommon logic testing styles to make the code a
> bit more like other kernel code.
> 
> e.g.:
>   if (foo) {
>   ;
>   } else {
>   
>   }
> 
> is typically written
> 
>   if (!foo) {
>   
>   }
> 

A lot of times the excessive inversions seem to result in a net loss of
readability - e.g.:



> diff --git a/drivers/nvdimm/region_devs.c
> b/drivers/nvdimm/region_devs.c
> index 65df07481909..6861e0997d21 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -320,9 +320,7 @@ static ssize_t set_cookie_show(struct device *dev,
>   struct nd_interleave_set *nd_set = nd_region->nd_set;
>   ssize_t rc = 0;
>  
> - if (is_memory(dev) && nd_set)
> - /* pass, should be precluded by region_visible */;

For one, the comment is lost

> - else
> + if (!(is_memory(dev) && nd_set))

And it takes a moment to resolve between things such as:

if (!(A && B))
  vs.
if (!(A) && B)

And this is especially true if 'A' and 'B' are longer function calls,
split over multiple lines, or are themselves compound 'sections'.

I'm not opposed to /all/ such transformations -- for example, the ones
where the logical inversion can be 'consumed' by toggling a comparison
operator, as you have done a few times in this patch, don't sacrifice any
readability, and perhaps even improve it.

>   return -ENXIO;
>  
>   /*


[PATCH v2 net 2/3] sctp: remove redundant assignment when call sctp_get_port_local

2019-09-11 Thread Mao Wenan
There are redundant parentheses in the if clause that calls
sctp_get_port_local() in sctp_do_bind(), and a redundant assignment to
'ret'. This patch cleans that up.

Signed-off-by: Mao Wenan 
Acked-by: Neil Horman 
---
 net/sctp/socket.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 5e1934c48709..2f810078c91d 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -399,9 +399,8 @@ static int sctp_do_bind(struct sock *sk, union sctp_addr 
*addr, int len)
 * detection.
 */
addr->v4.sin_port = htons(snum);
-   if ((ret = sctp_get_port_local(sk, addr))) {
+   if (sctp_get_port_local(sk, addr))
return -EADDRINUSE;
-   }
 
/* Refresh ephemeral port.  */
if (!bp->port)
-- 
2.20.1



[PATCH v2 net 3/3] sctp: destroy bucket if failed to bind addr

2019-09-11 Thread Mao Wenan
There is one memory leak bug report:
BUG: memory leak
unreferenced object 0x8881dc4c5ec0 (size 40):
  comm "syz-executor.0", pid 5673, jiffies 4298198457 (age 27.578s)
  hex dump (first 32 bytes):
02 00 00 00 81 88 ff ff 00 00 00 00 00 00 00 00  
f8 63 3d c1 81 88 ff ff 00 00 00 00 00 00 00 00  .c=.
  backtrace:
[<72006339>] sctp_get_port_local+0x2a1/0xa00 [sctp]
[] sctp_do_bind+0x176/0x2c0 [sctp]
[<5be274a2>] sctp_bind+0x5a/0x80 [sctp]
[] inet6_bind+0x59/0xd0 [ipv6]
[] __sys_bind+0x120/0x1f0 net/socket.c:1647
[<4513635b>] __do_sys_bind net/socket.c:1658 [inline]
[<4513635b>] __se_sys_bind net/socket.c:1656 [inline]
[<4513635b>] __x64_sys_bind+0x3e/0x50 net/socket.c:1656
[<61f2501e>] do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296
[<03d1e05e>] entry_SYSCALL_64_after_hwframe+0x49/0xbe

This happens because in sctp_do_bind(), if sctp_get_port_local()
creates the hash bucket successfully but sctp_add_bind_addr() then
fails to bind the address (e.g. returns -ENOMEM), the allocated bucket
is leaked; it needs to be destroyed.

Reported-by: Hulk Robot 
Signed-off-by: Mao Wenan 
Acked-by: Neil Horman 
---
 net/sctp/socket.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 2f810078c91d..69ec3b796197 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -412,11 +412,13 @@ static int sctp_do_bind(struct sock *sk, union sctp_addr 
*addr, int len)
ret = sctp_add_bind_addr(bp, addr, af->sockaddr_len,
 SCTP_ADDR_SRC, GFP_ATOMIC);
 
-   /* Copy back into socket for getsockname() use. */
-   if (!ret) {
-   inet_sk(sk)->inet_sport = htons(inet_sk(sk)->inet_num);
-   sp->pf->to_sk_saddr(addr, sk);
+   if (ret) {
+   sctp_put_port(sk);
+   return ret;
}
+   /* Copy back into socket for getsockname() use. */
+   inet_sk(sk)->inet_sport = htons(inet_sk(sk)->inet_num);
+   sp->pf->to_sk_saddr(addr, sk);
 
return ret;
 }
-- 
2.20.1



[PATCH v2 net 1/3] sctp: change return type of sctp_get_port_local

2019-09-11 Thread Mao Wenan
Currently sctp_get_port_local() returns a long which is either 0, 1 or
a pointer casted to a long. Neither of the callers has used the return
value since commit 62208f12451f ("net: sctp: simplify sctp_get_port").
The two callers, sctp_get_port() and sctp_do_bind(), actually assumed
that the value casted to an int was the same as the pointer casted to a
long, and they don't save the return value, they just check whether it
is zero or non-zero. So it is better to change the return type of
sctp_get_port_local() from long to int.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Mao Wenan 
---
 net/sctp/socket.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 9d1f83b10c0a..5e1934c48709 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -309,7 +309,7 @@ static int sctp_bind(struct sock *sk, struct sockaddr 
*addr, int addr_len)
return retval;
 }
 
-static long sctp_get_port_local(struct sock *, union sctp_addr *);
+static int sctp_get_port_local(struct sock *, union sctp_addr *);
 
 /* Verify this is a valid sockaddr. */
 static struct sctp_af *sctp_sockaddr_af(struct sctp_sock *opt,
@@ -7998,7 +7998,7 @@ static void sctp_unhash(struct sock *sk)
 static struct sctp_bind_bucket *sctp_bucket_create(
struct sctp_bind_hashbucket *head, struct net *, unsigned short snum);
 
-static long sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
+static int sctp_get_port_local(struct sock *sk, union sctp_addr *addr)
 {
struct sctp_sock *sp = sctp_sk(sk);
bool reuse = (sk->sk_reuse || sp->reuse);
@@ -8108,7 +8108,7 @@ static long sctp_get_port_local(struct sock *sk, union 
sctp_addr *addr)
 
if (sctp_bind_addr_conflict(&ep2->base.bind_addr,
addr, sp2, sp)) {
-   ret = (long)sk2;
+   ret = 1;
goto fail_unlock;
}
}
@@ -8180,7 +8180,7 @@ static int sctp_get_port(struct sock *sk, unsigned short 
snum)
addr.v4.sin_port = htons(snum);
 
/* Note: sk->sk_num gets filled in if ephemeral port request. */
-   return !!sctp_get_port_local(sk, &addr);
+   return sctp_get_port_local(sk, &addr);
 }
 
 /*
-- 
2.20.1



Re: Linux 5.3-rc8

2019-09-11 Thread Ahmed S. Darwish
Hi Ted,

On Wed, Sep 11, 2019 at 01:36:24PM -0400, Theodore Y. Ts'o wrote:
> On Wed, Sep 11, 2019 at 06:00:19PM +0100, Linus Torvalds wrote:
> > [0.231255] random: get_random_bytes called from
> > start_kernel+0x323/0x4f5 with crng_init=0
> >
> > and that's this code:
> >
> > add_latent_entropy();
> > add_device_randomness(command_line, strlen(command_line));
> > boot_init_stack_canary();
> >
> > in particular, it's the boot_init_stack_canary() thing that asks for a
> > random number for the canary.
> >
> > I don't actually see the 'crng init done' until much much later:
> >
> > [   21.741125] random: crng init done
>
> Yes, that's super early in the boot sequence.  IIRC the stack canary
> gets reinitialized later (or maybe it was only for the other CPUs in
> SMP mode; I don't recall the details off the top of my head).
>
> I think this one always fails, and perhaps we should have a way of
> suppressing it --- but that's correct the in-kernel interface doesn't
> block.
>
> The /dev/urandom device doesn't block either, despite security
> eggheads continually asking me to change it to block ala getrandom(2),
> but I have always pushed back because I *know* changing
> /dev/urandom to block would be asking for userspace regressions.
>
> The compromise we came up with was that since getrandom(2) is a new
> interface, we could make this have the behavior that the security
> heads wanted, which is to make blocking unconditional, since the
> theory was that *this* interface would be sane, and that userspace
> applications which used it too early were buggy, and we could make it
> *their* problem.
>

Hmm, IMHO it's almost impossible to define "too early" here... Does
it mean applications in the critical boot path? Does gnome-session =>
libICE => libbsd => getentropy() => getrandom() => generated MIT magic
cookie count as being too early? It's very hazy...

getrandom(2) basically has no guaranteed upper bound for the waiting
time. And in the report I submitted in the parent thread, the upper
bound is really "infinitely locked"...
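
For context, a minimal userspace sketch of the two behaviours (assuming
glibc's getrandom(3) wrapper from <sys/random.h>):

 #include <errno.h>
 #include <stdio.h>
 #include <sys/random.h>

 int main(void)
 {
	unsigned char buf[16];

	/* flags == 0: blocks until the kernel CRNG is initialized,
	 * with no upper bound on the waiting time. */
	if (getrandom(buf, sizeof(buf), 0) < 0)
		perror("getrandom");

	/* GRND_NONBLOCK: fails immediately with EAGAIN instead. */
	if (getrandom(buf, sizeof(buf), GRND_NONBLOCK) < 0 && errno == EAGAIN)
		fprintf(stderr, "CRNG not ready yet\n");

	return 0;
 }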

Here is a trace_printk() log of all the getrandom() calls done from
system boot:

systemd-random--179   2.510228: getrandom(512 bytes, flags = 1)
systemd-random--179   2.510239: getrandom(512 bytes, flags = 0)
polkitd-294   3.903699: getrandom(8 bytes, flags = 1)
polkitd-294   3.904191: getrandom(8 bytes, flags = 1)

  ... + 45 similar instances

gnome-session-b-327   4.400620: getrandom(16 bytes, flags = 0)

  ... boot blocks here, until
  pressing some keys

gnome-session-b-327   49.32140: getrandom(16 bytes, flags = 0)

  ... + 3 similar instances

gnome-shell-335   49.553594: getrandom(8 bytes, flags = 1)
gnome-shell-335   49.553600: getrandom(8 bytes, flags = 1)

  ... + 10 similar instances

   Xwayland-345   50.129401: getrandom(8 bytes, flags = 1)
   Xwayland-345   50.129491: getrandom(8 bytes, flags = 1)

  ... + 9 similar instances

gnome-shell-335   50.487543: getrandom(8 bytes, flags = 1)
gnome-shell-335   50.487550: getrandom(8 bytes, flags = 1)

  ... + 79 similar instances

  gsd-xsettings-390   51.431638: getrandom(8 bytes, flags = 1)
  gsd-clipboard-389   51.432693: getrandom(8 bytes, flags = 1)
  gsd-xsettings-390   51.433899: getrandom(8 bytes, flags = 1)
  gsd-smartcard-388   51.433924: getrandom(110 bytes, flags = 0)
  gsd-smartcard-388   51.433936: getrandom(256 bytes, flags = 0)

  ... + 3 similar instances

And it goes on, including processes like gsd-power-, gsd-xsettings-,
gsd-clipboard-, gsd-print-notif, gsd-clipboard-, gsd-color,
gst-keyboard-, etc.

What's the boundary of "too early" here? It's kinda undefinable..

> People have suggested adding a new getrandom flag, 
> GRND_I_KNOW_THIS_IS_INSECURE,
> or some such, which wouldn't block and would return "best efforts"
> randomness.  I haven't been super enthusiastic about such a flag
> because I *know* it would be insecure.   However, the next time a massive
> security bug shows up on the front pages of the Wall Street Journal,
> or on some web site such as https://factorable.net, it won't be the kernel's 
> fault
> since the flag will be GRND_INSECURE_BROKEN_APPLICATION, or some such.
> It doesn't really solve the problem, though.
>

At least for generating the MIT cookie, it would make some sort of
sense... Really caring about truly random-numbers while using Xorg
is almost like perfecting a hard-metal door for the paper house ;)

(Jokes aside, I understand that this cannot be the solution)

> > But this does show that
> >
> >  (a) we have the same issue in the kernel, and we don't block there
>
> Ultimately, I think the only right answer is to make it the
> bootloader's 

[PATCH v2 net 0/3] fix memory leak for sctp_do_bind

2019-09-11 Thread Mao Wenan
The first two patches do cleanup: remove a redundant assignment and
change the return type of sctp_get_port_local().
The third patch fixes a memory leak in sctp_do_bind() when it fails
to bind the address.

---
 v2: add one patch to change return type of sctp_get_port_local.
---
Mao Wenan (3):
  sctp: change return type of sctp_get_port_local
  sctp: remove redundant assignment when call sctp_get_port_local
  sctp: destroy bucket if failed to bind addr

 net/sctp/socket.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

-- 
2.20.1



Re: [PATCH 5/5] hugetlbfs: Limit wait time when trying to share huge PMD

2019-09-11 Thread Matthew Wilcox
On Wed, Sep 11, 2019 at 08:26:52PM -0700, Mike Kravetz wrote:
> All this got me wondering if we really need to take i_mmap_rwsem in write
> mode here.  We are not changing the tree, only traversing it looking for
> a suitable vma.
> 
> Unless I am missing something, the hugetlb code only ever takes the semaphore
> in write mode; never read.  Could this have been the result of changing the
> tree semaphore to read/write?  Instead of analyzing all the code, the easiest
> and safest thing would have been to take all accesses in write mode.

I was wondering the same thing.  It was changed here:

commit 83cde9e8ba95d180eaefefe834958fbf7008cf39
Author: Davidlohr Bueso 
Date:   Fri Dec 12 16:54:21 2014 -0800

mm: use new helper functions around the i_mmap_mutex

Convert all open coded mutex_lock/unlock calls to the
i_mmap_[lock/unlock]_write() helpers.

and a subsequent patch said:

This conversion is straightforward.  For now, all users take the write
lock.

There were subsequent patches which changed a few places
c8475d144abb1e62958cc5ec281d2a9e161c1946
1acf2e040721564d579297646862b8ea3dd4511b
d28eb9c861f41aa2af4cfcc5eeeddff42b13d31e
874bfcaf79e39135cd31e1cfc9265cf5222d1ec3
3dec0ba0be6a532cac949e02b853021bf6d57dad

but I don't know why this one wasn't changed.

(I was also wondering about caching a potentially sharable page table
in the address_space to avoid having to walk the VMA tree at all if that
one happened to be sharable).


[PATCH RFC] rtc: Fix the AltCentury value on AMD/Hygon platform

2019-09-11 Thread Jinke Fan
When using following operations:
date -s "21190910 19:20:00"
hwclock -w
to change the date from 2019 to 2119 for testing, it fails on Hygon
Dhyana and AMD Zen CPUs, while the same operations work fine on an
Intel i7 platform.

The MC146818 driver uses mc146818_set_time() to write the
RTC_FREQ_SELECT (RTC_REG_A) register's bit4-bit6 field, which on Intel
platforms is the divider stage reset value, setting it to 1.

On AMD/Hygon, however, RTC_REG_A (0Ah) bit4 is defined as DV0
[Reference]: DV0 = 0 selects Bank 0, DV0 = 1 selects Bank 1, and
bit5-bit6 are reserved.

If DV0 is set to 1, Bank 1 is selected, which disables access to the
AltCentury register (0x32). As UEFI passes acpi_gbl_FADT.century = 0x32
(AltCentury), the CMOS write then fails in:
CMOS_WRITE(century, acpi_gbl_FADT.century).

Set the RTC_REG_A bank select bit (DV0) to 0 on AMD/Hygon CPUs; this
enables writing the AltCentury (0x32) register and finally sets up the
century as expected.

Test results on AMD/Hygon machine show that it works as expected.

Reference:
https://www.amd.com/system/files/TechDocs/51192_Bolton_FCH_RRG.pdf
section: 3.13 Real Time Clock (RTC)

Signed-off-by: Jinke Fan 
---
 drivers/rtc/rtc-mc146818-lib.c | 9 +++--
 include/linux/mc146818rtc.h| 2 ++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/rtc/rtc-mc146818-lib.c b/drivers/rtc/rtc-mc146818-lib.c
index 2ecd8752b088..c09fe486ae67 100644
--- a/drivers/rtc/rtc-mc146818-lib.c
+++ b/drivers/rtc/rtc-mc146818-lib.c
@@ -170,9 +170,14 @@ int mc146818_set_time(struct rtc_time *time)
}
 
save_control = CMOS_READ(RTC_CONTROL);
-   CMOS_WRITE((save_control|RTC_SET), RTC_CONTROL);
+   CMOS_WRITE((save_control | RTC_SET), RTC_CONTROL);
save_freq_select = CMOS_READ(RTC_FREQ_SELECT);
-   CMOS_WRITE((save_freq_select|RTC_DIV_RESET2), RTC_FREQ_SELECT);
+
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
+   boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
+   CMOS_WRITE((save_freq_select & (~RTC_DV0)), RTC_FREQ_SELECT);
+   else
+   CMOS_WRITE((save_freq_select | RTC_DIV_RESET2), RTC_FREQ_SELECT);
 
 #ifdef CONFIG_MACH_DECSTATION
CMOS_WRITE(real_yrs, RTC_DEC_YEAR);
diff --git a/include/linux/mc146818rtc.h b/include/linux/mc146818rtc.h
index 0661af17a758..b8ba6556c371 100644
--- a/include/linux/mc146818rtc.h
+++ b/include/linux/mc146818rtc.h
@@ -86,6 +86,8 @@ struct cmos_rtc_board_info {
/* 2 values for divider stage reset, others for "testing purposes only" */
 #  define RTC_DIV_RESET1   0x60
 #  define RTC_DIV_RESET2   0x70
+   /* DV0 = 0 selects Bank 0, DV0 = 1 selects Bank 1 on AMD/Hygon platform */
+#  define RTC_DV0  0x10
   /* Periodic intr. / Square wave rate select. 0=none, 1=32.8kHz,... 15=2Hz */
 # define RTC_RATE_SELECT   0x0F
 
-- 
2.17.1



Re: [PATCH 5/5] hugetlbfs: Limit wait time when trying to share huge PMD

2019-09-11 Thread Mike Kravetz
On 9/11/19 8:05 AM, Waiman Long wrote:
> When allocating a large amount of static hugepages (~500-1500GB) on a
> system with large number of CPUs (4, 8 or even 16 sockets), performance
> degradation (random multi-second delays) was observed when thousands
> of processes are trying to fault in the data into the huge pages. The
> likelihood of the delay increases with the number of sockets and hence
> the CPUs a system has.  This only happens in the initial setup phase
> and will be gone after all the necessary data are faulted in.
> 
> These random delays, however, are deemed unacceptable. The cause of
> that delay is the long wait time in acquiring the mmap_sem when trying
> to share the huge PMDs.
> 
> To remove the unacceptable delays, we have to limit the amount of wait
> time on the mmap_sem. So the new down_write_timedlock() function is
> used to acquire the write lock on the mmap_sem with a timeout value of
> 10ms which should not cause a perceivable delay. If timeout happens,
> the task will abandon its effort to share the PMD and allocate its own
> copy instead.
> 

> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6d7296dd11b8..445af661ae29 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4750,6 +4750,8 @@ void adjust_range_if_pmd_sharing_possible(struct 
> vm_area_struct *vma,
>   }
>  }
>  
> +#define PMD_SHARE_DISABLE_THRESHOLD  (1 << 8)
> +
>  /*
>   * Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc()
>   * and returns the corresponding pte. While this is not necessary for the
> @@ -4770,11 +4772,24 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned 
> long addr, pud_t *pud)
>   pte_t *spte = NULL;
>   pte_t *pte;
>   spinlock_t *ptl;
> + static atomic_t timeout_cnt;
>  
> - if (!vma_shareable(vma, addr))
> - return (pte_t *)pmd_alloc(mm, pud, addr);
> + /*
> +  * Don't share if it is not sharable or locking attempt timed out
> +  * after 10ms. After 256 timeouts, PMD sharing will be permanently
> +  * disabled as it is just too slow.
> +  */
> + if (!vma_shareable(vma, addr) ||
> +(atomic_read(&timeout_cnt) >= PMD_SHARE_DISABLE_THRESHOLD))
> + goto out_no_share;
> +
> + if (!i_mmap_timedlock_write(mapping, ms_to_ktime(10))) {
> + if (atomic_inc_return(&timeout_cnt) ==
> + PMD_SHARE_DISABLE_THRESHOLD)
> + pr_info("Hugetlbfs PMD sharing disabled because of timeouts!\n");
> + goto out_no_share;
> + }
>  
> - i_mmap_lock_write(mapping);

All this got me wondering if we really need to take i_mmap_rwsem in write
mode here.  We are not changing the tree, only traversing it looking for
a suitable vma.

Unless I am missing something, the hugetlb code only ever takes the semaphore
in write mode; never read.  Could this have been the result of changing the
tree semaphore to read/write?  Instead of analyzing all the code, the easiest
and safest thing would have been to take all accesses in write mode.

I can investigate more, but wanted to ask the question in case someone already
knows.

At one time, I thought it was safe to acquire the semaphore in read mode for
huge_pmd_share, but write mode for huge_pmd_unshare.  See commit b43a99900559.
This was reverted along with another patch for other reasons.

If we change from write to read mode, this may have significant impact
on the stalls.
-- 
Mike Kravetz
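
For illustration, the read-mode variant being discussed would look roughly
like this inside huge_pmd_share() (an untested sketch, not a patch):

	i_mmap_lock_read(mapping);
	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
		/* ... look for a shareable pmd page, as today ... */
	}
	/* ... take the page table lock and install the spte, as today ... */
	i_mmap_unlock_read(mapping);

huge_pmd_unshare() callers would keep taking the semaphore in write mode,
as mentioned above.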


[PATCH] zswap: Add CONFIG_ZSWAP_IO_SWITCH

2019-09-11 Thread Hui Zhu
I use zswap to handle the swap IO issue in a VM that uses a swap file.
This VM has 4G memory and 2 CPUs.  And I set up 4G swap in /swapfile.
This is the test script:
cat 1.sh
./usemem --sleep 3600 -M -a -n 1 $((3 * 1024 * 1024 * 1024)) &
sleep 10
echo 1 > /proc/sys/vm/drop_caches
./usemem -S -f /test2 $((2 * 1024 * 1024 * 1024)) &
while [ True ]; do ./usemem -a -n 1 $((1 * 1024 * 1024 * 1024)); done

Without ZSWAP:
echo 100 > /proc/sys/vm/swappiness
swapon /swapfile
sh 1.sh
...
...
1207959552 bytes / 2076479 usecs = 568100 KB/s
61088 usecs to free memory
1207959552 bytes / 2035439 usecs = 579554 KB/s
55073 usecs to free memory
2415919104 bytes / 24054408 usecs = 98081 KB/s
3741 usecs to free memory
1207959552 bytes / 1954371 usecs = 603594 KB/s
53161 usecs to free memory
...
...

With ZSWAP:
echo 100 > /proc/sys/vm/swappiness
swapon /swapfile
echo lz4 > /sys/module/zswap/parameters/compressor
echo zsmalloc > /sys/module/zswap/parameters/zpool
echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled
echo 20 > /sys/module/zswap/parameters/max_pool_percent
echo 1 > /sys/module/zswap/parameters/enabled
sh 1.sh
1207959552 bytes / 3619283 usecs = 325934 KB/s
194825 usecs to free memory
1207959552 bytes / 3439563 usecs = 342964 KB/s
218419 usecs to free memory
2415919104 bytes / 19508762 usecs = 120935 KB/s
5632 usecs to free memory
1207959552 bytes / 3329369 usecs = 354315 KB/s
179764 usecs to free memory

The normal IO speed is increased from 98081 KB/s to 120935 KB/s.
But I found 2 issues with zswap on this machine:
1. Because the VM's disk is backed by the file cache in the host layer,
   plain swap can be faster than swap through zswap.
2. Because zswap needs to allocate memory to store the compressed pages,
   it makes the effective memory capacity worse.
For example:
The command "./usemem -a -n 1 $((7 * 1024 * 1024 * 1024))" requests 7G of
memory from this machine.
It works OK without zswap but hits OOM when zswap is enabled.

This commit adds CONFIG_ZSWAP_IO_SWITCH, which tries to handle these
issues and still lets zswap save IO.
It adds two parameters, read_in_flight_limit and write_in_flight_limit,
to zswap.
In zswap_frontswap_store(), pages are stored into zswap only when the
number of in-flight IOs on the swap device is bigger than
zswap_read_in_flight_limit or zswap_write_in_flight_limit (and zswap is
enabled).
In other words, zswap only kicks in when the swap device already has a
lot of IO in flight.
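
A sketch of that check (names taken from the description above; the exact
placement inside zswap_frontswap_store() and the array ordering are
assumptions):

	unsigned int inflight[2];	/* assumed order: [0] reads, [1] writes */

	swap_io_in_flight(page, inflight);
	if (inflight[0] <= zswap_read_in_flight_limit &&
	    inflight[1] <= zswap_write_in_flight_limit)
		return -1;	/* device not busy: skip zswap, swap out normally */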

This is the test result:
echo 100 > /proc/sys/vm/swappiness
swapon /swapfile
echo lz4 > /sys/module/zswap/parameters/compressor
echo zsmalloc > /sys/module/zswap/parameters/zpool
echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled
echo 20 > /sys/module/zswap/parameters/max_pool_percent
echo 1 > /sys/module/zswap/parameters/enabled
echo 3 > /sys/module/zswap/parameters/read_in_flight_limit
echo 50 > /sys/module/zswap/parameters/write_in_flight_limit
sh 1.sh
...
1207959552 bytes / 2320861 usecs = 508280 KB/s
106164 usecs to free memory
1207959552 bytes / 2343916 usecs = 503280 KB/s
79386 usecs to free memory
2415919104 bytes / 20136015 usecs = 117167 KB/s
4411 usecs to free memory
1207959552 bytes / 1833403 usecs = 643419 KB/s
70452 usecs to free memory
...
killall usemem
./usemem -a -n 1 $((7 * 1024 * 1024 * 1024))
8455716864 bytes / 14457505 usecs = 571159 KB/s
365961 usecs to free memory

Signed-off-by: Hui Zhu 
---
 include/linux/swap.h |  3 +++
 mm/Kconfig   | 11 +++
 mm/page_io.c | 16 +++
 mm/zswap.c   | 55 
 4 files changed, 85 insertions(+)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index de2c67a..82b621f 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -389,6 +389,9 @@ extern void end_swap_bio_write(struct bio *bio);
 extern int __swap_writepage(struct page *page, struct writeback_control *wbc,
bio_end_io_t end_write_func);
 extern int swap_set_page_dirty(struct page *page);
+#ifdef CONFIG_ZSWAP_IO_SWITCH
+extern void swap_io_in_flight(struct page *page, unsigned int inflight[2]);
+#endif
 
 int add_swap_extent(struct swap_info_struct *sis, unsigned long start_page,
unsigned long nr_pages, sector_t start_block);
diff --git a/mm/Kconfig b/mm/Kconfig
index 56cec63..d077e51 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -546,6 +546,17 @@ config ZSWAP
  they have not be fully explored on the large set of potential
  configurations and workloads that exist.
 
+config ZSWAP_IO_SWITCH
+   bool "Compressed cache for swap pages according to the IO status"
+   depends on ZSWAP
+   def_bool n
+   help
+ Add two parameters read_in_flight_limit and write_in_flight_limit to
+ ZSWAP.  When ZSWAP is enabled, pages will be stored to zswap only
+ when the IO in flight number of swap device is bigger than
+ zswap_read_in_flight_limit or zswap_write_in_flight_limit.
+ If unsure, say "n".
+
 config ZPOOL
tristate "Common API for compressed 

Re: [PATCH] mm/mmap.c: rb_parent is not necessary in __vma_link_list

2019-09-11 Thread Wei Yang
On Tue, Aug 13, 2019 at 11:26:56AM +0800, Wei Yang wrote:
>Currently we use rb_parent to get next, but this is not necessary.
>
>When prev is NULL, vma should become the first element in the list, so
>next should be the current first one (mm->mmap), no matter whether we
>have a parent or not.
>
>After removing it, the code shows the beauty of symmetry.
>
>Signed-off-by: Wei Yang 

Ping~

>---
> mm/internal.h | 2 +-
> mm/mmap.c | 2 +-
> mm/nommu.c| 2 +-
> mm/util.c | 8 ++--
> 4 files changed, 5 insertions(+), 9 deletions(-)
>
>diff --git a/mm/internal.h b/mm/internal.h
>index e32390802fd3..41a49574acc3 100644
>--- a/mm/internal.h
>+++ b/mm/internal.h
>@@ -290,7 +290,7 @@ static inline bool is_data_mapping(vm_flags_t flags)
> 
> /* mm/util.c */
> void __vma_link_list(struct mm_struct *mm, struct vm_area_struct *vma,
>-  struct vm_area_struct *prev, struct rb_node *rb_parent);
>+  struct vm_area_struct *prev);
> 
> #ifdef CONFIG_MMU
> extern long populate_vma_page_range(struct vm_area_struct *vma,
>diff --git a/mm/mmap.c b/mm/mmap.c
>index f7ed0afb994c..b8072630766f 100644
>--- a/mm/mmap.c
>+++ b/mm/mmap.c
>@@ -632,7 +632,7 @@ __vma_link(struct mm_struct *mm, struct vm_area_struct 
>*vma,
>   struct vm_area_struct *prev, struct rb_node **rb_link,
>   struct rb_node *rb_parent)
> {
>-  __vma_link_list(mm, vma, prev, rb_parent);
>+  __vma_link_list(mm, vma, prev);
>   __vma_link_rb(mm, vma, rb_link, rb_parent);
> }
> 
>diff --git a/mm/nommu.c b/mm/nommu.c
>index fed1b6e9c89b..12a66fbeb988 100644
>--- a/mm/nommu.c
>+++ b/mm/nommu.c
>@@ -637,7 +637,7 @@ static void add_vma_to_mm(struct mm_struct *mm, struct 
>vm_area_struct *vma)
>   if (rb_prev)
>   prev = rb_entry(rb_prev, struct vm_area_struct, vm_rb);
> 
>-  __vma_link_list(mm, vma, prev, parent);
>+  __vma_link_list(mm, vma, prev);
> }
> 
> /*
>diff --git a/mm/util.c b/mm/util.c
>index e6351a80f248..80632db29247 100644
>--- a/mm/util.c
>+++ b/mm/util.c
>@@ -264,7 +264,7 @@ void *memdup_user_nul(const void __user *src, size_t len)
> EXPORT_SYMBOL(memdup_user_nul);
> 
> void __vma_link_list(struct mm_struct *mm, struct vm_area_struct *vma,
>-  struct vm_area_struct *prev, struct rb_node *rb_parent)
>+  struct vm_area_struct *prev)
> {
>   struct vm_area_struct *next;
> 
>@@ -273,12 +273,8 @@ void __vma_link_list(struct mm_struct *mm, struct 
>vm_area_struct *vma,
>   next = prev->vm_next;
>   prev->vm_next = vma;
>   } else {
>+  next = mm->mmap;
>   mm->mmap = vma;
>-  if (rb_parent)
>-  next = rb_entry(rb_parent,
>-  struct vm_area_struct, vm_rb);
>-  else
>-  next = NULL;
>   }
>   vma->vm_next = next;
>   if (next)
>-- 
>2.17.1

-- 
Wei Yang
Help you, Help me


Re: [PATCH v3 7/7] mm/gup: Allow VM_FAULT_RETRY for multiple times

2019-09-11 Thread Peter Xu
On Wed, Sep 11, 2019 at 10:47:59AM +0100, Linus Torvalds wrote:
> On Wed, Sep 11, 2019 at 8:11 AM Peter Xu  wrote:
> >
> > This is the gup counterpart of the change that allows the
> > VM_FAULT_RETRY to happen for more than once.  One thing to mention is
> > that we must check the fatal signal here before retry because the GUP
> > can be interrupted by that, otherwise we can loop forever.
> 
> I still get nervous about the signal handling here.
> 
> I'm not entirely sure we get it right even before your patch series.
> 
> Right now, __get_user_pages() can return -ERESTARTSYS when it's killed:
> 
> /*
>  * If we have a pending SIGKILL, don't keep faulting pages and
>  * potentially allocating memory.
>  */
> if (fatal_signal_pending(current)) {
> ret = -ERESTARTSYS;
> goto out;
> }
> 
> and I don't think your series changes that.  And note how this is true
> _regardless_ of any FOLL_xyz flags (and we don't pass the
> FAULT_FLAG_xyz flags at all, they get generated deeper down if we
> actually end up faulting things in).
> 
> So this part of the patch:
> 
> +   if (fatal_signal_pending(current))
> +   goto out;
> +
> *locked = 1;
> -   lock_dropped = true;
> down_read(&mm->mmap_sem);
> ret = __get_user_pages(tsk, mm, start, 1, flags | FOLL_TRIED,
> -  pages, NULL, NULL);
> +  pages, NULL, locked);
> +   if (!*locked) {
> +   /* Continue to retry until we succeeded */
> +   BUG_ON(ret != 0);
> +   goto retry;
> 
> just makes me go "that can't be right". The fatal_signal_pending() is
> pointless and would probably better be something like
> 
> if (down_read_killable(&mm->mmap_sem) < 0)
> goto out;

Thanks for noticing all these!  I'll admit that when I was working on
the series I didn't think about & test fatal signals very carefully;
mostly I was making sure the normal signals work, especially for
processes like userfaultfd tracees, so they won't hang to death (I'm
always testing with GDB, and without proper signal handling they do
hang to death..).

I agree that we should probably replace the down_read() with the
killable version as you suggested.  Though we might still want the
explicit check of fatal_signal_pending() to make sure we react even
faster, because IMHO down_read_killable() does not really check signals
all the time; it just puts us into a killable state if we need to wait
for the mmap_sem.  In other words, if we are always lucky enough to get
the lock without waiting at all, then down_read_killable() will still
ignore the fatal signals forever.
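
Something along these lines (an illustrative sketch, not the actual patch):

	/*
	 * Check explicitly, because down_read_killable() only notices a
	 * SIGKILL while it is waiting for mmap_sem, not when the lock is
	 * acquired immediately.
	 */
	if (fatal_signal_pending(current))
		return -ERESTARTSYS;
	if (down_read_killable(&mm->mmap_sem))
		return -EINTR;
	/* ... retry __get_user_pages() with mmap_sem held ... */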

> 
> and then _after_ calling __get_user_pages(), the whole "negative error
> handling" should be more obvious.
> 
> The BUG_ON(ret != 0) makes me nervous, but it might be fine (I guess
> the fatal signal handling has always been done before the lock is
> released?).

Yes it indeed looks nervous, though it's probably should be true in
all cases.  Actually we already have checks like this, for example, in
current __get_user_pages_locked():

/* VM_FAULT_RETRY cannot return errors */
if (!*locked) {
BUG_ON(ret < 0);
BUG_ON(ret >= nr_pages);
}

And in the new retry path since we always pass in npages==1 so it must
be zero when VM_FAULT_RETRY.

While... When I'm looking into this more carefully I seem to have
found another bug that we might want to fix with hugetlbfs path:

diff --git a/mm/gup.c b/mm/gup.c
index 7230f60a68d6..29ee3de65fad 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -836,6 +836,16 @@ static long __get_user_pages(struct task_struct *tsk, 
struct mm_struct *mm,
i = follow_hugetlb_page(mm, vma, pages, vmas,
&start, &nr_pages, i,
gup_flags, locked);
+   if (locked && *locked == 0) {
+   /*
+* We've got a VM_FAULT_RETRY
+* and we've lost mmap_sem.
+* We must stop here.
+*/
+   BUG_ON(gup_flags & FOLL_NOWAIT);
+   BUG_ON(ret != 0);
+   goto out;
+   }
continue;
}
}

The problem is that if *locked==0 then we've lost the mmap_sem
already!  Then we probably can't go any further before taking it back
again.  With that, we should be able to keep the previous assumption
valid.

> 
> But exactly *because* __get_user_pages() can already 

RE: [PATCH v5 1/2] dt-bindings: mailbox: add binding doc for the ARM SMC/HVC mailbox

2019-09-11 Thread Peng Fan
> Subject: Re: [PATCH v5 1/2] dt-bindings: mailbox: add binding doc for the ARM
> SMC/HVC mailbox
> 
> On Wed, Sep 11, 2019 at 10:03 AM Andre Przywara
>  wrote:
> >
> > On Tue, 10 Sep 2019 21:44:11 -0500
> > Jassi Brar  wrote:
> >
> > Hi,
> >
> > > On Mon, Sep 9, 2019 at 10:42 AM Andre Przywara
>  wrote:
> > > >
> > > > On Wed, 28 Aug 2019 03:02:58 + Peng Fan 
> > > > wrote:
> > > >
> > [ ... ]
> > > >
> > > > > +
> > > > > +  arm,func-ids:
> > > > > +description: |
> > > > > +  An array of 32-bit values specifying the function IDs used by
> each
> > > > > +  mailbox channel. Those function IDs follow the ARM SMC
> calling
> > > > > +  convention standard [1].
> > > > > +
> > > > > +  There is one identifier per channel and the number of
> supported
> > > > > +  channels is determined by the length of this array.
> > > >
> > > > I think this makes it obvious that arm,num-chans is not needed.
> > > >
> > > > Also this somewhat contradicts the driver implementation, which allows
> the array to be shorter, marking this as UINT_MAX and later on using the first
> data item as a function identifier. This is somewhat surprising and not
> documented (unless I missed something).
> > > >
> > > > So I would suggest:
> > > > - We drop the transports property, and always put the client provided
> data in the registers, according to the SMCCC. Document this here.
> > > >   A client not needing those could always puts zeros (or garbage) in
> there, the respective firmware would just ignore the registers.
> > > > - We drop "arm,num-chans", as this is just redundant with the length of
> the func-ids array.
> > > > - We don't impose an arbitrary limit on the number of channels. From
> the firmware point of view this is just different function IDs, from Linux' 
> point
> of view just the size of the memory used. Both don't need to be limited
> artificially IMHO.
> > > >
> > > Sounds like we are in sync.
> > >
> > > > - We mark arm,func-ids as required, as this needs to be fixed, allocated
> number.
> > > >
> > > I still think func-id can be done without. A client can always pass
> > > the value as it knows what it expects.
> >
> > I don't think it's the right abstraction. The mailbox *controller* uses a
> specific func-id, this has to match the one the firmware expects. So this is a
> property of the mailbox transport channel (the SMC call), and the *client*
> should *not* care about it. It just sees the logical channel ID (if we have 
> one),
> which the controller translates into the func-ID.
> >
> arg0 is special only to the client/protocol, otherwise it is simply the first
> argument for the arm_smccc_smc *instruction* controller.
> arg[1,7] are already provided by the client, so it is only neater if
> arg0 is also taken from the client.
> 
> But as I said, I am still ok if func-id is passed from dt and arg0 from 
> client is
> ignored because we have one channel per controller design and we don't have
> to worry about number of channels there can be dedicated to specific
> functions.

Ok, so I'll make it an optional property.

> 
> > So it should really look like this (assuming only single channel 
> > controllers):
> > mailbox: smc-mailbox {
> > #mbox-cells = <0>;
> > compatible = "arm,smc-mbox";
> > method = "smc";
> >
> Do we want to do away with 'method' property and use different 'compatible'
> properties instead?
>  compatible = "arm,smc-mbox"; orcompatible = "arm,hvc-mbox";

I am ok, we just need to add data in the driver to differentiate smc/hvc.
Andre, are you ok?

Thanks,
Peng.

> 
> > arm,func-id = <0x82fe>;
> > };
> > scmi {
> > compatible = "arm,scmi";
> > mboxes = <_mbox>;
> > mbox-names = "tx"; /* rx is optional */
> > shmem = <_scp_hpri>;
> > };
> >
> > If you allow the client to provide the function ID (and I am not saying 
> > this is
> a good idea): where would this func ID come from? It would need to be a
> property of the client DT node, then. So one way would be to use the func ID
> as the Linux mailbox channel ID:
> > mailbox: smc-mailbox {
> > #mbox-cells = <1>;
> > compatible = "arm,smc-mbox";
> > method = "smc";
> > };
> > scmi {
> > compatible = "arm,scmi";
> > mboxes = <_mbox 0x82fe>;
> > mbox-names = "tx"; /* rx is optional */
> > shmem = <_scp_hpri>;
> > };
> >
> > But this doesn't look desirable.
> >
> > And as I mentioned this before: allowing some mailbox clients to provide
> the function IDs sound scary, as they could use anything they want, triggering
> random firmware actions (think PSCI_CPU_OFF).
> >
> That paranoia is unwarranted. We have to keep faith in kernel-space code
> doing the right thing.
> Either the illegitimate function request should be rejected by the firmware or
> client driver be called buggy just as we would call a block device driver
> buggy if it messed up the sector numbers in a write request.
> 
> thnx.


Re: [PATCH 1/1] MAINTAINERS: update FORCEDETH MAINTAINERS info

2019-09-11 Thread Rain River
OK. I will resend this very soon.

On Tue, Sep 10, 2019 at 11:50 PM David Miller  wrote:
>
> From: rain.1986.08...@gmail.com
> Date: Sat,  7 Sep 2019 16:14:46 +0800
>
> > From: Rain River 
> >
> > Many FORCEDETH NICs are used in our hosts. Several bugs have been fixed
> > and some features have been developed for FORCEDETH NICs, and I have
> > been reviewing patches for the FORCEDETH NIC for several months. Mark
> > me as the FORCEDETH NIC maintainer; I will send out the patches and
> > maintain the FORCEDETH NIC.
> >
> > Signed-off-by: Rain River 
>
> Please resend this with netdev properly CC:'d, thank you.


[PATCH 1/2] ARM: dts: imx7d: Correct speed grading fuse settings

2019-09-11 Thread Anson Huang
The 800MHz OPP speed grading fuse mask should be 0xd instead
of 0xf according to the fuse map definition:

SPEED_GRADING[1:0]  MHz
00  800
01  500
10  1000
11  1200
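
Assuming the usual convention where bit n of the first opp-supported-hw cell
corresponds to speed-grade fuse value n, the 800MHz OPP should be allowed
for grades 00, 10 and 11 (the 800/1000/1200MHz parts) but not 01 (the
500MHz part), i.e. bits 0, 2 and 3 set: 0b1101 = 0xd.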

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx7d.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/imx7d.dtsi b/arch/arm/boot/dts/imx7d.dtsi
index 9c8dd32..0083272 100644
--- a/arch/arm/boot/dts/imx7d.dtsi
+++ b/arch/arm/boot/dts/imx7d.dtsi
@@ -43,7 +43,7 @@
opp-hz = /bits/ 64 <79200>;
opp-microvolt = <100>;
clock-latency-ns = <15>;
-   opp-supported-hw = <0xf>, <0xf>;
+   opp-supported-hw = <0xd>, <0xf>;
};
 
opp-99600 {
-- 
2.7.4



[PATCH 2/2] ARM: dts: imx7d: Add opp-suspend property

2019-09-11 Thread Anson Huang
Add "opp-suspend" property for i.MX7D to make sure system
suspend with max available opp.

Signed-off-by: Anson Huang 
---
 arch/arm/boot/dts/imx7d.dtsi | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/boot/dts/imx7d.dtsi b/arch/arm/boot/dts/imx7d.dtsi
index 0083272..2792767 100644
--- a/arch/arm/boot/dts/imx7d.dtsi
+++ b/arch/arm/boot/dts/imx7d.dtsi
@@ -44,6 +44,7 @@
opp-microvolt = <100>;
clock-latency-ns = <15>;
opp-supported-hw = <0xd>, <0xf>;
+   opp-suspend;
};
 
opp-99600 {
@@ -51,6 +52,7 @@
opp-microvolt = <110>;
clock-latency-ns = <15>;
opp-supported-hw = <0xc>, <0xf>;
+   opp-suspend;
};
 
opp-12 {
@@ -58,6 +60,7 @@
opp-microvolt = <1225000>;
clock-latency-ns = <15>;
opp-supported-hw = <0x8>, <0xf>;
+   opp-suspend;
};
};
 
-- 
2.7.4



[PATCH 08/13] nvdimm: Use typical kernel style indentation

2019-09-11 Thread Joe Perches
Make the nvdimm code more like the rest of the kernel.

Avoid indentation of labels and spaces where tabs should be used.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/btt.c | 2 +-
 drivers/nvdimm/region_devs.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 39851edc2cc5..0df4461fe607 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1320,7 +1320,7 @@ static int btt_write_pg(struct btt *btt, struct 
bio_integrity_payload *bip,
u32 cur_len;
int e_flag;
 
-   retry:
+retry:
lane = nd_region_acquire_lane(btt->nd_region);
 
ret = lba_to_arena(btt, sector, , );
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index 16dfdbdbf1c8..65df07481909 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1044,7 +1044,7 @@ static struct nd_region *nd_region_create(struct 
nvdimm_bus *nvdimm_bus,
if (!nd_region->lane)
goto err_percpu;
 
-for (i = 0; i < nr_cpu_ids; i++) {
+   for (i = 0; i < nr_cpu_ids; i++) {
struct nd_percpu_lane *ndl;
 
ndl = per_cpu_ptr(nd_region->lane, i);
-- 
2.15.0



[PATCH 09/13] nvdimm: btt.h: Neaten #defines to improve readability

2019-09-11 Thread Joe Perches
Use tab alignment to make the macros and their values a bit more intelligible.

Use the BIT and BIT_ULL macros.
Convert MAP_LBA_MASK to use the already defined shift masks.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/btt.h | 54 ++--
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/drivers/nvdimm/btt.h b/drivers/nvdimm/btt.h
index 1da76da3e159..fb0f4546153f 100644
--- a/drivers/nvdimm/btt.h
+++ b/drivers/nvdimm/btt.h
@@ -10,34 +10,34 @@
 #include 
 #include 
 
-#define BTT_SIG_LEN 16
-#define BTT_SIG "BTT_ARENA_INFO\0"
-#define MAP_ENT_SIZE 4
-#define MAP_TRIM_SHIFT 31
-#define MAP_TRIM_MASK (1 << MAP_TRIM_SHIFT)
-#define MAP_ERR_SHIFT 30
-#define MAP_ERR_MASK (1 << MAP_ERR_SHIFT)
-#define MAP_LBA_MASK (~((1 << MAP_TRIM_SHIFT) | (1 << MAP_ERR_SHIFT)))
-#define MAP_ENT_NORMAL 0xC000
-#define LOG_GRP_SIZE sizeof(struct log_group)
-#define LOG_ENT_SIZE sizeof(struct log_entry)
-#define ARENA_MIN_SIZE (1UL << 24) /* 16 MB */
-#define ARENA_MAX_SIZE (1ULL << 39)/* 512 GB */
-#define RTT_VALID (1UL << 31)
-#define RTT_INVALID 0
-#define BTT_PG_SIZE 4096
-#define BTT_DEFAULT_NFREE ND_MAX_LANES
-#define LOG_SEQ_INIT 1
-
-#define IB_FLAG_ERROR 0x0001
-#define IB_FLAG_ERROR_MASK 0x0001
-
-#define ent_lba(ent) (ent & MAP_LBA_MASK)
-#define ent_e_flag(ent) (!!(ent & MAP_ERR_MASK))
-#define ent_z_flag(ent) (!!(ent & MAP_TRIM_MASK))
-#define set_e_flag(ent) (ent |= MAP_ERR_MASK)
+#define BTT_SIG_LEN16
+#define BTT_SIG"BTT_ARENA_INFO\0"
+#define MAP_ENT_SIZE   4
+#define MAP_TRIM_SHIFT 31
+#define MAP_TRIM_MASK  BIT(MAP_TRIM_SHIFT)
+#define MAP_ERR_SHIFT  30
+#define MAP_ERR_MASK   BIT(MAP_ERR_SHIFT)
+#define MAP_LBA_MASK   (~(MAP_TRIM_MASK | MAP_ERR_MASK))
+#define MAP_ENT_NORMAL 0xC000
+#define LOG_GRP_SIZE   sizeof(struct log_group)
+#define LOG_ENT_SIZE   sizeof(struct log_entry)
+#define ARENA_MIN_SIZE BIT(24) /* 16 MB */
+#define ARENA_MAX_SIZE BIT_ULL(39) /* 512 GB */
+#define RTT_VALID  BIT(31)
+#define RTT_INVALID0
+#define BTT_PG_SIZE4096
+#define BTT_DEFAULT_NFREE  ND_MAX_LANES
+#define LOG_SEQ_INIT   1
+
+#define IB_FLAG_ERROR  0x0001
+#define IB_FLAG_ERROR_MASK 0x0001
+
+#define ent_lba(ent)   ((ent) & MAP_LBA_MASK)
+#define ent_e_flag(ent)(!!((ent) & MAP_ERR_MASK))
+#define ent_z_flag(ent)(!!((ent) & MAP_TRIM_MASK))
+#define set_e_flag(ent)((ent) |= MAP_ERR_MASK)
 /* 'normal' is both e and z flags set */
-#define ent_normal(ent) (ent_e_flag(ent) && ent_z_flag(ent))
+#define ent_normal(ent)(ent_e_flag(ent) && ent_z_flag(ent))
 
 enum btt_init_state {
INIT_UNCHECKED = 0,
-- 
2.15.0



[PATCH 13/13] nvdimm: Miscellaneous neatening

2019-09-11 Thread Joe Perches
Random neatening, mostly trivial wrapping to 80 columns, to make the
code a bit more compatible with the usual kernel style.

Use casts to (u64) rather than (unsigned long long).

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/badrange.c   |   3 +-
 drivers/nvdimm/blk.c|  18 ---
 drivers/nvdimm/btt.c|  22 
 drivers/nvdimm/btt_devs.c   |  42 +---
 drivers/nvdimm/bus.c|  25 -
 drivers/nvdimm/claim.c  |  11 ++--
 drivers/nvdimm/core.c   |   4 +-
 drivers/nvdimm/dimm_devs.c  |  18 ---
 drivers/nvdimm/label.c  |  35 +++--
 drivers/nvdimm/label.h  |   6 ++-
 drivers/nvdimm/namespace_devs.c | 109 +++-
 drivers/nvdimm/nd-core.h|  13 ++---
 drivers/nvdimm/nd.h |  26 +-
 drivers/nvdimm/nd_virtio.c  |   3 +-
 drivers/nvdimm/pfn_devs.c   |  43 
 drivers/nvdimm/pmem.c   |  14 +++---
 drivers/nvdimm/region_devs.c|  36 +++--
 drivers/nvdimm/security.c   |  28 +--
 drivers/nvdimm/virtio_pmem.c|   4 +-
 19 files changed, 254 insertions(+), 206 deletions(-)

diff --git a/drivers/nvdimm/badrange.c b/drivers/nvdimm/badrange.c
index 681d99c59f52..4d231643c095 100644
--- a/drivers/nvdimm/badrange.c
+++ b/drivers/nvdimm/badrange.c
@@ -24,7 +24,8 @@ void badrange_init(struct badrange *badrange)
 EXPORT_SYMBOL_GPL(badrange_init);
 
 static void append_badrange_entry(struct badrange *badrange,
- struct badrange_entry *bre, u64 addr, u64 
length)
+ struct badrange_entry *bre,
+ u64 addr, u64 length)
 {
lockdep_assert_held(>lock);
bre->start = addr;
diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index db3973c7f506..fc15aa9220c8 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -29,7 +29,8 @@ static u32 nsblk_sector_size(struct nd_namespace_blk *nsblk)
 }
 
 static resource_size_t to_dev_offset(struct nd_namespace_blk *nsblk,
-resource_size_t ns_offset, unsigned int 
len)
+resource_size_t ns_offset,
+unsigned int len)
 {
int i;
 
@@ -61,7 +62,8 @@ static struct nd_blk_region *to_ndbr(struct nd_namespace_blk 
*nsblk)
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 static int nd_blk_rw_integrity(struct nd_namespace_blk *nsblk,
-  struct bio_integrity_payload *bip, u64 lba, int 
rw)
+  struct bio_integrity_payload *bip,
+  u64 lba, int rw)
 {
struct nd_blk_region *ndbr = to_ndbr(nsblk);
unsigned int len = nsblk_meta_size(nsblk);
@@ -107,7 +109,8 @@ static int nd_blk_rw_integrity(struct nd_namespace_blk 
*nsblk,
 
 #else /* CONFIG_BLK_DEV_INTEGRITY */
 static int nd_blk_rw_integrity(struct nd_namespace_blk *nsblk,
-  struct bio_integrity_payload *bip, u64 lba, int 
rw)
+  struct bio_integrity_payload *bip,
+  u64 lba, int rw)
 {
return 0;
 }
@@ -115,7 +118,8 @@ static int nd_blk_rw_integrity(struct nd_namespace_blk 
*nsblk,
 
 static int nsblk_do_bvec(struct nd_namespace_blk *nsblk,
 struct bio_integrity_payload *bip, struct page *page,
-unsigned int len, unsigned int off, int rw, sector_t 
sector)
+unsigned int len, unsigned int off, int rw,
+sector_t sector)
 {
struct nd_blk_region *ndbr = to_ndbr(nsblk);
resource_size_t dev_offset, ns_offset;
@@ -187,9 +191,9 @@ static blk_qc_t nd_blk_make_request(struct request_queue 
*q, struct bio *bio)
bvec.bv_offset, rw, iter.bi_sector);
if (err) {
dev_dbg(>common.dev,
-   "io error in %s sector %lld, len %d,\n",
-   (rw == READ) ? "READ" : "WRITE",
-   (unsigned long long)iter.bi_sector, len);
+   "io error in %s sector %lld, len %d\n",
+   rw == READ ? "READ" : "WRITE",
+   (u64)iter.bi_sector, len);
bio->bi_status = errno_to_blk_status(err);
break;
}
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 0df4461fe607..6c18d7bba6af 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -589,7 +589,8 @@ static int btt_freelist_init(struct arena_info *arena)
 * to complete the map write. So fix up the map.
 */
ret = btt_map_write(arena, le32_to_cpu(log_new.lba),
-   le32_to_cpu(log_new.new_map), 0, 0, 
0);
+ 

[PATCH 04/13] nvdimm: Use a more common kernel spacing style

2019-09-11 Thread Joe Perches
Use the more common kernel spacing styles per line.

git diff -w shows no difference.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/badrange.c   |  4 ++--
 drivers/nvdimm/blk.c|  2 +-
 drivers/nvdimm/btt.c|  4 ++--
 drivers/nvdimm/btt_devs.c   |  2 +-
 drivers/nvdimm/bus.c| 14 +++---
 drivers/nvdimm/core.c   |  2 +-
 drivers/nvdimm/label.c  | 28 ++--
 drivers/nvdimm/namespace_devs.c | 22 +++---
 drivers/nvdimm/nd-core.h|  2 +-
 drivers/nvdimm/nd.h |  4 ++--
 drivers/nvdimm/pfn_devs.c   |  6 +++---
 drivers/nvdimm/pmem.c   |  2 +-
 drivers/nvdimm/region.c |  2 +-
 drivers/nvdimm/region_devs.c|  2 +-
 drivers/nvdimm/security.c   | 18 +-
 15 files changed, 57 insertions(+), 57 deletions(-)

diff --git a/drivers/nvdimm/badrange.c b/drivers/nvdimm/badrange.c
index b997c2007b83..f2a742c6258a 100644
--- a/drivers/nvdimm/badrange.c
+++ b/drivers/nvdimm/badrange.c
@@ -165,11 +165,11 @@ EXPORT_SYMBOL_GPL(badrange_forget);
 static void set_badblock(struct badblocks *bb, sector_t s, int num)
 {
dev_dbg(bb->dev, "Found a bad range (0x%llx, 0x%llx)\n",
-   (u64) s * 512, (u64) num * 512);
+   (u64)s * 512, (u64)num * 512);
/* this isn't an error as the hardware will still throw an exception */
if (badblocks_set(bb, s, num, 1))
dev_info_once(bb->dev, "%s: failed for sector %llx\n",
- __func__, (u64) s);
+ __func__, (u64)s);
 }
 
 /**
diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index edd3e1664edc..95acb48bfaed 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -189,7 +189,7 @@ static blk_qc_t nd_blk_make_request(struct request_queue 
*q, struct bio *bio)
dev_dbg(>common.dev,
"io error in %s sector %lld, len %d,\n",
(rw == READ) ? "READ" : "WRITE",
-   (unsigned long long) iter.bi_sector, len);
+   (unsigned long long)iter.bi_sector, len);
bio->bi_status = errno_to_blk_status(err);
break;
}
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 9cad4dca6eac..28b65413abd8 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1007,7 +1007,7 @@ static int btt_arena_write_layout(struct arena_info 
*arena)
super->info2off = cpu_to_le64(arena->info2off - arena->infooff);
 
super->flags = 0;
-   sum = nd_sb_checksum((struct nd_gen_sb *) super);
+   sum = nd_sb_checksum((struct nd_gen_sb *)super);
super->checksum = cpu_to_le64(sum);
 
ret = btt_info_write(arena, super);
@@ -1469,7 +1469,7 @@ static blk_qc_t btt_make_request(struct request_queue *q, 
struct bio *bio)
"io error in %s sector %lld, len %d,\n",
(op_is_write(bio_op(bio))) ? "WRITE" :
"READ",
-   (unsigned long long) iter.bi_sector, len);
+   (unsigned long long)iter.bi_sector, len);
bio->bi_status = errno_to_blk_status(err);
break;
}
diff --git a/drivers/nvdimm/btt_devs.c b/drivers/nvdimm/btt_devs.c
index 9c4cbda834be..f6429842f1b6 100644
--- a/drivers/nvdimm/btt_devs.c
+++ b/drivers/nvdimm/btt_devs.c
@@ -256,7 +256,7 @@ bool nd_btt_arena_is_valid(struct nd_btt *nd_btt, struct 
btt_sb *super)
 
checksum = le64_to_cpu(super->checksum);
super->checksum = 0;
-   if (checksum != nd_sb_checksum((struct nd_gen_sb *) super))
+   if (checksum != nd_sb_checksum((struct nd_gen_sb *)super))
return false;
super->checksum = cpu_to_le64(checksum);
 
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 6d4d4c72ac92..35591f492d27 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -876,7 +876,7 @@ u32 nd_cmd_out_size(struct nvdimm *nvdimm, int cmd,
return remainder;
return out_field[1] - 8;
} else if (cmd == ND_CMD_CALL) {
-   struct nd_cmd_pkg *pkg = (struct nd_cmd_pkg *) in_field;
+   struct nd_cmd_pkg *pkg = (struct nd_cmd_pkg *)in_field;
 
return pkg->nd_size_out;
}
@@ -984,7 +984,7 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct 
nvdimm *nvdimm,
const struct nd_cmd_desc *desc = NULL;
unsigned int cmd = _IOC_NR(ioctl_cmd);
struct device *dev = _bus->dev;
-   void __user *p = (void __user *) arg;
+   void __user *p = (void __user *)arg;
char *out_env = NULL, *in_env = NULL;
const char *cmd_name, *dimm_name;
u32 in_len = 0, out_len = 0;
@@ -1073,7 +1073,7 @@ 

[PATCH 05/13] nvdimm: Use "unsigned int" in preference to "unsigned"

2019-09-11 Thread Joe Perches
Use the more common kernel type.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/label.c | 2 +-
 drivers/nvdimm/nd.h| 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index 2c780c5352dc..5700d9b35b8f 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -34,7 +34,7 @@ static u32 best_seq(u32 a, u32 b)
return a;
 }
 
-unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd)
+unsigned int sizeof_namespace_label(struct nvdimm_drvdata *ndd)
 {
return ndd->nslabel_size;
 }
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index c10a4b94d44a..1636061b1f93 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -81,7 +81,7 @@ static inline struct nd_namespace_index 
*to_next_namespace_index(
return to_namespace_index(ndd, ndd->ns_next);
 }
 
-unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
+unsigned int sizeof_namespace_label(struct nvdimm_drvdata *ndd);
 
 #define namespace_label_has(ndd, field)\
(offsetof(struct nd_namespace_label, field) \
@@ -170,9 +170,9 @@ struct nd_blk_region {
 /*
  * Lookup next in the repeating sequence of 01, 10, and 11.
  */
-static inline unsigned nd_inc_seq(unsigned seq)
+static inline unsigned int nd_inc_seq(unsigned int seq)
 {
-   static const unsigned next[] = { 0, 2, 3, 1 };
+   static const unsigned int next[] = { 0, 2, 3, 1 };
 
return next[seq & 3];
 }
-- 
2.15.0



[PATCH 10/13] nvdimm: namespace_devs: Move assignment operators

2019-09-11 Thread Joe Perches
When a statement is split across multiple lines, kernel code keeps the
assignment operator at the end of the first line.

Move 2 unusual uses.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/namespace_devs.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 70e1d752c12c..8c75ef84bad7 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -2023,8 +2023,8 @@ static struct device *create_namespace_pmem(struct 
nd_region *nd_region,
nspm->lbasize = __le64_to_cpu(label0->lbasize);
ndd = to_ndd(nd_mapping);
if (namespace_label_has(ndd, abstraction_guid))
-   nspm->nsio.common.claim_class
-   = to_nvdimm_cclass(>abstraction_guid);
+   nspm->nsio.common.claim_class =
+   to_nvdimm_cclass(>abstraction_guid);
}
 
if (!nspm->alt_name || !nspm->uuid) {
@@ -2267,8 +2267,8 @@ static struct device *create_namespace_blk(struct 
nd_region *nd_region,
nsblk->uuid = kmemdup(nd_label->uuid, NSLABEL_UUID_LEN,
  GFP_KERNEL);
if (namespace_label_has(ndd, abstraction_guid))
-   nsblk->common.claim_class
-   = to_nvdimm_cclass(_label->abstraction_guid);
+   nsblk->common.claim_class =
+   to_nvdimm_cclass(_label->abstraction_guid);
if (!nsblk->uuid)
goto blk_err;
memcpy(name, nd_label->name, NSLABEL_NAME_LEN);
-- 
2.15.0



[PATCH 07/13] nvdimm: Use typical kernel brace styles

2019-09-11 Thread Joe Perches
Make the nvdimm code more like the rest of the kernel code to
improve readability.

Add balanced braces to multiple test blocks.
Remove else statements from blocks where the block above uses return.

e.g.:
if (foo) {
[code...];
return FOO;
} else if (bar) {
[code...];
return BAR;
} else
return BAZ;

becomes
if (foo) {
[code...];
return FOO;
}
if (bar) {
[code...];
return BAR;
}
return BAZ;

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/badrange.c   |  3 +-
 drivers/nvdimm/blk.c|  9 +++--
 drivers/nvdimm/btt.c|  5 +--
 drivers/nvdimm/btt_devs.c   |  4 +--
 drivers/nvdimm/bus.c| 10 +++---
 drivers/nvdimm/claim.c  |  7 ++--
 drivers/nvdimm/dax_devs.c   |  3 +-
 drivers/nvdimm/dimm_devs.c  | 13 ---
 drivers/nvdimm/label.c  | 13 +++
 drivers/nvdimm/namespace_devs.c | 78 ++---
 drivers/nvdimm/pfn_devs.c   | 24 +++--
 drivers/nvdimm/pmem.c   |  8 ++---
 drivers/nvdimm/region_devs.c| 10 +++---
 drivers/nvdimm/security.c   |  8 +++--
 14 files changed, 118 insertions(+), 77 deletions(-)

diff --git a/drivers/nvdimm/badrange.c b/drivers/nvdimm/badrange.c
index f2a742c6258a..681d99c59f52 100644
--- a/drivers/nvdimm/badrange.c
+++ b/drivers/nvdimm/badrange.c
@@ -206,8 +206,9 @@ static void __add_badblock_range(struct badblocks *bb, u64 
ns_offset, u64 len)
remaining -= done;
s += done;
}
-   } else
+   } else {
set_badblock(bb, start_sector, num_sectors);
+   }
 }
 
 static void badblocks_populate(struct badrange *badrange,
diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index 95acb48bfaed..db3973c7f506 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -301,13 +301,16 @@ static int nd_blk_probe(struct device *dev)
dev_set_drvdata(dev, nsblk);
 
ndns->rw_bytes = nsblk_rw_bytes;
+
if (is_nd_btt(dev))
return nvdimm_namespace_attach_btt(ndns);
-   else if (nd_btt_probe(dev, ndns) == 0) {
+
+   if (nd_btt_probe(dev, ndns) == 0) {
/* we'll come back as btt-blk */
return -ENXIO;
-   } else
-   return nsblk_attach_disk(nsblk);
+   }
+
+   return nsblk_attach_disk(nsblk);
 }
 
 static int nd_blk_remove(struct device *dev)
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 0927cbdc5cc6..39851edc2cc5 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -702,9 +702,10 @@ static int log_set_indices(struct arena_info *arena)
 * Only allow the known permutations of log/padding indices,
 * i.e. (0, 1), and (0, 2)
 */
-   if ((log_index[0] == 0) && ((log_index[1] == 1) || (log_index[1] == 2)))
+   if ((log_index[0] == 0) &&
+   ((log_index[1] == 1) || (log_index[1] == 2))) {
; /* known index possibilities */
-   else {
+   } else {
dev_err(to_dev(arena), "Found an unknown padding scheme\n");
return -ENXIO;
}
diff --git a/drivers/nvdimm/btt_devs.c b/drivers/nvdimm/btt_devs.c
index f6429842f1b6..9e0f17045e69 100644
--- a/drivers/nvdimm/btt_devs.c
+++ b/drivers/nvdimm/btt_devs.c
@@ -139,9 +139,9 @@ static ssize_t size_show(struct device *dev,
ssize_t rc;
 
nd_device_lock(dev);
-   if (dev->driver)
+   if (dev->driver) {
rc = sprintf(buf, "%llu\n", nd_btt->size);
-   else {
+   } else {
/* no size to convey if the btt instance is disabled */
rc = -ENXIO;
}
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 5ffd61c9c4b7..620f07ac306c 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -857,9 +857,9 @@ u32 nd_cmd_out_size(struct nvdimm *nvdimm, int cmd,
 
if (nvdimm && cmd == ND_CMD_GET_CONFIG_DATA && idx == 1)
return in_field[1];
-   else if (nvdimm && cmd == ND_CMD_VENDOR && idx == 2)
+   if (nvdimm && cmd == ND_CMD_VENDOR && idx == 2)
return out_field[1];
-   else if (!nvdimm && cmd == ND_CMD_ARS_STATUS && idx == 2) {
+   if (!nvdimm && cmd == ND_CMD_ARS_STATUS && idx == 2) {
/*
 * Per table 9-276 ARS Data in ACPI 6.1, out_field[1] is
 * "Size of Output Buffer in bytes, including this
@@ -876,7 +876,8 @@ u32 nd_cmd_out_size(struct nvdimm *nvdimm, int cmd,
if (out_field[1] - 4 == remainder)
return remainder;
return out_field[1] - 8;
-   } else if (cmd == ND_CMD_CALL) {
+   }
+   if (cmd == ND_CMD_CALL) {
struct nd_cmd_pkg *pkg = (struct nd_cmd_pkg *)in_field;
 

[PATCH 06/13] nvdimm: Add and remove blank lines

2019-09-11 Thread Joe Perches
Use a more common kernel style.

Remove unnecessary multiple blank lines.
Remove blank lines before and after braces.
Add blank lines after function definitions and enums.
Add blank lines around #define pr_fmt.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/btt.c| 2 --
 drivers/nvdimm/bus.c| 5 ++---
 drivers/nvdimm/dimm.c   | 1 -
 drivers/nvdimm/dimm_devs.c  | 2 ++
 drivers/nvdimm/label.c  | 1 -
 drivers/nvdimm/namespace_devs.c | 5 -
 drivers/nvdimm/nd-core.h| 4 
 drivers/nvdimm/nd.h | 6 ++
 drivers/nvdimm/nd_virtio.c  | 1 -
 drivers/nvdimm/region_devs.c| 1 +
 drivers/nvdimm/security.c   | 2 --
 11 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 28b65413abd8..0927cbdc5cc6 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1354,7 +1354,6 @@ static int btt_write_pg(struct btt *btt, struct 
bio_integrity_payload *bip,
while (arena->rtt[i] == (RTT_VALID | new_postmap))
cpu_relax();
 
-
if (new_postmap >= arena->internal_nlba) {
ret = -EIO;
goto out_lane;
@@ -1496,7 +1495,6 @@ static int btt_rw_page(struct block_device *bdev, 
sector_t sector,
return rc;
 }
 
-
 static int btt_getgeo(struct block_device *bd, struct hd_geometry *geo)
 {
/* some standard values */
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 35591f492d27..5ffd61c9c4b7 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -2,7 +2,9 @@
 /*
  * Copyright(c) 2013-2015 Intel Corporation. All rights reserved.
  */
+
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
@@ -643,7 +645,6 @@ int nvdimm_revalidate_disk(struct gendisk *disk)
set_disk_ro(disk, 1);
 
return 0;
-
 }
 EXPORT_SYMBOL(nvdimm_revalidate_disk);
 
@@ -881,7 +882,6 @@ u32 nd_cmd_out_size(struct nvdimm *nvdimm, int cmd,
return pkg->nd_size_out;
}
 
-
return UINT_MAX;
 }
 EXPORT_SYMBOL_GPL(nd_cmd_out_size);
@@ -940,7 +940,6 @@ static int nd_pmem_forget_poison_check(struct device *dev, 
void *data)
return -EBUSY;
 
return 0;
-
 }
 
 static int nd_ns_forget_poison_check(struct device *dev, void *data)
diff --git a/drivers/nvdimm/dimm.c b/drivers/nvdimm/dimm.c
index 916710ae647f..5783c6d6dbdc 100644
--- a/drivers/nvdimm/dimm.c
+++ b/drivers/nvdimm/dimm.c
@@ -62,7 +62,6 @@ static int nvdimm_probe(struct device *dev)
if (rc < 0)
dev_dbg(dev, "failed to unlock dimm: %d\n", rc);
 
-
/*
 * EACCES failures reading the namespace label-area-properties
 * are interpreted as the DIMM capacity being locked but the
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index cb5598b3c389..873df96795b0 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -2,7 +2,9 @@
 /*
  * Copyright(c) 2013-2015 Intel Corporation. All rights reserved.
  */
+
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index 5700d9b35b8f..bf58357927c4 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -972,7 +972,6 @@ static int __blk_label_update(struct nd_region *nd_region,
}
/* from here on we need to abort on error */
 
-
/* assign all resources to the namespace before writing the labels */
nsblk->res = NULL;
nsblk->num_resources = 0;
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 2bf4b6344926..600df84b4d2d 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -367,7 +367,6 @@ resource_size_t nd_namespace_blk_validate(struct 
nd_namespace_blk *nsblk)
 }
 EXPORT_SYMBOL(nd_namespace_blk_validate);
 
-
 static int nd_namespace_label_update(struct nd_region *nd_region,
 struct device *dev)
 {
@@ -543,7 +542,6 @@ static resource_size_t init_dpa_allocation(struct 
nd_label_id *label_id,
return rc ? n : 0;
 }
 
-
 /**
  * space_valid() - validate free dpa space against constraints
  * @nd_region: hosting region of the free space
@@ -2009,7 +2007,6 @@ static struct device *create_namespace_pmem(struct 
nd_region *nd_region,
if (namespace_label_has(ndd, abstraction_guid))
nspm->nsio.common.claim_class
= to_nvdimm_cclass(>abstraction_guid);
-
}
 
if (!nspm->alt_name || !nspm->uuid) {
@@ -2217,7 +2214,6 @@ static int add_namespace_resource(struct nd_region 
*nd_region,
 static struct device *create_namespace_blk(struct nd_region *nd_region,
   struct nd_namespace_label *nd_label, 
int count)
 {
-
struct nd_mapping *nd_mapping = 

[PATCH 12/13] nvdimm: namespace_devs: Change progess typo to progress

2019-09-11 Thread Joe Perches
Typing is hard.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/namespace_devs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 7a16340f9853..253f07d97b73 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -1718,7 +1718,7 @@ struct nd_namespace_common 
*nvdimm_namespace_common_probe(struct device *dev)
return ERR_PTR(-ENODEV);
 
/*
-* Flush any in-progess probes / removals in the driver
+* Flush any in-progress probes / removals in the driver
 * for the raw personality of this namespace.
 */
nd_device_lock(>dev);
-- 
2.15.0



[PATCH 11/13] nvdimm: Use more common logic testing styles and bare ; positions

2019-09-11 Thread Joe Perches
Avoid using uncommon logic testing styles to make the code a
bit more like other kernel code.

e.g.:
if (foo) {
;
} else {

}

is typically written

if (!foo) {

}

Also put bare semicolons before the comment, not after it.

e.g.:

if (foo) {
/* comment */;
} else if (bar) {

} else {
baz;
}

is typically written

if (foo) {
;   /* comment */
} else if (bar) {

} else {
baz;
}

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/claim.c  |  4 +---
 drivers/nvdimm/dimm_devs.c  | 11 --
 drivers/nvdimm/label.c  |  4 +---
 drivers/nvdimm/namespace_devs.c | 46 +++--
 drivers/nvdimm/region_devs.c|  4 +---
 5 files changed, 28 insertions(+), 41 deletions(-)

diff --git a/drivers/nvdimm/claim.c b/drivers/nvdimm/claim.c
index 3732925aadb8..244631f5308c 100644
--- a/drivers/nvdimm/claim.c
+++ b/drivers/nvdimm/claim.c
@@ -149,9 +149,7 @@ ssize_t nd_namespace_store(struct device *dev,
return -ENOMEM;
strim(name);
 
-   if (strncmp(name, "namespace", 9) == 0 || strcmp(name, "") == 0) {
-   /* pass */;
-   } else {
+   if (!(strncmp(name, "namespace", 9) == 0 || strcmp(name, "") == 0)) {
len = -EINVAL;
goto out;
}
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 4df85dd72682..cac62bb726bb 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -593,13 +593,10 @@ int alias_dpa_busy(struct device *dev, void *data)
 * looking to validate against PMEM aliasing collision rules
 * (i.e. BLK is allocated after all aliased PMEM).
 */
-   if (info->res) {
-   if (info->res->start >= nd_mapping->start &&
-   info->res->start < map_end)
-   /* pass */;
-   else
-   return 0;
-   }
+   if (info->res &&
+   (info->res->start < nd_mapping->start ||
+info->res->start >= map_end))
+   return 0;
 
 retry:
/*
diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index e4632dbebead..ae466c6faa90 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -1180,9 +1180,7 @@ static int init_labels(struct nd_mapping *nd_mapping, int 
num_labels)
mutex_unlock(_mapping->lock);
}
 
-   if (ndd->ns_current == -1 || ndd->ns_next == -1)
-   /* pass */;
-   else
+   if (ndd->ns_current != -1 && ndd->ns_next != -1)
return max(num_labels, old_num_labels);
 
nsindex = to_namespace_index(ndd, 0);
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 8c75ef84bad7..7a16340f9853 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -162,7 +162,7 @@ unsigned int pmem_sector_size(struct nd_namespace_common 
*ndns)
 
nspm = to_nd_namespace_pmem(>dev);
if (nspm->lbasize == 0 || nspm->lbasize == 512)
-   /* default */;
+   ;   /* default */
else if (nspm->lbasize == 4096)
return 4096;
else
@@ -387,7 +387,7 @@ static int nd_namespace_label_update(struct nd_region 
*nd_region,
resource_size_t size = resource_size(>nsio.res);
 
if (size == 0 && nspm->uuid)
-   /* delete allocation */;
+   ;   /* delete allocation */
else if (!nspm->uuid)
return 0;
 
@@ -398,7 +398,7 @@ static int nd_namespace_label_update(struct nd_region 
*nd_region,
resource_size_t size = nd_namespace_blk_size(nsblk);
 
if (size == 0 && nsblk->uuid)
-   /* delete allocation */;
+   ;   /* delete allocation */
else if (!nsblk->uuid || !nsblk->lbasize)
return 0;
 
@@ -1900,10 +1900,8 @@ static int select_pmem_id(struct nd_region *nd_region, 
u8 *pmem_id)
hw_end = hw_start + nd_mapping->size;
pmem_start = __le64_to_cpu(nd_label->dpa);
pmem_end = pmem_start + __le64_to_cpu(nd_label->rawsize);
-   if (pmem_start >= hw_start && pmem_start < hw_end &&
-   pmem_end <= hw_end && pmem_end > hw_start) {
-   /* pass */;
-   } else {
+   if (!(pmem_start >= hw_start && pmem_start < hw_end &&
+ pmem_end <= hw_end && pmem_end > hw_start)) {
dev_dbg(_region->dev, "%s invalid label for %pUb\n",

[PATCH 03/13] nvdimm: Use octal permissions

2019-09-11 Thread Joe Perches
Avoid the S_IRUGO define and use 0444 instead, to improve readability
and follow the more common kernel style.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/btt.c | 39 ++-
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 6362d96dfc16..9cad4dca6eac 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -229,27 +229,24 @@ static void arena_debugfs_init(struct arena_info *a, 
struct dentry *parent,
return;
a->debugfs_dir = d;
 
-   debugfs_create_x64("size", S_IRUGO, d, >size);
-   debugfs_create_x64("external_lba_start", S_IRUGO, d,
-  >external_lba_start);
-   debugfs_create_x32("internal_nlba", S_IRUGO, d, >internal_nlba);
-   debugfs_create_u32("internal_lbasize", S_IRUGO, d,
-  >internal_lbasize);
-   debugfs_create_x32("external_nlba", S_IRUGO, d, >external_nlba);
-   debugfs_create_u32("external_lbasize", S_IRUGO, d,
-  >external_lbasize);
-   debugfs_create_u32("nfree", S_IRUGO, d, >nfree);
-   debugfs_create_u16("version_major", S_IRUGO, d, >version_major);
-   debugfs_create_u16("version_minor", S_IRUGO, d, >version_minor);
-   debugfs_create_x64("nextoff", S_IRUGO, d, >nextoff);
-   debugfs_create_x64("infooff", S_IRUGO, d, >infooff);
-   debugfs_create_x64("dataoff", S_IRUGO, d, >dataoff);
-   debugfs_create_x64("mapoff", S_IRUGO, d, >mapoff);
-   debugfs_create_x64("logoff", S_IRUGO, d, >logoff);
-   debugfs_create_x64("info2off", S_IRUGO, d, >info2off);
-   debugfs_create_x32("flags", S_IRUGO, d, >flags);
-   debugfs_create_u32("log_index_0", S_IRUGO, d, >log_index[0]);
-   debugfs_create_u32("log_index_1", S_IRUGO, d, >log_index[1]);
+   debugfs_create_x64("size", 0444, d, >size);
+   debugfs_create_x64("external_lba_start", 0444, d, 
>external_lba_start);
+   debugfs_create_x32("internal_nlba", 0444, d, >internal_nlba);
+   debugfs_create_u32("internal_lbasize", 0444, d, >internal_lbasize);
+   debugfs_create_x32("external_nlba", 0444, d, >external_nlba);
+   debugfs_create_u32("external_lbasize", 0444, d, >external_lbasize);
+   debugfs_create_u32("nfree", 0444, d, >nfree);
+   debugfs_create_u16("version_major", 0444, d, >version_major);
+   debugfs_create_u16("version_minor", 0444, d, >version_minor);
+   debugfs_create_x64("nextoff", 0444, d, >nextoff);
+   debugfs_create_x64("infooff", 0444, d, >infooff);
+   debugfs_create_x64("dataoff", 0444, d, >dataoff);
+   debugfs_create_x64("mapoff", 0444, d, >mapoff);
+   debugfs_create_x64("logoff", 0444, d, >logoff);
+   debugfs_create_x64("info2off", 0444, d, >info2off);
+   debugfs_create_x32("flags", 0444, d, >flags);
+   debugfs_create_u32("log_index_0", 0444, d, >log_index[0]);
+   debugfs_create_u32("log_index_1", 0444, d, >log_index[1]);
 }
 
 static void btt_debugfs_init(struct btt *btt)
-- 
2.15.0



[PATCH 02/13] nvdimm: Move logical continuations to previous line

2019-09-11 Thread Joe Perches
Make the logical continuation style more like the rest of the kernel.

No change in object files.

Signed-off-by: Joe Perches 
---
 drivers/nvdimm/btt.c|  9 +
 drivers/nvdimm/bus.c|  4 ++--
 drivers/nvdimm/claim.c  |  4 ++--
 drivers/nvdimm/dimm_devs.c  | 23 ---
 drivers/nvdimm/label.c  |  8 
 drivers/nvdimm/namespace_devs.c | 40 +---
 drivers/nvdimm/pfn_devs.c   | 17 +
 drivers/nvdimm/pmem.c   |  5 +++--
 drivers/nvdimm/region.c |  6 +++---
 drivers/nvdimm/region_devs.c| 23 ---
 drivers/nvdimm/security.c   | 34 --
 11 files changed, 93 insertions(+), 80 deletions(-)

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index d3e187ac43eb..6362d96dfc16 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -603,8 +603,9 @@ static int btt_freelist_init(struct arena_info *arena)
 
 static bool ent_is_padding(struct log_entry *ent)
 {
-   return (ent->lba == 0) && (ent->old_map == 0) && (ent->new_map == 0)
-   && (ent->seq == 0);
+   return (ent->lba == 0) &&
+   (ent->old_map == 0) && (ent->new_map == 0) &&
+   (ent->seq == 0);
 }
 
 /*
@@ -1337,8 +1338,8 @@ static int btt_write_pg(struct btt *btt, struct 
bio_integrity_payload *bip,
if (btt_is_badblock(btt, arena, arena->freelist[lane].block))
arena->freelist[lane].has_err = 1;
 
-   if (mutex_is_locked(>err_lock)
-   || arena->freelist[lane].has_err) {
+   if (mutex_is_locked(>err_lock) ||
+   arena->freelist[lane].has_err) {
nd_region_release_lane(btt->nd_region, lane);
 
ret = arena_clear_freelist_error(arena, lane);
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 83b6fcbb252d..6d4d4c72ac92 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -189,8 +189,8 @@ static int nvdimm_clear_badblocks_region(struct device 
*dev, void *data)
ndr_end = nd_region->ndr_start + nd_region->ndr_size - 1;
 
/* make sure we are in the region */
-   if (ctx->phys < nd_region->ndr_start
-   || (ctx->phys + ctx->cleared) > ndr_end)
+   if (ctx->phys < nd_region->ndr_start ||
+   (ctx->phys + ctx->cleared) > ndr_end)
return 0;
 
sector = (ctx->phys - nd_region->ndr_start) / 512;
diff --git a/drivers/nvdimm/claim.c b/drivers/nvdimm/claim.c
index 62f3afaa7d27..ff66a3cc349c 100644
--- a/drivers/nvdimm/claim.c
+++ b/drivers/nvdimm/claim.c
@@ -274,8 +274,8 @@ static int nsio_rw_bytes(struct nd_namespace_common *ndns,
}
 
if (unlikely(is_bad_pmem(>bb, sector, sz_align))) {
-   if (IS_ALIGNED(offset, 512) && IS_ALIGNED(size, 512)
-   && !(flags & NVDIMM_IO_ATOMIC)) {
+   if (IS_ALIGNED(offset, 512) && IS_ALIGNED(size, 512) &&
+   !(flags & NVDIMM_IO_ATOMIC)) {
long cleared;
 
might_sleep();
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 52b00078939b..cb5598b3c389 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -437,10 +437,11 @@ static umode_t nvdimm_visible(struct kobject *kobj, 
struct attribute *a, int n)
 
if (a == _attr_security.attr) {
/* Are there any state mutation ops (make writable)? */
-   if (nvdimm->sec.ops->freeze || nvdimm->sec.ops->disable
-   || nvdimm->sec.ops->change_key
-   || nvdimm->sec.ops->erase
-   || nvdimm->sec.ops->overwrite)
+   if (nvdimm->sec.ops->freeze ||
+   nvdimm->sec.ops->disable ||
+   nvdimm->sec.ops->change_key ||
+   nvdimm->sec.ops->erase ||
+   nvdimm->sec.ops->overwrite)
return a->mode;
return 0444;
}
@@ -516,8 +517,9 @@ int nvdimm_security_setup_events(struct device *dev)
 {
struct nvdimm *nvdimm = to_nvdimm(dev);
 
-   if (!nvdimm->sec.flags || !nvdimm->sec.ops
-   || !nvdimm->sec.ops->overwrite)
+   if (!nvdimm->sec.flags ||
+   !nvdimm->sec.ops ||
+   !nvdimm->sec.ops->overwrite)
return 0;
nvdimm->sec.overwrite_state = sysfs_get_dirent(dev->kobj.sd, 
"security");
if (!nvdimm->sec.overwrite_state)
@@ -589,8 +591,8 @@ int alias_dpa_busy(struct device *dev, void *data)
 * (i.e. BLK is allocated after all aliased PMEM).
 */
if (info->res) {
-   if (info->res->start >= nd_mapping->start
-   && info->res->start < map_end)
+   if (info->res->start >= nd_mapping->start &&
+   info->res->start < map_end)
 

Re: [PATCH] mm/memblock: fix typo in memblock doc

2019-09-11 Thread Cao jin
On 9/11/19 10:42 PM, Mike Rapoport wrote:
> On Wed, Sep 11, 2019 at 11:08:56AM +0800, Cao jin wrote:
>> elaboarte -> elaborate
>> architecure -> architecture
>> compltes -> completes
>>
>> Signed-off-by: Cao jin 
>> ---
>>  mm/memblock.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index 7d4f61ae666a..0d0f92003d18 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -83,16 +83,16 @@
>>   * Note, that both API variants use implict assumptions about allowed
>>   * memory ranges and the fallback methods. Consult the documentation
>>   * of :c:func:`memblock_alloc_internal` and
>> - * :c:func:`memblock_alloc_range_nid` functions for more elaboarte
>> + * :c:func:`memblock_alloc_range_nid` functions for more elaborate
> 
> While on it, could you please replace the
> :c:func:`memblock_alloc_range_nid` construct with
> memblock_alloc_range_nid()?
> 
> And that would be really great to see all the :c:func:`foo` changed to
> foo().
> 

Sure. BTW, do you want to convert all the markups too?

:c:type:`foo` -> struct foo
%FOO -> FOO
``foo`` -> foo
**foo** -> foo

-- 
Sincerely,
Cao jin




[PATCH 00/13] nvdimm: Use more common kernel coding style

2019-09-11 Thread Joe Perches
Rather than have a local coding style, use the typical kernel style.

Joe Perches (13):
  nvdimm: Use more typical whitespace
  nvdimm: Move logical continuations to previous line
  nvdimm: Use octal permissions
  nvdimm: Use a more common kernel spacing style
  nvdimm: Use "unsigned int" in preference to "unsigned"
  nvdimm: Add and remove blank lines
  nvdimm: Use typical kernel brace styles
  nvdimm: Use typical kernel style indentation
  nvdimm: btt.h: Neaten #defines to improve readability
  nvdimm: namespace_devs: Move assignment operators
  nvdimm: Use more common logic testing styles and bare ; positions
  nvdimm: namespace_devs: Change progess typo to progress
  nvdimm: Miscellaneous neatening

 drivers/nvdimm/badrange.c   |  22 +-
 drivers/nvdimm/blk.c|  39 ++--
 drivers/nvdimm/btt.c| 249 +++--
 drivers/nvdimm/btt.h|  56 ++---
 drivers/nvdimm/btt_devs.c   |  68 +++---
 drivers/nvdimm/bus.c| 138 ++--
 drivers/nvdimm/claim.c  |  50 ++---
 drivers/nvdimm/core.c   |  42 ++--
 drivers/nvdimm/dax_devs.c   |   3 +-
 drivers/nvdimm/dimm.c   |   3 +-
 drivers/nvdimm/dimm_devs.c  | 107 -
 drivers/nvdimm/e820.c   |   2 +-
 drivers/nvdimm/label.c  | 213 +-
 drivers/nvdimm/label.h  |   6 +-
 drivers/nvdimm/namespace_devs.c | 472 +---
 drivers/nvdimm/nd-core.h|  31 +--
 drivers/nvdimm/nd.h |  94 
 drivers/nvdimm/nd_virtio.c  |  20 +-
 drivers/nvdimm/of_pmem.c|   6 +-
 drivers/nvdimm/pfn_devs.c   | 136 ++--
 drivers/nvdimm/pmem.c   |  57 ++---
 drivers/nvdimm/pmem.h   |   2 +-
 drivers/nvdimm/region.c |  20 +-
 drivers/nvdimm/region_devs.c| 160 +++---
 drivers/nvdimm/security.c   | 138 ++--
 drivers/nvdimm/virtio_pmem.c|  10 +-
 26 files changed, 1115 insertions(+), 1029 deletions(-)

-- 
2.15.0



Re: [PATCH] fs/userfaultfd.c: simplify the calculation of new_flags

2019-09-11 Thread Wei Yang
Ping~

On Tue, Aug 06, 2019 at 01:38:59PM +0800, Wei Yang wrote:
>In the end, new_flags equals the old vm_flags *OR* vm_flags.
>
>It is not necessary to mask them first.
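
For reference, the bitwise identity behind the simplification, with
x = vma->vm_flags and m = vm_flags:

	(x & ~m) | m  =  (x | m) & (~m | m)	/* OR distributes over AND */
	              =  (x | m) & ~0		/* ~m | m is all bits set  */
	              =   x | m

so masking with ~vm_flags before OR-ing vm_flags back in is a no-op.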
>
>Signed-off-by: Wei Yang 
>---
> fs/userfaultfd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
>index ccbdbd62f0d8..653d8f7c453c 100644
>--- a/fs/userfaultfd.c
>+++ b/fs/userfaultfd.c
>@@ -1457,7 +1457,7 @@ static int userfaultfd_register(struct userfaultfd_ctx 
>*ctx,
>   start = vma->vm_start;
>   vma_end = min(end, vma->vm_end);
> 
>-  new_flags = (vma->vm_flags & ~vm_flags) | vm_flags;
>+  new_flags = vma->vm_flags | vm_flags;
>   prev = vma_merge(mm, prev, start, vma_end, new_flags,
>vma->anon_vma, vma->vm_file, vma->vm_pgoff,
>vma_policy(vma),
>-- 
>2.17.1

-- 
Wei Yang
Help you, Help me


Re: [PATCH] dts: arm64: imx8mq: Enable gpu passive throttling

2019-09-11 Thread Guido Günther
Hi,
On Wed, Sep 11, 2019 at 07:14:25PM -0700, Guido Günther wrote:
> Temperature and hysteresis were picked after the CPU.

I pulled that one from the wrong branch so please disregard. I've
sent out a v2.
Cheers,
 -- Guido

> 
> Signed-off-by: Guido Günther 
> ---
>  arch/arm64/boot/dts/freescale/imx8mq.dtsi | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/freescale/imx8mq.dtsi 
> b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> index 564045927485..fda636085bb3 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> @@ -235,12 +235,26 @@
>   thermal-sensors = < 1>;
>  
>   trips {
> + gpu-alert {
> + temperature = <8>;
> + hysteresis = <2000>;
> + type = "passive";
> + };
> +
>   gpu-crit {
>   temperature = <9>;
>   hysteresis = <2000>;
>   type = "critical";
>   };
>   };
> +
> + cooling-maps {
> + map0 {
> + trip = <_alert>;
> + cooling-device =
> + < THERMAL_NO_LIMIT 
> THERMAL_NO_LIMIT>;
> + };
> + };
>   };
>  
>   vpu-thermal {
> @@ -1006,6 +1020,7 @@
>< IMX8MQ_CLK_GPU_AXI>,
>< IMX8MQ_CLK_GPU_AHB>;
>   clock-names = "core", "shader", "bus", "reg";
> + #cooling-cells = <2>;
>   assigned-clocks = < IMX8MQ_CLK_GPU_CORE_SRC>,
> < IMX8MQ_CLK_GPU_SHADER_SRC>,
> < IMX8MQ_CLK_GPU_AXI>,
> -- 
> 2.23.0.rc1
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


[PATCH v2 3/4] mm: avoid slub allocation while holding list_lock

2019-09-11 Thread Yu Zhao
If we are already under list_lock, don't call kmalloc(). Otherwise we
will run into a deadlock because kmalloc() also tries to grab the same
lock.

Fix the problem by using a static bitmap instead.

  WARNING: possible recursive locking detected
  
  mount-encrypted/4921 is trying to acquire lock:
  (&(>list_lock)->rlock){-.-.}, at: ___slab_alloc+0x104/0x437

  but task is already holding lock:
  (&(>list_lock)->rlock){-.-.}, at: __kmem_cache_shutdown+0x81/0x3cb

  other info that might help us debug this:
   Possible unsafe locking scenario:

 CPU0
 
lock(&(>list_lock)->rlock);
lock(&(>list_lock)->rlock);

   *** DEADLOCK ***

Signed-off-by: Yu Zhao 
---
 mm/slub.c | 88 +--
 1 file changed, 47 insertions(+), 41 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 7b7e1ee264ef..baa60dd73942 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -443,19 +443,38 @@ static inline bool cmpxchg_double_slab(struct kmem_cache 
*s, struct page *page,
 }
 
 #ifdef CONFIG_SLUB_DEBUG
+static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
+static DEFINE_SPINLOCK(object_map_lock);
+
 /*
  * Determine a map of object in use on a page.
  *
  * Node listlock must be held to guarantee that the page does
  * not vanish from under us.
  */
-static void get_map(struct kmem_cache *s, struct page *page, unsigned long 
*map)
+static unsigned long *get_map(struct kmem_cache *s, struct page *page)
 {
void *p;
void *addr = page_address(page);
 
+   VM_BUG_ON(!irqs_disabled());
+
+   spin_lock(_map_lock);
+
+   bitmap_zero(object_map, page->objects);
+
for (p = page->freelist; p; p = get_freepointer(s, p))
-   set_bit(slab_index(p, s, addr), map);
+   set_bit(slab_index(p, s, addr), object_map);
+
+   return object_map;
+}
+
+static void put_map(unsigned long *map)
+{
+   VM_BUG_ON(map != object_map);
+   lockdep_assert_held(_map_lock);
+
+   spin_unlock(_map_lock);
 }
 
 static inline unsigned int size_from_object(struct kmem_cache *s)
@@ -3685,13 +3704,12 @@ static void list_slab_objects(struct kmem_cache *s, 
struct page *page,
 #ifdef CONFIG_SLUB_DEBUG
void *addr = page_address(page);
void *p;
-   unsigned long *map = bitmap_zalloc(page->objects, GFP_ATOMIC);
-   if (!map)
-   return;
+   unsigned long *map;
+
slab_err(s, page, text, s->name);
slab_lock(page);
 
-   get_map(s, page, map);
+   map = get_map(s, page);
for_each_object(p, s, addr, page->objects) {
 
if (!test_bit(slab_index(p, s, addr), map)) {
@@ -3699,8 +3717,9 @@ static void list_slab_objects(struct kmem_cache *s, 
struct page *page,
print_tracking(s, p);
}
}
+   put_map(map);
+
slab_unlock(page);
-   bitmap_free(map);
 #endif
 }
 
@@ -4386,19 +4405,19 @@ static int count_total(struct page *page)
 #endif
 
 #ifdef CONFIG_SLUB_DEBUG
-static void validate_slab(struct kmem_cache *s, struct page *page,
-   unsigned long *map)
+static void validate_slab(struct kmem_cache *s, struct page *page)
 {
void *p;
void *addr = page_address(page);
+   unsigned long *map;
+
+   slab_lock(page);
 
if (!check_slab(s, page) || !on_freelist(s, page, NULL))
-   return;
+   goto unlock;
 
/* Now we know that a valid freelist exists */
-   bitmap_zero(map, page->objects);
-
-   get_map(s, page, map);
+   map = get_map(s, page);
for_each_object(p, s, addr, page->objects) {
u8 val = test_bit(slab_index(p, s, addr), map) ?
 SLUB_RED_INACTIVE : SLUB_RED_ACTIVE;
@@ -4406,18 +4425,13 @@ static void validate_slab(struct kmem_cache *s, struct 
page *page,
if (!check_object(s, page, p, val))
break;
}
-}
-
-static void validate_slab_slab(struct kmem_cache *s, struct page *page,
-   unsigned long *map)
-{
-   slab_lock(page);
-   validate_slab(s, page, map);
+   put_map(map);
+unlock:
slab_unlock(page);
 }
 
 static int validate_slab_node(struct kmem_cache *s,
-   struct kmem_cache_node *n, unsigned long *map)
+   struct kmem_cache_node *n)
 {
unsigned long count = 0;
struct page *page;
@@ -4426,7 +4440,7 @@ static int validate_slab_node(struct kmem_cache *s,
spin_lock_irqsave(>list_lock, flags);
 
list_for_each_entry(page, >partial, slab_list) {
-   validate_slab_slab(s, page, map);
+   validate_slab(s, page);
count++;
}
if (count != n->nr_partial)
@@ -4437,7 +4451,7 @@ static int validate_slab_node(struct kmem_cache *s,
goto out;
 

[PATCH v2 2/4] mm: clean up validate_slab()

2019-09-11 Thread Yu Zhao
The function doesn't need to return any value, and the check can be
done in one pass.

There is a behavior change: before the patch, we stop at the first
invalid free object; after the patch, we stop at the first invalid
object, free or in use. This shouldn't matter because the original
behavior wasn't intentional anyway.

Signed-off-by: Yu Zhao 
---
 mm/slub.c | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 62053ceb4464..7b7e1ee264ef 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4386,31 +4386,26 @@ static int count_total(struct page *page)
 #endif
 
 #ifdef CONFIG_SLUB_DEBUG
-static int validate_slab(struct kmem_cache *s, struct page *page,
+static void validate_slab(struct kmem_cache *s, struct page *page,
unsigned long *map)
 {
void *p;
void *addr = page_address(page);
 
-   if (!check_slab(s, page) ||
-   !on_freelist(s, page, NULL))
-   return 0;
+   if (!check_slab(s, page) || !on_freelist(s, page, NULL))
+   return;
 
/* Now we know that a valid freelist exists */
bitmap_zero(map, page->objects);
 
get_map(s, page, map);
for_each_object(p, s, addr, page->objects) {
-   if (test_bit(slab_index(p, s, addr), map))
-   if (!check_object(s, page, p, SLUB_RED_INACTIVE))
-   return 0;
-   }
+   u8 val = test_bit(slab_index(p, s, addr), map) ?
+SLUB_RED_INACTIVE : SLUB_RED_ACTIVE;
 
-   for_each_object(p, s, addr, page->objects)
-   if (!test_bit(slab_index(p, s, addr), map))
-   if (!check_object(s, page, p, SLUB_RED_ACTIVE))
-   return 0;
-   return 1;
+   if (!check_object(s, page, p, val))
+   break;
+   }
 }
 
 static void validate_slab_slab(struct kmem_cache *s, struct page *page,
-- 
2.23.0.162.g0b9fbb3734-goog



[PATCH v2 4/4] mm: lock slub page when listing objects

2019-09-11 Thread Yu Zhao
Though I have no idea what the side effects of such a race would be,
apparently we want to prevent the free list from being changed
while debugging the objects.

Signed-off-by: Yu Zhao 
---
 mm/slub.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index baa60dd73942..1c9726c28f0b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4608,11 +4608,15 @@ static void process_slab(struct loc_track *t, struct 
kmem_cache *s,
void *p;
unsigned long *map;
 
+   slab_lock(page);
+
map = get_map(s, page);
for_each_object(p, s, addr, page->objects)
if (!test_bit(slab_index(p, s, addr), map))
add_location(t, s, get_track(s, p, alloc));
put_map(map);
+
+   slab_unlock(page);
 }
 
 static int list_locations(struct kmem_cache *s, char *buf,
-- 
2.23.0.162.g0b9fbb3734-goog



[PATCH v2 1/4] mm: correct mask size for slub page->objects

2019-09-11 Thread Yu Zhao
The mask of slub objects per page shouldn't be larger than what
page->objects can hold.

It requires more than 2^15 objects to hit the problem, and I don't
think anybody would. It'd be nice to have the mask fixed, but it's not
really worth cc'ing stable.
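
For context, a condensed sketch of the packing this mask guards, assuming
the kmem_cache_order_objects encoding already used in mm/slub.c (allocation
order in the high bits, object count in the low OO_SHIFT bits):

#define OO_SHIFT	15
#define OO_MASK		((1 << OO_SHIFT) - 1)

struct kmem_cache_order_objects {
	unsigned int x;
};

static inline struct kmem_cache_order_objects oo_make(unsigned int order,
						       unsigned int objects)
{
	struct kmem_cache_order_objects x = {
		(order << OO_SHIFT) + objects	/* order above, count below */
	};

	return x;
}

static inline unsigned int oo_objects(struct kmem_cache_order_objects x)
{
	return x.x & OO_MASK;	/* must fit the 15-bit page->objects field */
}

With OO_SHIFT = 16 the mask admits object counts up to 65535, which the
15-bit page->objects field cannot represent; 15 keeps the two in sync.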

Fixes: 50d5c41cd151 ("slub: Do not use frozen page flag but a bit in the page counters")
Signed-off-by: Yu Zhao 
---
 mm/slub.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index 8834563cdb4b..62053ceb4464 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -187,7 +187,7 @@ static inline bool kmem_cache_has_cpu_partial(struct 
kmem_cache *s)
  */
 #define DEBUG_METADATA_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER)
 
-#define OO_SHIFT   16
+#define OO_SHIFT   15
 #define OO_MASK((1 << OO_SHIFT) - 1)
 #define MAX_OBJS_PER_PAGE  32767 /* since page.objects is u15 */
 
@@ -343,6 +343,8 @@ static inline unsigned int oo_order(struct 
kmem_cache_order_objects x)
 
 static inline unsigned int oo_objects(struct kmem_cache_order_objects x)
 {
+   BUILD_BUG_ON(OO_MASK > MAX_OBJS_PER_PAGE);
+
return x.x & OO_MASK;
 }
 
-- 
2.23.0.162.g0b9fbb3734-goog



CONFIG_SHUFFLE_PAGE_ALLOCATOR=y lockdep splat (WAS Re: page_alloc.shuffle=1 + CONFIG_PROVE_LOCKING=y = arm64 hang)

2019-09-11 Thread Qian Cai
Adjusted the Cc list a bit as this looks more like scheduler territory.

> On Sep 10, 2019, at 3:49 PM, Qian Cai  wrote:
> 
> Hmm, it feels like CONFIG_SHUFFLE_PAGE_ALLOCATOR=y introduces some unique
> locking patterns that lockdep does not like, via
> 
> allocate_slab
>  shuffle_freelist
>get_random_u32
> 
> Here is another splat with while compiling/installing a kernel,
> 
> [ 1254.443119][C2] WARNING: possible circular locking dependency detected
> [ 1254.450038][C2] 5.3.0-rc5-next-20190822 #1 Not tainted
> [ 1254.49][C2] --
> [ 1254.462988][C2] swapper/2/0 is trying to acquire lock:
> [ 1254.468509][C2] a2925218 (random_write_wait.lock){..-.}, at:
> __wake_up_common_lock+0xc6/0x150
> [ 1254.478154][C2] 
> [ 1254.478154][C2] but task is already holding lock:
> [ 1254.485896][C2] 88845373fda0 (batched_entropy_u32.lock){-.-.}, at:
> get_random_u32+0x4c/0xe0
> [ 1254.495007][C2] 
> [ 1254.495007][C2] which lock already depends on the new lock.
> [ 1254.495007][C2] 
> [ 1254.505331][C2] 
> [ 1254.505331][C2] the existing dependency chain (in reverse order) is:
> [ 1254.514755][C2] 
> [ 1254.514755][C2] -> #3 (batched_entropy_u32.lock){-.-.}:
> [ 1254.522553][C2]__lock_acquire+0x5b3/0xb40
> [ 1254.527638][C2]lock_acquire+0x126/0x280
> [ 1254.533016][C2]_raw_spin_lock_irqsave+0x3a/0x50
> [ 1254.538624][C2]get_random_u32+0x4c/0xe0
> [ 1254.543539][C2]allocate_slab+0x6d6/0x19c0
> [ 1254.548625][C2]new_slab+0x46/0x70
> [ 1254.553010][C2]___slab_alloc+0x58b/0x960
> [ 1254.558533][C2]__slab_alloc+0x43/0x70
> [ 1254.563269][C2]kmem_cache_alloc+0x354/0x460
> [ 1254.568534][C2]fill_pool+0x272/0x4b0
> [ 1254.573182][C2]__debug_object_init+0x86/0x7a0
> [ 1254.578615][C2]debug_object_init+0x16/0x20
> [ 1254.584256][C2]hrtimer_init+0x27/0x1e0
> [ 1254.589079][C2]init_dl_task_timer+0x20/0x40
> [ 1254.594342][C2]__sched_fork+0x10b/0x1f0
> [ 1254.599253][C2]init_idle+0xac/0x520
> [ 1254.603816][C2]fork_idle+0x18c/0x230
> [ 1254.608933][C2]idle_threads_init+0xf0/0x187
> [ 1254.614193][C2]smp_init+0x1d/0x12d
> [ 1254.618671][C2]kernel_init_freeable+0x37e/0x76e
> [ 1254.624282][C2]kernel_init+0x11/0x12f
> [ 1254.629016][C2]ret_from_fork+0x27/0x50
> [ 1254.634344][C2] 
> [ 1254.634344][C2] -> #2 (>lock){-.-.}:
> [ 1254.640831][C2]__lock_acquire+0x5b3/0xb40
> [ 1254.645917][C2]lock_acquire+0x126/0x280
> [ 1254.650827][C2]_raw_spin_lock+0x2f/0x40
> [ 1254.655741][C2]task_fork_fair+0x43/0x200
> [ 1254.661213][C2]sched_fork+0x29b/0x420
> [ 1254.665949][C2]copy_process+0xf12/0x3180
> [ 1254.670947][C2]_do_fork+0xef/0x950
> [ 1254.675422][C2]kernel_thread+0xa8/0xe0
> [ 1254.680244][C2]rest_init+0x28/0x311
> [ 1254.685298][C2]arch_call_rest_init+0xe/0x1b
> [ 1254.690558][C2]start_kernel+0x6eb/0x724
> [ 1254.695469][C2]x86_64_start_reservations+0x24/0x26
> [ 1254.701339][C2]x86_64_start_kernel+0xf4/0xfb
> [ 1254.706689][C2]secondary_startup_64+0xb6/0xc0
> [ 1254.712601][C2] 
> [ 1254.712601][C2] -> #1 (>pi_lock){-.-.}:
> [ 1254.719263][C2]__lock_acquire+0x5b3/0xb40
> [ 1254.724349][C2]lock_acquire+0x126/0x280
> [ 1254.729260][C2]_raw_spin_lock_irqsave+0x3a/0x50
> [ 1254.735317][C2]try_to_wake_up+0xad/0x1050
> [ 1254.740403][C2]default_wake_function+0x2f/0x40
> [ 1254.745929][C2]pollwake+0x10d/0x160
> [ 1254.750491][C2]__wake_up_common+0xc4/0x2a0
> [ 1254.755663][C2]__wake_up_common_lock+0xea/0x150
> [ 1254.761756][C2]__wake_up+0x13/0x20
> [ 1254.766230][C2]account.constprop.9+0x217/0x340
> [ 1254.771754][C2]extract_entropy.constprop.7+0xcf/0x220
> [ 1254.777886][C2]_xfer_secondary_pool+0x19a/0x3d0
> [ 1254.783981][C2]push_to_pool+0x3e/0x230
> [ 1254.788805][C2]process_one_work+0x52a/0xb40
> [ 1254.794064][C2]worker_thread+0x63/0x5b0
> [ 1254.798977][C2]kthread+0x1df/0x200
> [ 1254.803451][C2]ret_from_fork+0x27/0x50
> [ 1254.808787][C2] 
> [ 1254.808787][C2] -> #0 (random_write_wait.lock){..-.}:
> [ 1254.816409][C2]check_prev_add+0x107/0xea0
> [ 1254.821494][C2]validate_chain+0x8fc/0x1200
> [ 1254.826667][C2]__lock_acquire+0x5b3/0xb40
> [ 1254.831751][C2]lock_acquire+0x126/0x280
> [ 1254.837189][C2]_raw_spin_lock_irqsave+0x3a/0x50
> [ 1254.842797][C2]

[PATCH] ASoC: Intel: kbl_rt5663_rt5514_max98927: Add dmic format constraint

2019-09-11 Thread Yu-Hsuan Hsu
24-bit recording from the DMIC is not supported on the KBL platform
because the TDM slot between the PCH and the codec is only 16 bits wide.
Add a constraint to remove the unsupported format.

Signed-off-by: Yu-Hsuan Hsu 
---
 sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c 
b/sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c
index 74dda8784f1a01..67b276a65a8d2d 100644
--- a/sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c
+++ b/sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c
@@ -400,6 +400,9 @@ static int kabylake_dmic_startup(struct snd_pcm_substream 
*substream)
snd_pcm_hw_constraint_list(runtime, 0, SNDRV_PCM_HW_PARAM_CHANNELS,
dmic_constraints);
 
+   runtime->hw.formats = SNDRV_PCM_FMTBIT_S16_LE;
+   snd_pcm_hw_constraint_msbits(runtime, 0, 16, 16);
+
return snd_pcm_hw_constraint_list(substream->runtime, 0,
SNDRV_PCM_HW_PARAM_RATE, _rates);
 }
-- 
2.23.0.162.g0b9fbb3734-goog



[PATCH] dts: arm64: imx8mq: Enable gpu passive throttling

2019-09-11 Thread Guido Günther
Temperature and hysteresis were picked after the CPU.

Signed-off-by: Guido Günther 
---
 arch/arm64/boot/dts/freescale/imx8mq.dtsi | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mq.dtsi 
b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
index 564045927485..fda636085bb3 100644
--- a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
@@ -235,12 +235,26 @@
thermal-sensors = < 1>;
 
trips {
+   gpu-alert {
+   temperature = <8>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+
gpu-crit {
temperature = <9>;
hysteresis = <2000>;
type = "critical";
};
};
+
+   cooling-maps {
+   map0 {
+   trip = <_alert>;
+   cooling-device =
+   < THERMAL_NO_LIMIT 
THERMAL_NO_LIMIT>;
+   };
+   };
};
 
vpu-thermal {
@@ -1006,6 +1020,7 @@
 < IMX8MQ_CLK_GPU_AXI>,
 < IMX8MQ_CLK_GPU_AHB>;
clock-names = "core", "shader", "bus", "reg";
+   #cooling-cells = <2>;
assigned-clocks = < IMX8MQ_CLK_GPU_CORE_SRC>,
  < IMX8MQ_CLK_GPU_SHADER_SRC>,
  < IMX8MQ_CLK_GPU_AXI>,
-- 
2.23.0.rc1



Re: [PATCH net 1/2] sctp: remove redundant assignment when call sctp_get_port_local

2019-09-11 Thread maowenan



On 2019/9/11 22:39, Marcelo Ricardo Leitner wrote:
> On Wed, Sep 11, 2019 at 11:30:08AM -0300, Marcelo Ricardo Leitner wrote:
>> On Wed, Sep 11, 2019 at 11:30:38AM +0300, Dan Carpenter wrote:
>>> On Wed, Sep 11, 2019 at 09:30:47AM +0800, maowenan wrote:


 On 2019/9/11 3:22, Dan Carpenter wrote:
> On Tue, Sep 10, 2019 at 09:57:10PM +0300, Dan Carpenter wrote:
>> On Tue, Sep 10, 2019 at 03:13:42PM +0800, Mao Wenan wrote:
> >>> There are redundant parentheses in the if clause that calls
> >>> sctp_get_port_local() in sctp_do_bind(), and a redundant assignment
> >>> to 'ret'. This patch cleans them up.
>>>
>>> Signed-off-by: Mao Wenan 
>>> ---
>>>  net/sctp/socket.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>>> index 9d1f83b10c0a..766b68b55ebe 100644
>>> --- a/net/sctp/socket.c
>>> +++ b/net/sctp/socket.c
>>> @@ -399,9 +399,8 @@ static int sctp_do_bind(struct sock *sk, union 
>>> sctp_addr *addr, int len)
>>>  * detection.
>>>  */
>>> addr->v4.sin_port = htons(snum);
>>> -   if ((ret = sctp_get_port_local(sk, addr))) {
>>> +   if (sctp_get_port_local(sk, addr))
>>> return -EADDRINUSE;
>>
>> sctp_get_port_local() returns a long which is either 0,1 or a pointer
>> casted to long.  It's not documented what it means and neither of the
>> callers use the return since commit 62208f12451f ("net: sctp: simplify
>> sctp_get_port").
>
> Actually it was commit 4e54064e0a13 ("sctp: Allow only 1 listening
> socket with SO_REUSEADDR") from 11 years ago.  That patch fixed a bug,
> because before the code assumed that a pointer casted to an int was the
> same as a pointer casted to a long.

 commit 4e54064e0a13 treated non-zero return value as unexpected, so the 
 current
 cleanup is ok?
>>>
>>> Yeah.  It's fine, I was just confused why we weren't preserving the
>>> error code and then I saw that we didn't return errors at all and got
>>> confused.
>>
>> But please lets seize the moment and do the change Dean suggested.
> 
> *Dan*, sorry.
> 
>> This was the last place saving this return value somewhere. It makes
>> sense to cleanup sctp_get_port_local() now and remove that masked
>> pointer return.
>>
>> Then you may also cleanup:
>> socket.c:   return !!sctp_get_port_local(sk, );
>> as it will be a direct map.

Thanks Marcelo, shall I post a new individual patch for this cleanup, as you suggest?
>>
>>   Marcelo
>>
> 
> .
> 
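
For reference, the cleanup being discussed could look roughly like this
(sketch only; it assumes sctp_get_port_local() is changed to return a plain
int, 0 for success and nonzero for a conflict, which is not how the current
code is written):

/* sctp_do_bind(), as in this patch: */
	addr->v4.sin_port = htons(snum);
	if (sctp_get_port_local(sk, addr))
		return -EADDRINUSE;

/* sctp_get_port(): the !! wrapper then becomes a direct map: */
	return sctp_get_port_local(sk, &addr);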



[PATCH v3] tas2770: add tas2770 smart PA kernel driver

2019-09-11 Thread shifu0704
From: Frank Shi 

add tas2770 smart PA kernel driver

Signed-off-by: Frank Shi 
---
 sound/soc/codecs/Kconfig   |   5 +
 sound/soc/codecs/Makefile  |   2 +
 sound/soc/codecs/tas2770.c | 952 +
 sound/soc/codecs/tas2770.h | 166 
 4 files changed, 1125 insertions(+)
 create mode 100644 sound/soc/codecs/tas2770.c
 create mode 100644 sound/soc/codecs/tas2770.h

diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig
index 8f3e787..cc92da3 100644
--- a/sound/soc/codecs/Kconfig
+++ b/sound/soc/codecs/Kconfig
@@ -111,6 +111,7 @@ config SND_SOC_ALL_CODECS
select SND_SOC_STAC9766 if SND_SOC_AC97_BUS
select SND_SOC_STI_SAS
select SND_SOC_TAS2552 if I2C
+   select SND_SOC_TAS2770 if I2C
select SND_SOC_TAS5086 if I2C
select SND_SOC_TAS571X if I2C
select SND_SOC_TFA9879 if I2C
@@ -652,6 +653,10 @@ config SND_SOC_TAS2552
tristate "Texas Instruments TAS2552 Mono Audio amplifier"
depends on I2C
 
+config SND_SOC_TAS2770
+   tristate "Texas Instruments TAS2770 speaker amplifier"
+   depends on I2C
+
 config SND_SOC_TAS5086
tristate "Texas Instruments TAS5086 speaker amplifier"
depends on I2C
diff --git a/sound/soc/codecs/Makefile b/sound/soc/codecs/Makefile
index 5305cc6..63b8488 100644
--- a/sound/soc/codecs/Makefile
+++ b/sound/soc/codecs/Makefile
@@ -116,6 +116,7 @@ snd-soc-stac9766-objs := stac9766.o
 snd-soc-sti-sas-objs := sti-sas.o
 snd-soc-tas5086-objs := tas5086.o
 snd-soc-tas571x-objs := tas571x.o
+snd-soc-tas2770-objs := tas2770.o
 snd-soc-tfa9879-objs := tfa9879.o
 snd-soc-tlv320aic23-objs := tlv320aic23.o
 snd-soc-tlv320aic23-i2c-objs := tlv320aic23-i2c.o
@@ -332,6 +333,7 @@ obj-$(CONFIG_SND_SOC_STI_SAS)   += snd-soc-sti-sas.o
 obj-$(CONFIG_SND_SOC_TAS2552)  += snd-soc-tas2552.o
 obj-$(CONFIG_SND_SOC_TAS5086)  += snd-soc-tas5086.o
 obj-$(CONFIG_SND_SOC_TAS571X)  += snd-soc-tas571x.o
+obj-$(CONFIG_SND_SOC_TAS2770) += snd-soc-tas2770.o
 obj-$(CONFIG_SND_SOC_TFA9879)  += snd-soc-tfa9879.o
 obj-$(CONFIG_SND_SOC_TLV320AIC23)  += snd-soc-tlv320aic23.o
 obj-$(CONFIG_SND_SOC_TLV320AIC23_I2C)  += snd-soc-tlv320aic23-i2c.o
diff --git a/sound/soc/codecs/tas2770.c b/sound/soc/codecs/tas2770.c
new file mode 100644
index 000..65aeace
--- /dev/null
+++ b/sound/soc/codecs/tas2770.c
@@ -0,0 +1,952 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// ALSA SoC Texas Instruments TAS2770 20-W Digital Input Mono Class-D
+// Audio Amplifier with Speaker I/V Sense
+//
+// Copyright (C) 2016-2017 Texas Instruments Incorporated - http://www.ti.com/
+// Author: Tracy Yi 
+// Frank Shi 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "tas2770.h"
+
+#define TAS2770_MDELAY 0xFFFE
+#define TAS2770_CHECK_PERIOD   5000/* 5 second */
+
+static void tas2770_reset(struct tas2770_priv *tas2770)
+{
+   if (tas2770->reset_gpio) {
+   gpiod_set_value_cansleep(tas2770->reset_gpio, 0);
+   msleep(20);
+   gpiod_set_value_cansleep(tas2770->reset_gpio, 1);
+   }
+   snd_soc_component_write(tas2770->component, TAS2770_SW_RST,
+   TAS2770_RST);
+}
+
+static int tas2770_regmap_read(struct tas2770_priv *tas2770,
+   unsigned int reg, unsigned int *value)
+{
+   int result = 0;
+   int retry_count = TAS2770_I2C_RETRY_COUNT;
+
+   while (retry_count--) {
+   result = snd_soc_component_read(tas2770->component, reg,
+   value);
+   if (!result)
+   break;
+
+   msleep(20);
+   }
+   if (!retry_count)
+   return -ETIMEDOUT;
+
+   return 0;
+}
+
+static int tas2770_set_bias_level(struct snd_soc_component *component,
+enum snd_soc_bias_level level)
+{
+   struct tas2770_priv *tas2770 =
+   snd_soc_component_get_drvdata(component);
+
+   switch (level) {
+   case SND_SOC_BIAS_ON:
+   break;
+
+   case SND_SOC_BIAS_STANDBY:
+   snd_soc_component_update_bits(component, TAS2770_PWR_CTRL,
+   TAS2770_PWR_CTRL_MASK,
+   TAS2770_PWR_CTRL_ACTIVE);
+   tas2770->power_state = SND_SOC_BIAS_STANDBY;
+   break;
+
+   case SND_SOC_BIAS_PREPARE:
+   snd_soc_component_update_bits(component, TAS2770_PWR_CTRL,
+   TAS2770_PWR_CTRL_MASK,
+   TAS2770_PWR_CTRL_MUTE);
+   tas2770->power_state = SND_SOC_BIAS_PREPARE;
+   break;
+
+   case SND_SOC_BIAS_OFF:
+   snd_soc_component_update_bits(component, TAS2770_PWR_CTRL,
+   TAS2770_PWR_CTRL_MASK,
+   

[PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr

2019-09-11 Thread shikemeng
>From 089dbf0216628ac6ae98742ab90725ca9c2bf201 Mon Sep 17 00:00:00 2001
From:  
Date: Tue, 10 Sep 2019 09:44:58 -0400
Subject: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr

reason: migration to invalid cpu in __set_cpus_allowed_ptr
archive path: patches/euleros/sched

Oops occur when running qemu on arm64:
 Unable to handle kernel paging request at virtual address 08effe40
 Internal error: Oops: 9607 [#1] SMP
 Process migration/0 (pid: 12, stack limit = 0x084e3736)
 pstate: 2085 (nzCv daIf -PAN -UAO)
 pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20
 lr : move_queued_task.isra.21+0x124/0x298
 ...
 Call trace:
  __ll_sc___cmpxchg_case_acq_4+0x4/0x20
  __migrate_task+0xc8/0xe0
  migration_cpu_stop+0x170/0x180
  cpu_stopper_thread+0xec/0x178
  smpboot_thread_fn+0x1ac/0x1e8
  kthread+0x134/0x138
  ret_from_fork+0x10/0x18

__set_cpus_allowed_ptr() chooses an active dest_cpu in the affinity mask to
migrate the process to if the process is not currently running on any of the
CPUs specified in the affinity mask. However, it can pick an invalid dest_cpu
(>= nr_cpu_ids, 1024 in my virtual machine) if the CPUs in the affinity mask
are deactivated by cpu_down() after the cpumask_intersects() check. The
subsequent cpumask_test_cpu() of dest_cpu then tests a bit beyond the valid
CPU range and may pass if that bit happens to be set. As a consequence, the
kernel accesses an invalid rq address associated with the invalid CPU in
migration_cpu_stop() -> __migrate_task() -> move_queued_task() and the Oops
occurs.

The following sequence may trigger the Oops:
1) A process repeatedly binds itself to cpu0 and cpu1 in turn by calling
sched_setaffinity() (a minimal sketch of this step follows the list).
2) A shell script repeatedly runs "echo 0 > /sys/devices/system/cpu/cpu1/online"
and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn.
3) The Oops appears if the bit for the invalid CPU happens to be set in the
memory just past the cpumask being tested.
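
A minimal userspace sketch of step 1, for illustration only (an assumption
about what such a test program looks like, not the actual reproducer):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	cpu_set_t set;
	int cpu = 0;

	for (;;) {
		/* Alternate the allowed mask between {cpu0} and {cpu1}. */
		CPU_ZERO(&set);
		CPU_SET(cpu, &set);
		if (sched_setaffinity(0, sizeof(set), &set))
			perror("sched_setaffinity"); /* may fail while cpu1 is offline */
		cpu ^= 1;
	}
	return 0;
}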


Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40
---
 kernel/sched/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4b63fef..5181ea9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
if (cpumask_equal(>cpus_allowed, new_mask))
goto out;

-   if (!cpumask_intersects(new_mask, cpu_valid_mask)) {
+   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
+   if (dest_cpu >= nr_cpu_ids) {
ret = -EINVAL;
goto out;
}
@@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
if (cpumask_test_cpu(task_cpu(p), new_mask))
goto out;

-   dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
if (task_running(rq, p) || p->state == TASK_WAKING) {
struct migration_arg arg = { p, dest_cpu };
/* Need help from migration thread: drop lock and wait. */
--
1.8.5.6





Re: [PATCH 02/10] mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED

2019-09-11 Thread Naoya Horiguchi
On Wed, Sep 11, 2019 at 12:27:22PM +0200, David Hildenbrand wrote:
> On 10.09.19 12:30, Oscar Salvador wrote:
> > From: Naoya Horiguchi 
> > 
> > Currently madvise_inject_error() pins the target via get_user_pages_fast.
> > The call to get_user_pages_fast is only to get the respective page
> > of a given address, but it is the job of the memory-poisoning handler
> > to deal with races, so drop the refcount grabbed by get_user_pages_fast.
> > 
> 
> Oh, and another question "it is the job of the memory-poisoning handler"
> - is that already properly implemented? (newbee question ¯\_(ツ)_/¯)

The above description might be confusing, sorry. It's intended to read like:

  The call to get_user_pages_fast is only to get the pointer to struct
  page of a given address, pinning it is memory-poisoning handler's job,
  so drop the refcount grabbed by get_user_pages_fast.

And pinning is done in get_hwpoison_page() for hard-offline and
get_any_page() for soft-offline.  For soft-offline case, the semantics of
refcount of poisoned pages is what this patchset tries to change/improve.

Thanks,
Naoya Horiguchi

[PATCH v7 10/12] thermal: qoriq: Do not report invalid temperature reading

2019-09-11 Thread Andrey Smirnov
Before returning measured temperature data to the upper layer we need to
make sure that the reading was marked as "valid" to avoid reporting
bogus data.

Signed-off-by: Andrey Smirnov 
Reviewed-by: Daniel Lezcano 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index 1cc53a4a5c47..48853192514a 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -38,6 +38,7 @@
 #define REGS_TRITSR(n) (0x100 + 16 * (n)) /* Immediate Temperature
* Site Register
*/
+#define TRITSR_V   BIT(31)
 #define REGS_TTRnCR(n) (0xf10 + 4 * (n)) /* Temperature Range n
   * Control Register
   */
@@ -64,8 +65,24 @@ static int tmu_get_temp(void *p, int *temp)
struct qoriq_sensor *qsensor = p;
struct qoriq_tmu_data *qdata = qoriq_sensor_to_data(qsensor);
u32 val;
+   /*
+* REGS_TRITSR(id) has the following layout:
+*
+* 31  ... 7 6 5 4 3 2 1 0
+*  V  TEMP
+*
+* Where V bit signifies if the measurement is ready and is
+* within sensor range. TEMP is an 8 bit value representing
+* temperature in C.
+*/
+   if (regmap_read_poll_timeout(qdata->regmap,
+REGS_TRITSR(qsensor->id),
+val,
+val & TRITSR_V,
+USEC_PER_MSEC,
+10 * USEC_PER_MSEC))
+   return -ENODATA;
 
-   regmap_read(qdata->regmap, REGS_TRITSR(qsensor->id), );
*temp = (val & 0xff) * 1000;
 
return 0;
-- 
2.21.0



[PATCH v7 11/12] thermal_hwmon: Add devres wrapper for thermal_add_hwmon_sysfs()

2019-09-11 Thread Andrey Smirnov
Add devres wrapper for thermal_add_hwmon_sysfs() to simplify driver
code.

Signed-off-by: Andrey Smirnov 
Reviewed-by: Daniel Lezcano 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/thermal_hwmon.c | 28 
 drivers/thermal/thermal_hwmon.h |  7 +++
 2 files changed, 35 insertions(+)

diff --git a/drivers/thermal/thermal_hwmon.c b/drivers/thermal/thermal_hwmon.c
index dd5d8ee37928..c8d2620f2e42 100644
--- a/drivers/thermal/thermal_hwmon.c
+++ b/drivers/thermal/thermal_hwmon.c
@@ -248,3 +248,31 @@ void thermal_remove_hwmon_sysfs(struct thermal_zone_device 
*tz)
kfree(hwmon);
 }
 EXPORT_SYMBOL_GPL(thermal_remove_hwmon_sysfs);
+
+static void devm_thermal_hwmon_release(struct device *dev, void *res)
+{
+   thermal_remove_hwmon_sysfs(*(struct thermal_zone_device **)res);
+}
+
+int devm_thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
+{
+   struct thermal_zone_device **ptr;
+   int ret;
+
+   ptr = devres_alloc(devm_thermal_hwmon_release, sizeof(*ptr),
+  GFP_KERNEL);
+   if (!ptr)
+   return -ENOMEM;
+
+   ret = thermal_add_hwmon_sysfs(tz);
+   if (ret) {
+   devres_free(ptr);
+   return ret;
+   }
+
+   *ptr = tz;
+   devres_add(>device, ptr);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(devm_thermal_add_hwmon_sysfs);
diff --git a/drivers/thermal/thermal_hwmon.h b/drivers/thermal/thermal_hwmon.h
index a160b9d62dd0..1a9d65f6a6a8 100644
--- a/drivers/thermal/thermal_hwmon.h
+++ b/drivers/thermal/thermal_hwmon.h
@@ -17,6 +17,7 @@
 
 #ifdef CONFIG_THERMAL_HWMON
 int thermal_add_hwmon_sysfs(struct thermal_zone_device *tz);
+int devm_thermal_add_hwmon_sysfs(struct thermal_zone_device *tz);
 void thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz);
 #else
 static inline int
@@ -25,6 +26,12 @@ thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
return 0;
 }
 
+static inline int
+devm_thermal_add_hwmon_sysfs(struct thermal_zone_device *tz)
+{
+   return 0;
+}
+
 static inline void
 thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz)
 {
-- 
2.21.0



[PATCH v7 08/12] thermal: qoriq: Convert driver to use regmap API

2019-09-11 Thread Andrey Smirnov
Convert driver to use regmap API, drop custom LE/BE IO helpers and
simplify bit manipulation using regmap_update_bits(). This also allows
us to convert some register initialization to use loops and adds
convenient debug access to TMU registers via debugfs.

Signed-off-by: Andrey Smirnov 
Reviewed-by: Daniel Lezcano 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 153 +++-
 1 file changed, 72 insertions(+), 81 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index 8a28a4433d44..32bf5ed57f5c 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "thermal_core.h"
@@ -18,48 +19,27 @@
 /*
  * QorIQ TMU Registers
  */
-struct qoriq_tmu_site_regs {
-   u32 tritsr; /* Immediate Temperature Site Register */
-   u32 tratsr; /* Average Temperature Site Register */
-   u8 res0[0x8];
-};
 
-struct qoriq_tmu_regs {
-   u32 tmr;/* Mode Register */
+#define REGS_TMR   0x000   /* Mode Register */
 #define TMR_DISABLE0x0
 #define TMR_ME 0x8000
 #define TMR_ALPF   0x0c00
-   u32 tsr;/* Status Register */
-   u32 tmtmir; /* Temperature measurement interval Register */
+
+#define REGS_TMTMIR0x008   /* Temperature measurement interval Register */
 #define TMTMIR_DEFAULT 0x000f
-   u8 res0[0x14];
-   u32 tier;   /* Interrupt Enable Register */
+
+#define REGS_TIER  0x020   /* Interrupt Enable Register */
 #define TIER_DISABLE   0x0
-   u32 tidr;   /* Interrupt Detect Register */
-   u32 tiscr;  /* Interrupt Site Capture Register */
-   u32 ticscr; /* Interrupt Critical Site Capture Register */
-   u8 res1[0x10];
-   u32 tmhtcrh;/* High Temperature Capture Register */
-   u32 tmhtcrl;/* Low Temperature Capture Register */
-   u8 res2[0x8];
-   u32 tmhtitr;/* High Temperature Immediate Threshold */
-   u32 tmhtatr;/* High Temperature Average Threshold */
-   u32 tmhtactr;   /* High Temperature Average Crit Threshold */
-   u8 res3[0x24];
-   u32 ttcfgr; /* Temperature Configuration Register */
-   u32 tscfgr; /* Sensor Configuration Register */
-   u8 res4[0x78];
-   struct qoriq_tmu_site_regs site[SITES_MAX];
-   u8 res5[0x9f8];
-   u32 ipbrr0; /* IP Block Revision Register 0 */
-   u32 ipbrr1; /* IP Block Revision Register 1 */
-   u8 res6[0x310];
-   u32 ttr0cr; /* Temperature Range 0 Control Register */
-   u32 ttr1cr; /* Temperature Range 1 Control Register */
-   u32 ttr2cr; /* Temperature Range 2 Control Register */
-   u32 ttr3cr; /* Temperature Range 3 Control Register */
-};
 
+#define REGS_TTCFGR0x080   /* Temperature Configuration Register */
+#define REGS_TSCFGR0x084   /* Sensor Configuration Register */
+
+#define REGS_TRITSR(n) (0x100 + 16 * (n)) /* Immediate Temperature
+   * Site Register
+   */
+#define REGS_TTRnCR(n) (0xf10 + 4 * (n)) /* Temperature Range n
+  * Control Register
+  */
 /*
  * Thermal zone data
  */
@@ -68,9 +48,8 @@ struct qoriq_sensor {
 };
 
 struct qoriq_tmu_data {
-   struct qoriq_tmu_regs __iomem *regs;
+   struct regmap *regmap;
struct clk *clk;
-   bool little_endian;
struct qoriq_sensor sensor[SITES_MAX];
 };
 
@@ -79,29 +58,13 @@ static struct qoriq_tmu_data *qoriq_sensor_to_data(struct 
qoriq_sensor *s)
return container_of(s, struct qoriq_tmu_data, sensor[s->id]);
 }
 
-static void tmu_write(struct qoriq_tmu_data *p, u32 val, void __iomem *addr)
-{
-   if (p->little_endian)
-   iowrite32(val, addr);
-   else
-   iowrite32be(val, addr);
-}
-
-static u32 tmu_read(struct qoriq_tmu_data *p, void __iomem *addr)
-{
-   if (p->little_endian)
-   return ioread32(addr);
-   else
-   return ioread32be(addr);
-}
-
 static int tmu_get_temp(void *p, int *temp)
 {
struct qoriq_sensor *qsensor = p;
struct qoriq_tmu_data *qdata = qoriq_sensor_to_data(qsensor);
u32 val;
 
-   val = tmu_read(qdata, >regs->site[qsensor->id].tritsr);
+   regmap_read(qdata->regmap, REGS_TRITSR(qsensor->id), );
*temp = (val & 0xff) * 1000;
 
return 0;
@@ -139,7 +102,8 @@ static int 

[PATCH v7 09/12] thermal: qoriq: Enable all sensors before registering them

2019-09-11 Thread Andrey Smirnov
tmu_get_temp() will get called as part of sensor registration via
devm_thermal_zone_of_sensor_register(). To prevent it from returning
bogus data we need to enable sensor monitoring before that. Looking at
the datasheet (i.MX8MQ RM) there doesn't seem to be any harm in
enabling them all, so, for the sake of simplicity, change the code to
do just that.

Signed-off-by: Andrey Smirnov 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index 32bf5ed57f5c..1cc53a4a5c47 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -24,6 +24,7 @@
 #define TMR_DISABLE0x0
 #define TMR_ME 0x8000
 #define TMR_ALPF   0x0c00
+#define TMR_MSITE_ALL  GENMASK(15, 0)
 
 #define REGS_TMTMIR0x008   /* Temperature measurement interval Register */
 #define TMTMIR_DEFAULT 0x000f
@@ -77,7 +78,10 @@ static const struct thermal_zone_of_device_ops tmu_tz_ops = {
 static int qoriq_tmu_register_tmu_zone(struct device *dev,
   struct qoriq_tmu_data *qdata)
 {
-   int id, sites = 0;
+   int id;
+
+   regmap_write(qdata->regmap, REGS_TMR,
+TMR_MSITE_ALL | TMR_ME | TMR_ALPF);
 
for (id = 0; id < SITES_MAX; id++) {
struct thermal_zone_device *tzd;
@@ -93,18 +97,12 @@ static int qoriq_tmu_register_tmu_zone(struct device *dev,
if (ret) {
if (ret == -ENODEV)
continue;
-   else
-   return ret;
-   }
 
-   sites |= 0x1 << (15 - id);
+   regmap_write(qdata->regmap, REGS_TMR, TMR_DISABLE);
+   return ret;
+   }
}
 
-   /* Enable monitoring */
-   if (sites != 0)
-   regmap_write(qdata->regmap, REGS_TMR,
-sites | TMR_ME | TMR_ALPF);
-
return 0;
 }
 
-- 
2.21.0



[PATCH v7 12/12] thermal: qoriq: Add hwmon support

2019-09-11 Thread Andrey Smirnov
Expose thermal readings as a HWMON device, so that they can be
accessed using lm-sensors.

Signed-off-by: Andrey Smirnov 
Reviewed-by: Daniel Lezcano 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index 48853192514a..e907f0d2103f 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -13,6 +13,7 @@
 #include 
 
 #include "thermal_core.h"
+#include "thermal_hwmon.h"
 
 #define SITES_MAX  16
 
@@ -118,6 +119,11 @@ static int qoriq_tmu_register_tmu_zone(struct device *dev,
regmap_write(qdata->regmap, REGS_TMR, TMR_DISABLE);
return ret;
}
+
+   if (devm_thermal_add_hwmon_sysfs(tzd))
+   dev_warn(dev,
+"Failed to add hwmon sysfs attributes\n");
+
}
 
return 0;
-- 
2.21.0



[PATCH v7 07/12] thermal: qoriq: Drop unnecessary drvdata cleanup

2019-09-11 Thread Andrey Smirnov
Driver data of underlying struct device will be set to NULL by Linux's
driver infrastructure. Clearing it here is unnecessary.

Signed-off-by: Andrey Smirnov 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index af596c3342d0..8a28a4433d44 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -253,8 +253,6 @@ static int qoriq_tmu_remove(struct platform_device *pdev)
 
clk_disable_unprepare(data->clk);
 
-   platform_set_drvdata(pdev, NULL);
-
return 0;
 }
 
-- 
2.21.0



Re: [PATCH v5 07/13] dt-bindings: pwm: add a property "num-pwms"

2019-09-11 Thread Sam Shih
On Mon, 2019-09-02 at 18:04 +0200, Uwe Kleine-König wrote:
> On Tue, Aug 27, 2019 at 01:39:24PM -0500, Rob Herring wrote:
> > On Thu, Aug 22, 2019 at 02:58:37PM +0800, Sam Shih wrote:
> > > From: Ryder Lee 
> > 
> > The subject should indicate this is for Mediatek.
> > 
> > > 
> > > This adds a property "num-pwms" in example so that we could
> > > specify the number of PWM channels via device tree.
> > > 
> > > Signed-off-by: Ryder Lee 
> > > Signed-off-by: Sam Shih 
> > > Reviewed-by: Matthias Brugger 
> > > Acked-by: Uwe Kleine-König 
> > > ---
> > > Changes since v5:
> > > - Add an Acked-by tag
> > > - This file is original v4 patch 5/10
> > > (https://patchwork.kernel.org/patch/11102577/)
> > > 
> > > Change-Id: I429048afeffa96f3f14533910efe242f88776043
> > > ---
> > >  Documentation/devicetree/bindings/pwm/pwm-mediatek.txt | 7 ---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/Documentation/devicetree/bindings/pwm/pwm-mediatek.txt 
> > > b/Documentation/devicetree/bindings/pwm/pwm-mediatek.txt
> > > index 991728cb46cb..ea95b490a913 100644
> > > --- a/Documentation/devicetree/bindings/pwm/pwm-mediatek.txt
> > > +++ b/Documentation/devicetree/bindings/pwm/pwm-mediatek.txt
> > > @@ -14,12 +14,12 @@ Required properties:
> > >  has no clocks
> > > - "top": the top clock generator
> > > - "main": clock used by the PWM core
> > > -   - "pwm1-8": the eight per PWM clocks for mt2712
> > > -   - "pwm1-6": the six per PWM clocks for mt7622
> > > -   - "pwm1-5": the five per PWM clocks for mt7623
> > > +   - "pwm1-N": the PWM clocks for each channel
> > > +   where N starting from 1 to the maximum number of PWM channels
> > 
> > Once converted to schema, you are going to be back to listing them out.
> > 
> > >   - pinctrl-names: Must contain a "default" entry.
> > >   - pinctrl-0: One property must exist for each entry in pinctrl-names.
> > > See pinctrl/pinctrl-bindings.txt for details of the property values.
> > > + - num-pwms: the number of PWM channels.
> > 
> > You can't add new required properties without breaking compatibility. 
> > 
> > You already have to imply the number of channels from the compatible (or 
> > number of clocks) and you have to keep doing so to maintain 
> > compatibility, so why not just keep doing that for new chips?
> 
> This was a suggestion by me. The driver still handles compatibility
> (i.e. falls back to the number of PWMs that was implied by the
> compatible before). Given that there are various drivers that all solve
> the same problem (i.e. different variants with different number of PWMs)
> I thought it would be a good idea to introduce a property in the device
> tree that specifies this number.
> 
> Only for newly introduced compatibles the num-pwms property is really
> required. Differentiating the ones that need it and the ones that don't
> seems over-engineered to me.
> 
> (BTW, using the number of clks doesn't really work because there are
> also some variants without clocks. It is still under discussion if in
> this case dummy-clocks should be provided IIRC.)
> 
> Best regards
> Uwe
> 


Any conclusions?

just a friendly reminder :)


regards Sam




Re: [PATCH 02/10] mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED

2019-09-11 Thread Naoya Horiguchi
Hi David,

On Wed, Sep 11, 2019 at 12:23:24PM +0200, David Hildenbrand wrote:
> On 10.09.19 12:30, Oscar Salvador wrote:
> > From: Naoya Horiguchi 
> > 
> > Currently madvise_inject_error() pins the target via get_user_pages_fast.
> > The call to get_user_pages_fast is only to get the respective page
> > of a given address, but it is the job of the memory-poisoning handler
> > to deal with races, so drop the refcount grabbed by get_user_pages_fast.
> > 
> > Signed-off-by: Naoya Horiguchi 
> > Signed-off-by: Oscar Salvador 
> > ---
> >  mm/madvise.c | 25 +++--
> >  1 file changed, 11 insertions(+), 14 deletions(-)
> > 
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 6e023414f5c1..fbe6d402232c 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -883,6 +883,16 @@ static int madvise_inject_error(int behavior,
> > ret = get_user_pages_fast(start, 1, 0, );
> > if (ret != 1)
> > return ret;
> > +   /*
> > +* The get_user_pages_fast() is just to get the pfn of the
> > +* given address, and the refcount has nothing to do with
> > +* what we try to test, so it should be released immediately.
> > +* This is racy but it's intended because the real hardware
> > +* errors could happen at any moment and memory error handlers
> > +* must properly handle the race.
> > +*/
> > +   put_page(page);
> > +
> 
> I wonder if it would be clearer to do that after the page has been fully
> used  - e.g. after getting the pfn and the order (and then e.g.,
> symbolically setting the page pointer to 0).

Yes, this could be called just after page_to_pfn() below.
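
i.e. something along these lines, keeping put_page() after everything that
still needs the page (a sketch based on the hunk quoted above, not the
actual respin):

	ret = get_user_pages_fast(start, 1, 0, &page);
	if (ret != 1)
		return ret;

	pfn = page_to_pfn(page);
	order = compound_order(compound_head(page));

	/*
	 * Drop the reference taken by get_user_pages_fast() only once pfn
	 * and order have been derived; the memory-poisoning handlers take
	 * their own reference and handle races with real errors themselves.
	 */
	put_page(page);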

> I guess the important part of this patch is to not have an elevated
> refcount while calling soft_offline_page().
> 

That's right.

> > pfn = page_to_pfn(page);
> >  
> > /*
> > @@ -892,16 +902,11 @@ static int madvise_inject_error(int behavior,
> >  */
> > order = compound_order(compound_head(page));
> >  
> > -   if (PageHWPoison(page)) {
> > -   put_page(page);
> > -   continue;
> > -   }
> 
> This change is not reflected in the changelog. I would have expected
> that only the put_page() would go. If this should go completely, I
> suggest a separate patch.
> 

I forget why I tried to remove the if block, and now I think only the
put_page() should go as you point out.

Thanks for the comment.

- Naoya


Re: [PATCH 2/3] mm: avoid slub allocation while holding list_lock

2019-09-11 Thread Yu Zhao
On Thu, Sep 12, 2019 at 03:44:01AM +0300, Kirill A. Shutemov wrote:
> On Wed, Sep 11, 2019 at 06:29:28PM -0600, Yu Zhao wrote:
> > If we are already under list_lock, don't call kmalloc(). Otherwise we
> > will run into deadlock because kmalloc() also tries to grab the same
> > lock.
> > 
> > Instead, statically allocate bitmap in struct kmem_cache_node. Given
> > currently page->objects has 15 bits, we bloat the per-node struct by
> > 4K. So we waste some memory but only do so when slub debug is on.
> 
> Why not have a single page in total, protected by a lock?
> 
> Listing objects from two pages at the same time doesn't make sense anyway.
> Concurrent validation is not something sane to do.

Okay, cutting down to static global bitmap.
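
For illustration, a single static bitmap protected by a lock could look
roughly like this in mm/slub.c (sketch only; the names, the plain spinlock
and the get_map()/put_map() pairing are assumptions, not a posted patch, and
MAX_OBJS_PER_PAGE is the limit from patch 1/3):

/* One shared bitmap for all debug walks, serialized by a lock. */
static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
static DEFINE_SPINLOCK(object_map_lock);

/*
 * Build the free-object bitmap for @page. Returns the shared bitmap with
 * object_map_lock held; the caller must hand it back via put_map().
 */
static unsigned long *get_map(struct kmem_cache *s, struct page *page)
{
	void *p;
	void *addr = page_address(page);

	spin_lock(&object_map_lock);

	bitmap_zero(object_map, page->objects);

	for (p = page->freelist; p; p = get_freepointer(s, p))
		set_bit(slab_index(p, s, addr), object_map);

	return object_map;
}

static void put_map(unsigned long *map)
{
	VM_BUG_ON(map != object_map);
	spin_unlock(&object_map_lock);
}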


[PATCH v7 01/12] thermal: qoriq: Add local struct device pointer

2019-09-11 Thread Andrey Smirnov
Use a local "struct device *dev" for brevity. No functional change
intended.

Signed-off-by: Andrey Smirnov 
Acked-by: Daniel Lezcano 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index 39542c670301..5df6267a5da0 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -194,8 +194,9 @@ static int qoriq_tmu_probe(struct platform_device *pdev)
int ret;
struct qoriq_tmu_data *data;
struct device_node *np = pdev->dev.of_node;
+   struct device *dev = >dev;
 
-   data = devm_kzalloc(>dev, sizeof(struct qoriq_tmu_data),
+   data = devm_kzalloc(dev, sizeof(struct qoriq_tmu_data),
GFP_KERNEL);
if (!data)
return -ENOMEM;
@@ -206,17 +207,17 @@ static int qoriq_tmu_probe(struct platform_device *pdev)
 
data->regs = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(data->regs)) {
-   dev_err(>dev, "Failed to get memory region\n");
+   dev_err(dev, "Failed to get memory region\n");
return PTR_ERR(data->regs);
}
 
-   data->clk = devm_clk_get_optional(>dev, NULL);
+   data->clk = devm_clk_get_optional(dev, NULL);
if (IS_ERR(data->clk))
return PTR_ERR(data->clk);
 
ret = clk_prepare_enable(data->clk);
if (ret) {
-   dev_err(>dev, "Failed to enable clock\n");
+   dev_err(dev, "Failed to enable clock\n");
return ret;
}
 
@@ -228,7 +229,7 @@ static int qoriq_tmu_probe(struct platform_device *pdev)
 
ret = qoriq_tmu_register_tmu_zone(pdev);
if (ret < 0) {
-   dev_err(>dev, "Failed to register sensors\n");
+   dev_err(dev, "Failed to register sensors\n");
ret = -ENODEV;
goto err;
}
-- 
2.21.0



[PATCH v7 02/12] thermal: qoriq: Don't store struct thermal_zone_device reference

2019-09-11 Thread Andrey Smirnov
Struct thermal_zone_device reference stored as sensor's private data
isn't really used anywhere in the code. Drop it.

Signed-off-by: Andrey Smirnov 
Acked-by: Daniel Lezcano 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index 5df6267a5da0..b471c226f06b 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -66,7 +66,6 @@ struct qoriq_tmu_data;
  * Thermal zone data
  */
 struct qoriq_sensor {
-   struct thermal_zone_device  *tzd;
struct qoriq_tmu_data   *qdata;
int id;
 };
@@ -116,6 +115,9 @@ static int qoriq_tmu_register_tmu_zone(struct 
platform_device *pdev)
int id, sites = 0;
 
for (id = 0; id < SITES_MAX; id++) {
+   struct thermal_zone_device *tzd;
+   int ret;
+
qdata->sensor[id] = devm_kzalloc(>dev,
sizeof(struct qoriq_sensor), GFP_KERNEL);
if (!qdata->sensor[id])
@@ -123,13 +125,16 @@ static int qoriq_tmu_register_tmu_zone(struct 
platform_device *pdev)
 
qdata->sensor[id]->id = id;
qdata->sensor[id]->qdata = qdata;
-   qdata->sensor[id]->tzd = devm_thermal_zone_of_sensor_register(
-   >dev, id, qdata->sensor[id], _tz_ops);
-   if (IS_ERR(qdata->sensor[id]->tzd)) {
-   if (PTR_ERR(qdata->sensor[id]->tzd) == -ENODEV)
+
+   tzd = devm_thermal_zone_of_sensor_register(>dev, id,
+  qdata->sensor[id],
+  _tz_ops);
+   ret = PTR_ERR_OR_ZERO(tzd);
+   if (ret) {
+   if (ret == -ENODEV)
continue;
else
-   return PTR_ERR(qdata->sensor[id]->tzd);
+   return ret;
}
 
sites |= 0x1 << (15 - id);
-- 
2.21.0



[PATCH v7 05/12] thermal: qoriq: Pass data to qoriq_tmu_register_tmu_zone() directly

2019-09-11 Thread Andrey Smirnov
Pass all necessary data to qoriq_tmu_register_tmu_zone() directly
instead of passing a platform device and deriving the data from it. This
is done as a first step towards simplifying the resource deallocation code.

Signed-off-by: Andrey Smirnov 
Acked-by: Daniel Lezcano 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index f8f5228d83af..5b9f2a31d275 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -111,9 +111,9 @@ static const struct thermal_zone_of_device_ops tmu_tz_ops = 
{
.get_temp = tmu_get_temp,
 };
 
-static int qoriq_tmu_register_tmu_zone(struct platform_device *pdev)
+static int qoriq_tmu_register_tmu_zone(struct device *dev,
+  struct qoriq_tmu_data *qdata)
 {
-   struct qoriq_tmu_data *qdata = platform_get_drvdata(pdev);
int id, sites = 0;
 
for (id = 0; id < SITES_MAX; id++) {
@@ -123,7 +123,7 @@ static int qoriq_tmu_register_tmu_zone(struct 
platform_device *pdev)
 
sensor->id = id;
 
-   tzd = devm_thermal_zone_of_sensor_register(>dev, id,
+   tzd = devm_thermal_zone_of_sensor_register(dev, id,
   sensor,
   _tz_ops);
ret = PTR_ERR_OR_ZERO(tzd);
@@ -229,7 +229,7 @@ static int qoriq_tmu_probe(struct platform_device *pdev)
if (ret < 0)
goto err;
 
-   ret = qoriq_tmu_register_tmu_zone(pdev);
+   ret = qoriq_tmu_register_tmu_zone(dev, data);
if (ret < 0) {
dev_err(dev, "Failed to register sensors\n");
ret = -ENODEV;
-- 
2.21.0



[PATCH v7 00/12] QorIQ TMU multi-sensor and HWMON support

2019-09-11 Thread Andrey Smirnov
Everyone:

This series contains patches adding support for HWMON integration, bug
fixes and general improvements (hopefully) for TMU driver I made while
working on it on i.MX8MQ.

Feedback is welcome!

Thanks,
Andrey Smirnov

Changes since [v6]:

   - Rebased on top of Zhang's "next" branch

   - Added "thermal: qoriq: Drop unnecessary drvdata cleanup"

Changes since [v5]

- Rebased on recent linux-next, dropped "thermal: qoriq: Remove
  unnecessary DT node is NULL check" since it is already in the
  tree

- Dropped dependency on [rfc]

Changes since [v4]

- Collected Tested-by from Lucas

- Collected Reviewed-by from Daniel

- Converted "thermal: qoriq: Enable all sensors before registering
  them" to use if instead of switch statement for error checking

Changes since [v3]

- Series rebased on top of [rfc]

- Fixed incorrect goto label in "thermal: qoriq: Pass data to
  qoriq_tmu_calibration()"
  
- Added REGS_TRITSR() register description to "thermal: qoriq: Do
  not report invalid temperature reading"
  
- Reworded commit message of "thermal: qoriq: Remove unnecessary
  DT node is NULL check"

Changes since [v2]

- Patches rebased on v5.1-rc1

Changes since [v1]

- Rebased on "linus" branch of
  git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal.git
  that included the latest changes adding multi-sensor support

- Dropped

thermal: qoriq: Add support for multiple thremal sites
thermal: qoriq: Be more strict when parsing
thermal: qoriq: Simplify error handling in qoriq_tmu_get_sensor_id()

  since they are no longer relevant

- Added

thermal: qoriq: Don't store struct thermal_zone_device reference
thermal: qoriq: Add local struct qoriq_sensor pointer
thermal: qoriq: Embed per-sensor data into struct qoriq_tmu_data
thermal: qoriq: Pass data to qoriq_tmu_register_tmu_zone() directly

  to simplify latest codebase

- Changed "thermal: qoriq: Do not report invalid temperature
  reading" to use regmap_read_poll_timeout() to make sure that
  tmu_get_temp() waits for the first sample to be ready before
  reporting it. This case is triggered on my setup if
  qoriq_thermal is compiled as a module

[v1] lore.kernel.org/lkml/20190218191141.3729-1-andrew.smir...@gmail.com
[v2] lore.kernel.org/lkml/2019000508.26325-1-andrew.smir...@gmail.com
[v3] lore.kernel.org/lkml/20190401041418.5999-1-andrew.smir...@gmail.com
[v4] lore.kernel.org/lkml/20190413082748.29990-1-andrew.smir...@gmail.com
[v5] lore.kernel.org/lkml/20190424064830.18179-1-andrew.smir...@gmail.com
[v6] lore.kernel.org/lkml/20190821012612.7823-1-andrew.smir...@gmail.com
[rfc] lore.kernel.org/lkml/20190404080647.8173-1-daniel.lezc...@linaro.org

Andrey Smirnov (12):
  thermal: qoriq: Add local struct device pointer
  thermal: qoriq: Don't store struct thermal_zone_device reference
  thermal: qoriq: Add local struct qoriq_sensor pointer
  thermal: qoriq: Embed per-sensor data into struct qoriq_tmu_data
  thermal: qoriq: Pass data to qoriq_tmu_register_tmu_zone() directly
  thermal: qoriq: Pass data to qoriq_tmu_calibration() directly
  thermal: qoriq: Drop unnecessary drvdata cleanup
  thermal: qoriq: Convert driver to use regmap API
  thermal: qoriq: Enable all sensors before registering them
  thermal: qoriq: Do not report invalid temperature reading
  thermal_hwmon: Add devres wrapper for thermal_add_hwmon_sysfs()
  thermal: qoriq: Add hwmon support

 drivers/thermal/qoriq_thermal.c | 252 +---
 drivers/thermal/thermal_hwmon.c |  28 
 drivers/thermal/thermal_hwmon.h |   7 +
 3 files changed, 167 insertions(+), 120 deletions(-)

-- 
2.21.0



[PATCH v7 06/12] thermal: qoriq: Pass data to qoriq_tmu_calibration() directly

2019-09-11 Thread Andrey Smirnov
We can simplify the error cleanup code if, instead of passing a "struct
platform_device *" to qoriq_tmu_calibration() and deriving a bunch of
pointers from it, we pass those pointers directly. This way we won't
be forced to call platform_set_drvdata() as early in qoriq_tmu_probe()
or need to have "platform_set_drvdata(pdev, NULL);" in the error path.

Signed-off-by: Andrey Smirnov 
Reviewed-by: Daniel Lezcano 
Tested-by: Lucas Stach 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index 5b9f2a31d275..af596c3342d0 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -144,16 +144,16 @@ static int qoriq_tmu_register_tmu_zone(struct device *dev,
return 0;
 }
 
-static int qoriq_tmu_calibration(struct platform_device *pdev)
+static int qoriq_tmu_calibration(struct device *dev,
+struct qoriq_tmu_data *data)
 {
int i, val, len;
u32 range[4];
const u32 *calibration;
-   struct device_node *np = pdev->dev.of_node;
-   struct qoriq_tmu_data *data = platform_get_drvdata(pdev);
+   struct device_node *np = dev->of_node;
 
if (of_property_read_u32_array(np, "fsl,tmu-range", range, 4)) {
-   dev_err(>dev, "missing calibration range.\n");
+   dev_err(dev, "missing calibration range.\n");
return -ENODEV;
}
 
@@ -165,7 +165,7 @@ static int qoriq_tmu_calibration(struct platform_device 
*pdev)
 
calibration = of_get_property(np, "fsl,tmu-calibration", );
if (calibration == NULL || len % 8) {
-   dev_err(>dev, "invalid calibration data.\n");
+   dev_err(dev, "invalid calibration data.\n");
return -ENODEV;
}
 
@@ -203,8 +203,6 @@ static int qoriq_tmu_probe(struct platform_device *pdev)
if (!data)
return -ENOMEM;
 
-   platform_set_drvdata(pdev, data);
-
data->little_endian = of_property_read_bool(np, "little-endian");
 
data->regs = devm_platform_ioremap_resource(pdev, 0);
@@ -225,7 +223,7 @@ static int qoriq_tmu_probe(struct platform_device *pdev)
 
qoriq_tmu_init_device(data);/* TMU initialization */
 
-   ret = qoriq_tmu_calibration(pdev);  /* TMU calibration */
+   ret = qoriq_tmu_calibration(dev, data); /* TMU calibration */
if (ret < 0)
goto err;
 
@@ -236,11 +234,12 @@ static int qoriq_tmu_probe(struct platform_device *pdev)
goto err;
}
 
+   platform_set_drvdata(pdev, data);
+
return 0;
 
 err:
clk_disable_unprepare(data->clk);
-   platform_set_drvdata(pdev, NULL);
 
return ret;
 }
-- 
2.21.0



[PATCH v7 04/12] thermal: qoriq: Embed per-sensor data into struct qoriq_tmu_data

2019-09-11 Thread Andrey Smirnov
Embed per-sensor data into struct qoriq_tmu_data so we can drop the
code allocating it. This also allows us to get rid of per-sensor back
reference to struct qoriq_tmu_data since now its address can be
calculated using container_of().

Signed-off-by: Andrey Smirnov 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Zhang Rui 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 23 ---
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index ae22836c471d..f8f5228d83af 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -60,13 +60,10 @@ struct qoriq_tmu_regs {
u32 ttr3cr; /* Temperature Range 3 Control Register */
 };
 
-struct qoriq_tmu_data;
-
 /*
  * Thermal zone data
  */
 struct qoriq_sensor {
-   struct qoriq_tmu_data   *qdata;
int id;
 };
 
@@ -74,9 +71,14 @@ struct qoriq_tmu_data {
struct qoriq_tmu_regs __iomem *regs;
struct clk *clk;
bool little_endian;
-   struct qoriq_sensor *sensor[SITES_MAX];
+   struct qoriq_sensor sensor[SITES_MAX];
 };
 
+static struct qoriq_tmu_data *qoriq_sensor_to_data(struct qoriq_sensor *s)
+{
+   return container_of(s, struct qoriq_tmu_data, sensor[s->id]);
+}
+
 static void tmu_write(struct qoriq_tmu_data *p, u32 val, void __iomem *addr)
 {
if (p->little_endian)
@@ -96,7 +98,7 @@ static u32 tmu_read(struct qoriq_tmu_data *p, void __iomem 
*addr)
 static int tmu_get_temp(void *p, int *temp)
 {
struct qoriq_sensor *qsensor = p;
-   struct qoriq_tmu_data *qdata = qsensor->qdata;
+   struct qoriq_tmu_data *qdata = qoriq_sensor_to_data(qsensor);
u32 val;
 
val = tmu_read(qdata, >regs->site[qsensor->id].tritsr);
@@ -116,19 +118,10 @@ static int qoriq_tmu_register_tmu_zone(struct 
platform_device *pdev)
 
for (id = 0; id < SITES_MAX; id++) {
struct thermal_zone_device *tzd;
-   struct qoriq_sensor *sensor;
+   struct qoriq_sensor *sensor = >sensor[id];
int ret;
 
-   sensor = devm_kzalloc(>dev,
- sizeof(struct qoriq_sensor),
- GFP_KERNEL);
-   if (!qdata->sensor[id])
-   return -ENOMEM;
-
-   qdata->sensor[id] = sensor;
-
sensor->id = id;
-   sensor->qdata = qdata;
 
tzd = devm_thermal_zone_of_sensor_register(>dev, id,
   sensor,
-- 
2.21.0



[PATCH v7 03/12] thermal: qoriq: Add local struct qoriq_sensor pointer

2019-09-11 Thread Andrey Smirnov
Add local struct qoriq_sensor pointer in qoriq_tmu_register_tmu_zone()
for brevity.

Signed-off-by: Andrey Smirnov 
Cc: Chris Healy 
Cc: Lucas Stach 
Cc: Zhang Rui 
Cc: Eduardo Valentin 
Cc: Daniel Lezcano 
Cc: Angus Ainslie (Purism) 
Cc: linux-...@nxp.com
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/thermal/qoriq_thermal.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
index b471c226f06b..ae22836c471d 100644
--- a/drivers/thermal/qoriq_thermal.c
+++ b/drivers/thermal/qoriq_thermal.c
@@ -116,18 +116,22 @@ static int qoriq_tmu_register_tmu_zone(struct 
platform_device *pdev)
 
for (id = 0; id < SITES_MAX; id++) {
struct thermal_zone_device *tzd;
+   struct qoriq_sensor *sensor;
int ret;
 
-   qdata->sensor[id] = devm_kzalloc(>dev,
-   sizeof(struct qoriq_sensor), GFP_KERNEL);
+   sensor = devm_kzalloc(>dev,
+ sizeof(struct qoriq_sensor),
+ GFP_KERNEL);
if (!qdata->sensor[id])
return -ENOMEM;
 
-   qdata->sensor[id]->id = id;
-   qdata->sensor[id]->qdata = qdata;
+   qdata->sensor[id] = sensor;
+
+   sensor->id = id;
+   sensor->qdata = qdata;
 
tzd = devm_thermal_zone_of_sensor_register(>dev, id,
-  qdata->sensor[id],
+  sensor,
   _tz_ops);
ret = PTR_ERR_OR_ZERO(tzd);
if (ret) {
-- 
2.21.0



Re: [PATCH 1/4] arm64: Kconfig: Fix XGENE driver dependencies

2019-09-11 Thread Stephen Boyd
Quoting Amit Kucheria (2019-09-11 15:18:45)
> diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
> index 801fa1cd0321..9b2790d3f18a 100644
> --- a/drivers/clk/Kconfig
> +++ b/drivers/clk/Kconfig
> @@ -225,7 +225,7 @@ config CLK_QORIQ
>  
>  config COMMON_CLK_XGENE
> bool "Clock driver for APM XGene SoC"
> -   default ARCH_XGENE
> +   depends on ARCH_XGENE
> depends on ARM64 || COMPILE_TEST

Is ARCH_XGENE supported outside of ARM64? I'd expect to see something
more like depends on ARCH_XGENE || COMPILE_TEST and default ARCH_XGENE
so that if the config is supported it becomes the default. Or at least
depends on ARCH_XGENE && ARM64 || COMPILE_TEST

> ---help---
>   Sypport for the APM X-Gene SoC reference, PLL, and device clocks.


Re: [PATCH V2 net-next 4/7] net: hns3: fix port setting handle for fibre port

2019-09-11 Thread tanhuazhong




On 2019/9/11 18:16, Sergei Shtylyov wrote:

Hello!

On 11.09.2019 5:40, Huazhong Tan wrote:


From: Guangbin Huang 

For hardware doesn't support use specified speed and duplex


    Can't parse that. "For hardware that does not support using", perhaps?


Yes, thanks. Will check the grammar more carefully next time.




to negotiate, it's unnecessary to check and modify the port
speed and duplex for fibre port when autoneg is on.

Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support 
for fibre port")

Signed-off-by: Guangbin Huang 
Signed-off-by: Huazhong Tan 
---
  drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 15 +++
  1 file changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c 
b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c

index f5a681d..680c350 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -726,6 +726,12 @@ static int hns3_check_ksettings_param(const 
struct net_device *netdev,

  u8 duplex;
  int ret;
+    /* hw doesn't support use specified speed and duplex to negotiate,


    I can't parse that, did you mean "using"?


yes, thanks.




+ * unnecessary to check them when autoneg on.
+ */
+    if (cmd->base.autoneg)
+    return 0;
+
  if (ops->get_ksettings_an_result) {
  ops->get_ksettings_an_result(handle, , , 
);

  if (cmd->base.autoneg == autoneg && cmd->base.speed == speed &&
@@ -787,6 +793,15 @@ static int hns3_set_link_ksettings(struct 
net_device *netdev,

  return ret;
  }
+    /* hw doesn't support use specified speed and duplex to negotiate,


    Here too...




yes, thanks.


+ * ignore them when autoneg on.
+ */
+    if (cmd->base.autoneg) {
+    netdev_info(netdev,
+    "autoneg is on, ignore the speed and duplex\n");
+    return 0;
+    }
+
  if (ops->cfg_mac_speed_dup_h)
  ret = ops->cfg_mac_speed_dup_h(handle, cmd->base.speed,
 cmd->base.duplex);


MBR, Sergei

.





Re: [PATCH] ocfs2: fix spelling mistake "ambigous" -> "ambiguous"

2019-09-11 Thread Joseph Qi



On 19/9/12 00:07, Colin King wrote:
> From: Colin Ian King 
> 
> There is a spelling mistake in a mlog_bug_on_msg message. Fix it.
> 
> Signed-off-by: Colin Ian King 

Acked-by: Joseph Qi 
> ---
>  fs/ocfs2/inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> index 7ad9d6590818..7c9dfd50c1c1 100644
> --- a/fs/ocfs2/inode.c
> +++ b/fs/ocfs2/inode.c
> @@ -534,7 +534,7 @@ static int ocfs2_read_locked_inode(struct inode *inode,
>*/
>   mlog_bug_on_msg(!!(fe->i_flags & cpu_to_le32(OCFS2_SYSTEM_FL)) !=
>   !!(args->fi_flags & OCFS2_FI_FLAG_SYSFILE),
> - "Inode %llu: system file state is ambigous\n",
> + "Inode %llu: system file state is ambiguous\n",
>   (unsigned long long)args->fi_blkno);
>  
>   if (S_ISCHR(le16_to_cpu(fe->i_mode)) ||
> 


Re: nvme vs. hibernation ( again )

2019-09-11 Thread Ming Lei
On Thu, Sep 12, 2019 at 12:27 AM Gabriel C  wrote:
>
> Hi Christoph,
>
> I see this was already discussed in 2 threads:
>
>  https://lists.infradead.org/pipermail/linux-nvme/2019-April/023234.html
>  https://lkml.org/lkml/2019/5/24/668
>
> but in latest git the issue still exists.
>
> I hit that on each resume on my Acer Nitro 5 (AN515-43-R8BF) Laptop.
>
> .
> Sep 11 16:16:30 nitro5 kernel: Freezing remaining freezable tasks ...
> (elapsed 0.000 seconds) done.
> Sep 11 16:16:30 nitro5 kernel: printk: Suspending console(s) (use
> no_console_suspend to debug)
> Sep 11 16:16:30 nitro5 kernel: WARNING: CPU: 0 PID: 882 at
> kernel/irq/chip.c:210 irq_startup+0xe6/0xf0
> Sep 11 16:16:30 nitro5 kernel: Modules linked in: af_packet bnep
> amdgpu ath10k_pci ath10k_core ath mac80211 joydev uvcvideo
> videobuf2_vmalloc videobuf2_memops edac_mce_amd videobuf2_v4l2
> amd_iommu_v2 kvm_amd gpu_sched btusb snd_hda_codec_realtek ttm btrtl
> btbcm btintel hid_multitouch ccp snd_hda_codec_generic nls_utf8
> bluetooth drm_kms_helper hid_generic videobuf2_common ledtrig_audio
> snd_hda_codec_hdmi nls_cp437 cfg80211 drm kvm snd_hda_intel vfat
> videodev fat agpgart efi_pstore r8169 snd_hda_codec ecdh_generic
> i2c_algo_bit realtek irqbypass pcspkr mc rfkill fb_sys_fops efivars
> syscopyarea snd_hda_core ecc k10temp wmi_bmof sysfillrect tpm_crb
> crc16 libphy i2c_piix4 libarc4 snd_hwdep hwmon sysimgblt tpm_tis
> tpm_tis_core evdev ac tpm battery mac_hid i2c_designware_platform
> pinctrl_amd i2c_designware_core rng_core acer_wireless button
> acpi_cpufreq ppdev sch_fq_codel fuse snd_pcm_oss snd_mixer_oss snd_pcm
> snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
> snd_timer snd soundcore lp parport_pc
> Sep 11 16:16:30 nitro5 kernel:  parport xfs libcrc32c crc32c_generic
> crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ahci
> libahci libata xhci_pci xhci_hcd aesni_intel usbcore scsi_mod
> aes_x86_64 crypto_simd cryptd glue_helper serio_raw i2c_hid hid video
> i2c_core wmi dm_mirror dm_region_hash dm_log dm_mod unix sha1_ssse3
> sha1_generic hmac ipv6 nf_defrag_ipv6 autofs4
> Sep 11 16:16:30 nitro5 kernel: CPU: 0 PID: 882 Comm: kworker/u32:9 Not
> tainted 5.3.0-rc8-7-g3120b9a6a3f7-dirty #2
> Sep 11 16:16:30 nitro5 kernel: Hardware name: Acer Nitro
> AN515-43/Octavia_PKS, BIOS V1.05 08/07/2019
> Sep 11 16:16:30 nitro5 kernel: Workqueue: events_unbound async_run_entry_fn
> Sep 11 16:16:30 nitro5 kernel: RIP: 0010:irq_startup+0xe6/0xf0
> Sep 11 16:16:30 nitro5 kernel: Code: e8 7f 3c 00 00 85 c0 0f 85 e3 09
> 00 00 4c 89 e7 31 d2 4c 89 ee e8 1a cf ff ff 48 89 ef e8 b2 fe ff ff
> 41 89 c4 e9 51 ff ff ff <0f> 0b eb b2 66 0f 1f 44 00 00 0f 1f 44 00 00
> 55 48 89 fd 53 48 8b
> Sep 11 16:16:30 nitro5 kernel: RSP: 0018:be9b00793c38 EFLAGS: 00010002
> Sep 11 16:16:30 nitro5 kernel: RAX: 0010 RBX:
> 0001 RCX: 0040
> Sep 11 16:16:30 nitro5 kernel: RDX:  RSI:
> 9d1b8800 RDI: 9c9d9e136598
> Sep 11 16:16:30 nitro5 kernel: RBP: 9c9d981e5400 R08:
>  R09: 9c9d9e8003f0
> Sep 11 16:16:30 nitro5 kernel: R10:  R11:
> 9d057688 R12: 0001
> Sep 11 16:16:30 nitro5 kernel: R13: 9c9d9e136598 R14:
>  R15: 9c9d9e346000
> Sep 11 16:16:30 nitro5 kernel: FS:  ()
> GS:9c9da080() knlGS:
> Sep 11 16:16:30 nitro5 kernel: CS:  0010 DS:  ES:  CR0: 
> 80050033
> Sep 11 16:16:30 nitro5 kernel: CR2: 5633ad8d0060 CR3:
> 0003db8d CR4: 003406f0
> Sep 11 16:16:30 nitro5 kernel: Call Trace:
> Sep 11 16:16:30 nitro5 kernel:  enable_irq+0x48/0x90
> Sep 11 16:16:30 nitro5 kernel:  nvme_poll_irqdisable+0x20c/0x280
> Sep 11 16:16:30 nitro5 kernel:  __nvme_disable_io_queues+0x19d/0x1d0
> Sep 11 16:16:30 nitro5 kernel:  ? nvme_del_queue_end+0x20/0x20
> Sep 11 16:16:30 nitro5 kernel:  nvme_dev_disable+0x15c/0x210
> Sep 11 16:16:30 nitro5 kernel:  nvme_suspend+0x40/0x130
> Sep 11 16:16:30 nitro5 kernel:  pci_pm_suspend+0x72/0x130
> Sep 11 16:16:30 nitro5 kernel:  ? pci_pm_freeze+0xb0/0xb0
> Sep 11 16:16:30 nitro5 kernel:  dpm_run_callback+0x29/0x120
> Sep 11 16:16:30 nitro5 kernel:  __device_suspend+0x1b2/0x400
> Sep 11 16:16:30 nitro5 kernel:  async_suspend+0x1b/0x90
> Sep 11 16:16:30 nitro5 kernel:  async_run_entry_fn+0x37/0xe0
> Sep 11 16:16:30 nitro5 kernel:  process_one_work+0x1d1/0x3a0
> Sep 11 16:16:30 nitro5 kernel:  worker_thread+0x4a/0x3d0
> Sep 11 16:16:30 nitro5 kernel:  kthread+0xf9/0x130
> Sep 11 16:16:30 nitro5 kernel:  ? process_one_work+0x3a0/0x3a0
> Sep 11 16:16:30 nitro5 kernel:  ? kthread_park+0x80/0x80
> Sep 11 16:16:30 nitro5 kernel:  ret_from_fork+0x22/0x40
> Sep 11 16:16:30 nitro5 kernel: ---[ end trace c598a86b44574730 ]---
>
> ...
>
> The patch from Dongli Zhang was rejected the time without any other fix
> or work on this issue I could find.
>
> Are there any plans to fix that or any 

Re: [PATCH 2/3] mm: avoid slub allocation while holding list_lock

2019-09-11 Thread Kirill A. Shutemov
On Wed, Sep 11, 2019 at 06:29:28PM -0600, Yu Zhao wrote:
> If we are already under list_lock, don't call kmalloc(). Otherwise we
> will run into deadlock because kmalloc() also tries to grab the same
> lock.
> 
> Instead, statically allocate bitmap in struct kmem_cache_node. Given
> currently page->objects has 15 bits, we bloat the per-node struct by
> 4K. So we waste some memory but only do so when slub debug is on.

Why not have a single page in total, protected by a lock?

Listing objects from two pages at the same time doesn't make sense anyway.
Concurrent validation is not something sane to do.

-- 
 Kirill A. Shutemov


[PATCH 2/3] mm: avoid slub allocation while holding list_lock

2019-09-11 Thread Yu Zhao
If we are already under list_lock, don't call kmalloc(). Otherwise we
will run into deadlock because kmalloc() also tries to grab the same
lock.

Instead, statically allocate bitmap in struct kmem_cache_node. Given
currently page->objects has 15 bits, we bloat the per-node struct by
4K. So we waste some memory but only do so when slub debug is on.

  WARNING: possible recursive locking detected
  
  mount-encrypted/4921 is trying to acquire lock:
  (&(>list_lock)->rlock){-.-.}, at: ___slab_alloc+0x104/0x437

  but task is already holding lock:
  (&(>list_lock)->rlock){-.-.}, at: __kmem_cache_shutdown+0x81/0x3cb

  other info that might help us debug this:
   Possible unsafe locking scenario:

 CPU0
 
lock(&(>list_lock)->rlock);
lock(&(>list_lock)->rlock);

   *** DEADLOCK ***

Signed-off-by: Yu Zhao 
---
 include/linux/slub_def.h |  4 
 mm/slab.h|  1 +
 mm/slub.c| 44 ++--
 3 files changed, 20 insertions(+), 29 deletions(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index d2153789bd9f..719d43574360 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -9,6 +9,10 @@
  */
 #include 
 
+#define OO_SHIFT   15
+#define OO_MASK((1 << OO_SHIFT) - 1)
+#define MAX_OBJS_PER_PAGE  32767 /* since page.objects is u15 */
+
 enum stat_item {
ALLOC_FASTPATH, /* Allocation from cpu slab */
ALLOC_SLOWPATH, /* Allocation by getting a new cpu slab */
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..2d8639835db1 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -556,6 +556,7 @@ struct kmem_cache_node {
atomic_long_t nr_slabs;
atomic_long_t total_objects;
struct list_head full;
+   unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)];
 #endif
 #endif
 
diff --git a/mm/slub.c b/mm/slub.c
index 62053ceb4464..f28072c9f2ce 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -187,10 +187,6 @@ static inline bool kmem_cache_has_cpu_partial(struct 
kmem_cache *s)
  */
 #define DEBUG_METADATA_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER)
 
-#define OO_SHIFT   15
-#define OO_MASK((1 << OO_SHIFT) - 1)
-#define MAX_OBJS_PER_PAGE  32767 /* since page.objects is u15 */
-
 /* Internal SLUB flags */
 /* Poison object */
 #define __OBJECT_POISON((slab_flags_t __force)0x8000U)
@@ -454,6 +450,8 @@ static void get_map(struct kmem_cache *s, struct page 
*page, unsigned long *map)
void *p;
void *addr = page_address(page);
 
+   bitmap_zero(map, page->objects);
+
for (p = page->freelist; p; p = get_freepointer(s, p))
set_bit(slab_index(p, s, addr), map);
 }
@@ -3680,14 +3678,12 @@ static int kmem_cache_open(struct kmem_cache *s, 
slab_flags_t flags)
 }
 
 static void list_slab_objects(struct kmem_cache *s, struct page *page,
-   const char *text)
+ unsigned long *map, const char *text)
 {
 #ifdef CONFIG_SLUB_DEBUG
void *addr = page_address(page);
void *p;
-   unsigned long *map = bitmap_zalloc(page->objects, GFP_ATOMIC);
-   if (!map)
-   return;
+
slab_err(s, page, text, s->name);
slab_lock(page);
 
@@ -3699,8 +3695,8 @@ static void list_slab_objects(struct kmem_cache *s, 
struct page *page,
print_tracking(s, p);
}
}
+
slab_unlock(page);
-   bitmap_free(map);
 #endif
 }
 
@@ -3721,7 +3717,7 @@ static void free_partial(struct kmem_cache *s, struct 
kmem_cache_node *n)
remove_partial(n, page);
list_add(&page->slab_list, &discard);
} else {
-   list_slab_objects(s, page,
+   list_slab_objects(s, page, n->object_map,
"Objects remaining in %s on __kmem_cache_shutdown()");
}
}
@@ -4397,7 +4393,6 @@ static int validate_slab(struct kmem_cache *s, struct 
page *page,
return 0;
 
/* Now we know that a valid freelist exists */
-   bitmap_zero(map, page->objects);
 
get_map(s, page, map);
for_each_object(p, s, addr, page->objects) {
@@ -4422,7 +4417,7 @@ static void validate_slab_slab(struct kmem_cache *s, 
struct page *page,
 }
 
 static int validate_slab_node(struct kmem_cache *s,
-   struct kmem_cache_node *n, unsigned long *map)
+   struct kmem_cache_node *n)
 {
unsigned long count = 0;
struct page *page;
@@ -4431,7 +4426,7 @@ static int validate_slab_node(struct kmem_cache *s,
spin_lock_irqsave(&n->list_lock, flags);

list_for_each_entry(page, &n->partial, slab_list) {
-   validate_slab_slab(s, page, map);
+   validate_slab_slab(s, page, n->object_map);
   

[PATCH 3/3] mm: lock slub page when listing objects

2019-09-11 Thread Yu Zhao
Though I have no idea what the side effect of a race would be,
apparently we want to prevent the free list from being changed
while debugging objects in general.

Signed-off-by: Yu Zhao 
---
 mm/slub.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index f28072c9f2ce..2734a092bbff 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4594,10 +4594,14 @@ static void process_slab(struct loc_track *t, struct 
kmem_cache *s,
void *addr = page_address(page);
void *p;
 
+   slab_lock(page);
+
get_map(s, page, map);
for_each_object(p, s, addr, page->objects)
if (!test_bit(slab_index(p, s, addr), map))
add_location(t, s, get_track(s, p, alloc));
+
+   slab_unlock(page);
 }
 
 static int list_locations(struct kmem_cache *s, char *buf,
-- 
2.23.0.162.g0b9fbb3734-goog



[PATCH 1/3] mm: correct mask size for slub page->objects

2019-09-11 Thread Yu Zhao
The mask of slub objects per page shouldn't be larger than what
page->objects can hold.

It requires more than 2^15 objects per page to hit the problem, and I
don't think anybody would get there. It'd be nice to have the mask fixed,
but it's not really worth cc'ing stable.

Fixes: 50d5c41cd151 ("slub: Do not use frozen page flag but a bit in the page 
counters")
Signed-off-by: Yu Zhao 
---
 mm/slub.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index 8834563cdb4b..62053ceb4464 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -187,7 +187,7 @@ static inline bool kmem_cache_has_cpu_partial(struct 
kmem_cache *s)
  */
 #define DEBUG_METADATA_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER)
 
-#define OO_SHIFT   16
+#define OO_SHIFT   15
 #define OO_MASK((1 << OO_SHIFT) - 1)
 #define MAX_OBJS_PER_PAGE  32767 /* since page.objects is u15 */
 
@@ -343,6 +343,8 @@ static inline unsigned int oo_order(struct 
kmem_cache_order_objects x)
 
 static inline unsigned int oo_objects(struct kmem_cache_order_objects x)
 {
+   BUILD_BUG_ON(OO_MASK > MAX_OBJS_PER_PAGE);
+
return x.x & OO_MASK;
 }
 
-- 
2.23.0.162.g0b9fbb3734-goog



Re: [PATCH] powerpc/prom_init: Undo relocation before entering secure mode

2019-09-11 Thread Thiago Jung Bauermann


Thiago Jung Bauermann  writes:

> The ultravisor will do an integrity check of the kernel image but we
> relocated it so the check will fail. Restore the original image by
> relocating it back to the kernel virtual base address.
>
> This works because during build vmlinux is linked with an expected virtual
> runtime address of KERNELBASE.
>
> Fixes: 6a9c930bd775 ("powerpc/prom_init: Add the ESM call to prom_init")
> Signed-off-by: Thiago Jung Bauermann 

I meant to put a Suggested-by: Paul Mackerras 

Sorry. Will add it if there's a v2.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center


[PATCH] nvme-pci: Save PCI state before putting drive into deepest state

2019-09-11 Thread Mario Limonciello
The action of saving the PCI state will cause numerous PCI configuration
space reads which, depending upon the vendor implementation, may cause
the drive to exit the deepest NVMe state.

In these cases ASPM will typically resolve the PCIe link state and APST
may resolve the NVMe power state.  However, it has also been observed
that this register access after the device has been quiesced will cause
PC10 failure on some device combinations.

To resolve this, move the PCI state saving to before SetFeatures has been
called.  This has been proven to resolve the issue across a 5000 sample
test on previously failing disk/system combinations.

Signed-off-by: Mario Limonciello 
---
 drivers/nvme/host/pci.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 732d5b6..9b3fed4 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2894,6 +2894,13 @@ static int nvme_suspend(struct device *dev)
if (ret < 0)
goto unfreeze;
 
+   /*
+* A saved state prevents pci pm from generically controlling the
+* device's power. If we're using protocol specific settings, we don't
+* want pci interfering.
+*/
+   pci_save_state(pdev);
+
ret = nvme_set_power_state(ctrl, ctrl->npss);
if (ret < 0)
goto unfreeze;
@@ -2908,12 +2915,6 @@ static int nvme_suspend(struct device *dev)
ret = 0;
goto unfreeze;
}
-   /*
-* A saved state prevents pci pm from generically controlling the
-* device's power. If we're using protocol specific settings, we don't
-* want pci interfering.
-*/
-   pci_save_state(pdev);
 unfreeze:
nvme_unfreeze(ctrl);
return ret;
-- 
2.7.4



[PATCH 2/3] hv_utils: Support host-initiated hibernation request

2019-09-11 Thread Dexuan Cui
Update the Shutdown IC version to 3.2, which is required for the host to
send the hibernation request.

The user is expected to create the program "/sbin/hyperv-hibernate", which
is called on the host-initiated hibernation request.

The program can be a script like

 test@localhost:~$ cat /sbin/hyperv-hibernate
 #!/bin/bash
 echo disk > /sys/power/state

Signed-off-by: Dexuan Cui 
---
 drivers/hv/hv_util.c | 66 +++-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 039c752..9e98c5d 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -24,6 +24,8 @@
 
 #define SD_MAJOR   3
 #define SD_MINOR   0
+#define SD_MINOR_2 2
+#define SD_VERSION_3_2 (SD_MAJOR << 16 | SD_MINOR_2)
 #define SD_VERSION (SD_MAJOR << 16 | SD_MINOR)
 
 #define SD_MAJOR_1 1
@@ -50,8 +52,9 @@
 static int ts_srv_version;
 static int hb_srv_version;
 
-#define SD_VER_COUNT 2
+#define SD_VER_COUNT 3
 static const int sd_versions[] = {
+   SD_VERSION_3_2,
SD_VERSION,
SD_VERSION_1
 };
@@ -75,9 +78,30 @@
UTIL_WS2K8_FW_VERSION
 };
 
+static bool execute_hibernate;
+static int hv_shutdown_init(struct hv_util_service *srv)
+{
+#if 0
+   /*
+* The patch to implement hv_is_hibernation_supported() is going
+* through the tip tree. For now, let's hardcode execute_hibernate
+* to true -- this doesn't break anything since hibernation for
+* Linux VM on Hyper-V never worked before. We'll remove the
+* conditional compilation as soon as hv_is_hibernation_supported()
+* is available in the mainline tree.
+*/
+   execute_hibernate = hv_is_hibernation_supported();
+#else
+   execute_hibernate = true;
+#endif
+
+   return 0;
+}
+
 static void shutdown_onchannelcallback(void *context);
 static struct hv_util_service util_shutdown = {
.util_cb = shutdown_onchannelcallback,
+   .util_init = hv_shutdown_init,
 };
 
 static int hv_timesync_init(struct hv_util_service *srv);
@@ -123,11 +147,38 @@ static void perform_shutdown(struct work_struct *dummy)
orderly_poweroff(true);
 }
 
+static void perform_hibernation(struct work_struct *dummy)
+{
+   /*
+* The user is expected to create the program, which can be a simple
+* script containing two lines:
+* #!/bin/bash
+* echo disk > /sys/power/state
+*/
+   static char hibernate_cmd[PATH_MAX] = "/sbin/hyperv-hibernate";
+
+   static char *envp[] = {
+   NULL,
+   };
+
+   static char *argv[] = {
+   hibernate_cmd,
+   NULL,
+   };
+
+   call_usermodehelper(hibernate_cmd, argv, envp, UMH_NO_WAIT);
+}
+
 /*
  * Perform the shutdown operation in a thread context.
  */
 static DECLARE_WORK(shutdown_work, perform_shutdown);
 
+/*
+ * Perform the hibernation operation in a thread context.
+ */
+static DECLARE_WORK(hibernate_work, perform_hibernation);
+
 static void shutdown_onchannelcallback(void *context)
 {
struct vmbus_channel *channel = context;
@@ -171,6 +222,19 @@ static void shutdown_onchannelcallback(void *context)
pr_info("Shutdown request received -"
" graceful shutdown initiated\n");
break;
+   case 4:
+   case 5:
+   pr_info("Hibernation request received -"
+   " hibernation %sinitiated\n",
+   execute_hibernate ? "" : "not ");
+
+   if (execute_hibernate) {
+   icmsghdrp->status = HV_S_OK;
+   schedule_work(&hibernate_work);
+   } else {
+   icmsghdrp->status = HV_E_FAIL;
+   }
+   break;
default:
icmsghdrp->status = HV_E_FAIL;
execute_shutdown = false;
-- 
1.8.3.1



[PATCH 1/3] hv_utils: Add the support of hibernation

2019-09-11 Thread Dexuan Cui
On hibernation, Linux cannot guarantee that the host-side util operations
still succeed without any issue, so let's simply cancel the work items.
The host is supposed to retry the operations, if necessary.

Signed-off-by: Dexuan Cui 
---
 drivers/hv/hv_fcopy.c |  9 -
 drivers/hv/hv_kvp.c   | 11 +--
 drivers/hv/hv_snapshot.c  | 11 +--
 drivers/hv/hv_util.c  | 37 -
 drivers/hv/hyperv_vmbus.h |  3 +++
 include/linux/hyperv.h|  1 +
 6 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index 7e30ae0..f44df3d 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -345,9 +345,16 @@ int hv_fcopy_init(struct hv_util_service *srv)
return 0;
 }
 
+void hv_fcopy_cancel_work(void)
+{
+   cancel_delayed_work_sync(&fcopy_timeout_work);
+}
+
 void hv_fcopy_deinit(void)
 {
fcopy_transaction.state = HVUTIL_DEVICE_DYING;
-   cancel_delayed_work_sync(&fcopy_timeout_work);
+
+   hv_fcopy_cancel_work();
+
hvutil_transport_destroy(hvt);
 }
diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
index 5054d11..064c384 100644
--- a/drivers/hv/hv_kvp.c
+++ b/drivers/hv/hv_kvp.c
@@ -757,11 +757,18 @@ static void kvp_on_reset(void)
return 0;
 }
 
-void hv_kvp_deinit(void)
+void hv_kvp_cancel_work(void)
 {
-   kvp_transaction.state = HVUTIL_DEVICE_DYING;
cancel_delayed_work_sync(&kvp_host_handshake_work);
cancel_delayed_work_sync(&kvp_timeout_work);
cancel_work_sync(&kvp_sendkey_work);
+}
+
+void hv_kvp_deinit(void)
+{
+   kvp_transaction.state = HVUTIL_DEVICE_DYING;
+
+   hv_kvp_cancel_work();
+
hvutil_transport_destroy(hvt);
 }
diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c
index 20ba95b..0eb718a 100644
--- a/drivers/hv/hv_snapshot.c
+++ b/drivers/hv/hv_snapshot.c
@@ -378,10 +378,17 @@ static void vss_on_reset(void)
return 0;
 }
 
-void hv_vss_deinit(void)
+void hv_vss_cancel_work(void)
 {
-   vss_transaction.state = HVUTIL_DEVICE_DYING;
cancel_delayed_work_sync(&vss_timeout_work);
cancel_work_sync(&vss_handle_request_work);
+}
+
+void hv_vss_deinit(void)
+{
+   vss_transaction.state = HVUTIL_DEVICE_DYING;
+
+   hv_vss_cancel_work();
+
hvutil_transport_destroy(hvt);
 }
diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index e32681e..039c752 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -81,12 +81,14 @@
 };
 
 static int hv_timesync_init(struct hv_util_service *srv);
+static void hv_timesync_cancel_work(void);
 static void hv_timesync_deinit(void);
 
 static void timesync_onchannelcallback(void *context);
 static struct hv_util_service util_timesynch = {
.util_cb = timesync_onchannelcallback,
.util_init = hv_timesync_init,
+   .util_cancel_work = hv_timesync_cancel_work,
.util_deinit = hv_timesync_deinit,
 };
 
@@ -98,18 +100,21 @@
 static struct hv_util_service util_kvp = {
.util_cb = hv_kvp_onchannelcallback,
.util_init = hv_kvp_init,
+   .util_cancel_work = hv_kvp_cancel_work,
.util_deinit = hv_kvp_deinit,
 };
 
 static struct hv_util_service util_vss = {
.util_cb = hv_vss_onchannelcallback,
.util_init = hv_vss_init,
+   .util_cancel_work = hv_vss_cancel_work,
.util_deinit = hv_vss_deinit,
 };
 
 static struct hv_util_service util_fcopy = {
.util_cb = hv_fcopy_onchannelcallback,
.util_init = hv_fcopy_init,
+   .util_cancel_work = hv_fcopy_cancel_work,
.util_deinit = hv_fcopy_deinit,
 };
 
@@ -440,6 +445,28 @@ static int util_remove(struct hv_device *dev)
return 0;
 }
 
+static int util_suspend(struct hv_device *dev)
+{
+   struct hv_util_service *srv = hv_get_drvdata(dev);
+
+   if (srv->util_cancel_work)
+   srv->util_cancel_work();
+
+   vmbus_close(dev->channel);
+
+   return 0;
+}
+
+static int util_resume(struct hv_device *dev)
+{
+   struct hv_util_service *srv = hv_get_drvdata(dev);
+   int ret;
+
+   ret = vmbus_open(dev->channel, 4 * PAGE_SIZE, 4 * PAGE_SIZE,
+NULL, 0, srv->util_cb, dev->channel);
+   return ret;
+}
+
 static const struct hv_vmbus_device_id id_table[] = {
/* Shutdown guid */
{ HV_SHUTDOWN_GUID,
@@ -476,6 +503,8 @@ static int util_remove(struct hv_device *dev)
.id_table = id_table,
.probe =  util_probe,
.remove =  util_remove,
+   .suspend = util_suspend,
+   .resume =  util_resume,
.driver = {
.probe_type = PROBE_PREFER_ASYNCHRONOUS,
},
@@ -545,11 +574,17 @@ static int hv_timesync_init(struct hv_util_service *srv)
return 0;
 }
 
+static void hv_timesync_cancel_work(void)
+{
+   cancel_work_sync(&adj_time_work);
+}
+
 static void hv_timesync_deinit(void)
 {
if (hv_ptp_clock)
ptp_clock_unregister(hv_ptp_clock);
-   

[PATCH 3/3] hv_utils: Support host-initiated restart request

2019-09-11 Thread Dexuan Cui
To test the code, we should run this command on the host:

Restart-VM $vm -Type Reboot

Signed-off-by: Dexuan Cui 
---
 drivers/hv/hv_util.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 9e98c5d..6d642f5 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -24,7 +24,9 @@
 
 #define SD_MAJOR   3
 #define SD_MINOR   0
+#define SD_MINOR_1 1
 #define SD_MINOR_2 2
+#define SD_VERSION_3_1 (SD_MAJOR << 16 | SD_MINOR_1)
 #define SD_VERSION_3_2 (SD_MAJOR << 16 | SD_MINOR_2)
 #define SD_VERSION (SD_MAJOR << 16 | SD_MINOR)
 
@@ -52,9 +54,10 @@
 static int ts_srv_version;
 static int hb_srv_version;
 
-#define SD_VER_COUNT 3
+#define SD_VER_COUNT 4
 static const int sd_versions[] = {
SD_VERSION_3_2,
+   SD_VERSION_3_1,
SD_VERSION,
SD_VERSION_1
 };
@@ -147,6 +150,11 @@ static void perform_shutdown(struct work_struct *dummy)
orderly_poweroff(true);
 }
 
+static void perform_restart(struct work_struct *dummy)
+{
+   orderly_reboot();
+}
+
 static void perform_hibernation(struct work_struct *dummy)
 {
/*
@@ -175,6 +183,11 @@ static void perform_hibernation(struct work_struct *dummy)
 static DECLARE_WORK(shutdown_work, perform_shutdown);
 
 /*
+ * Perform the restart operation in a thread context.
+ */
+static DECLARE_WORK(restart_work, perform_restart);
+
+/*
  * Perform the hibernation operation in a thread context.
  */
 static DECLARE_WORK(hibernate_work, perform_hibernation);
@@ -222,6 +235,14 @@ static void shutdown_onchannelcallback(void *context)
pr_info("Shutdown request received -"
" graceful shutdown initiated\n");
break;
+   case 2:
+   case 3:
+   pr_info("Restart request received -"
+   " graceful restart initiated\n");
+   icmsghdrp->status = HV_S_OK;
+
+   schedule_work(&restart_work);
+   break;
case 4:
case 5:
pr_info("Hibernation request received -"
-- 
1.8.3.1



[PATCH 0/3] Enhance hv_utils to support hibernation

2019-09-11 Thread Dexuan Cui
This patch is basically a pure Hyper-V specific change and it has a
build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
Hyper-V tree's hyperv-next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next

I request that this patch series go through Sasha's tree rather than the
char-misc tree.

Dexuan Cui (3):
  hv_utils: Add the support of hibernation
  hv_utils: Support host-initiated hibernation request
  hv_utils: Support host-initiated restart request

 drivers/hv/hv_fcopy.c |   9 +++-
 drivers/hv/hv_kvp.c   |  11 +++-
 drivers/hv/hv_snapshot.c  |  11 +++-
 drivers/hv/hv_util.c  | 124 +-
 drivers/hv/hyperv_vmbus.h |   3 ++
 include/linux/hyperv.h|   1 +
 6 files changed, 152 insertions(+), 7 deletions(-)

-- 
1.8.3.1



[PATCH 4/4] PCI: hv: Change pci_protocol_version to per-hbus

2019-09-11 Thread Dexuan Cui
A VM can have multiple hbus instances. It is incorrect for the second
hbus's hv_pci_protocol_negotiation() to overwrite the global variable
'pci_protocol_version' (which was already set by the first hbus), even if
the same value is written.

Signed-off-by: Dexuan Cui 
---
 drivers/pci/controller/pci-hyperv.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c 
b/drivers/pci/controller/pci-hyperv.c
index 2655df2..55730c5 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -76,11 +76,6 @@ enum pci_protocol_version_t {
PCI_PROTOCOL_VERSION_1_1,
 };
 
-/*
- * Protocol version negotiated by hv_pci_protocol_negotiation().
- */
-static enum pci_protocol_version_t pci_protocol_version;
-
 #define PCI_CONFIG_MMIO_LENGTH 0x2000
 #define CFG_PAGE_OFFSET 0x1000
 #define CFG_PAGE_SIZE (PCI_CONFIG_MMIO_LENGTH - CFG_PAGE_OFFSET)
@@ -429,6 +424,8 @@ enum hv_pcibus_state {
 
 struct hv_pcibus_device {
struct pci_sysdata sysdata;
+   /* Protocol version negotiated with the host */
+   enum pci_protocol_version_t protocol_version;
enum hv_pcibus_state state;
refcount_t remove_lock;
struct hv_device *hdev;
@@ -942,7 +939,7 @@ static void hv_irq_unmask(struct irq_data *data)
 * negative effect (yet?).
 */
 
-   if (pci_protocol_version >= PCI_PROTOCOL_VERSION_1_2) {
+   if (hbus->protocol_version >= PCI_PROTOCOL_VERSION_1_2) {
/*
 * PCI_PROTOCOL_VERSION_1_2 supports the VP_SET version of the
 * HVCALL_RETARGET_INTERRUPT hypercall, which also coincides
@@ -1112,7 +1109,7 @@ static void hv_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
ctxt.pci_pkt.completion_func = hv_pci_compose_compl;
ctxt.pci_pkt.compl_ctxt = &comp;
 
-   switch (pci_protocol_version) {
+   switch (hbus->protocol_version) {
case PCI_PROTOCOL_VERSION_1_1:
size = hv_compose_msi_req_v1(&ctxt.int_pkts.v1,
dest,
@@ -2116,6 +2113,7 @@ static int hv_pci_protocol_negotiation(struct hv_device 
*hdev,
   enum pci_protocol_version_t version[],
   int num_version)
 {
+   struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
struct pci_version_request *version_req;
struct hv_pci_compl comp_pkt;
struct pci_packet *pkt;
@@ -2155,10 +2153,10 @@ static int hv_pci_protocol_negotiation(struct hv_device 
*hdev,
}
 
if (comp_pkt.completion_status >= 0) {
-   pci_protocol_version = version[i];
+   hbus->protocol_version = version[i];
dev_info(&hdev->device,
"PCI VMBus probing: Using version %#x\n",
-   pci_protocol_version);
+   hbus->protocol_version);
goto exit;
}
 
@@ -2442,7 +2440,7 @@ static int hv_send_resources_allocated(struct hv_device 
*hdev)
u32 wslot;
int ret;
 
-   size_res = (pci_protocol_version < PCI_PROTOCOL_VERSION_1_2)
+   size_res = (hbus->protocol_version < PCI_PROTOCOL_VERSION_1_2)
? sizeof(*res_assigned) : sizeof(*res_assigned2);
 
pkt = kmalloc(sizeof(*pkt) + size_res, GFP_KERNEL);
@@ -2461,7 +2459,7 @@ static int hv_send_resources_allocated(struct hv_device 
*hdev)
pkt->completion_func = hv_pci_generic_compl;
pkt->compl_ctxt = &comp_pkt;
 
-   if (pci_protocol_version < PCI_PROTOCOL_VERSION_1_2) {
+   if (hbus->protocol_version < PCI_PROTOCOL_VERSION_1_2) {
res_assigned =
(struct pci_resources_assigned *)&pkt->message;
res_assigned->message_type.type =
@@ -2812,7 +2810,7 @@ static int hv_pci_resume(struct hv_device *hdev)
return ret;
 
/* Only use the version that was in use before hibernation. */
-   version[0] = pci_protocol_version;
+   version[0] = hbus->protocol_version;
ret = hv_pci_protocol_negotiation(hdev, version, 1);
if (ret)
goto out;
-- 
1.8.3.1



[PATCH 0/4] Enhance pci-hyperv to support hibernation

2019-09-11 Thread Dexuan Cui
This patchset is basically a pure Hyper-V specific change and it has a
build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
Hyper-V tree's hyperv-next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next

I request that this patch series go through Sasha's tree rather than the
pci tree.

Dexuan Cui (4):
  PCI: hv: Reorganize the code in preparation of hibernation
  PCI: hv: Add the support of hibernation
  PCI: hv: Do not queue new work items on hibernation
  PCI: hv: Change pci_protocol_version to per-hbus

 drivers/pci/controller/pci-hyperv.c | 166 ++--
 1 file changed, 140 insertions(+), 26 deletions(-)

-- 
1.8.3.1



[PATCH 1/4] PCI: hv: Reorganize the code in preparation of hibernation

2019-09-11 Thread Dexuan Cui
There is no functional change. This is just preparatory to a later
patch which adds the hibernation support for the pci-hyperv driver.

Signed-off-by: Dexuan Cui 
---
 drivers/pci/controller/pci-hyperv.c | 43 -
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c 
b/drivers/pci/controller/pci-hyperv.c
index 40b6254..03fa039 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2080,7 +2080,9 @@ static void hv_pci_onchannelcallback(void *context)
  * failing if the host doesn't support the necessary protocol
  * level.
  */
-static int hv_pci_protocol_negotiation(struct hv_device *hdev)
+static int hv_pci_protocol_negotiation(struct hv_device *hdev,
+  enum pci_protocol_version_t version[],
+  int num_version)
 {
struct pci_version_request *version_req;
struct hv_pci_compl comp_pkt;
@@ -2104,8 +2106,8 @@ static int hv_pci_protocol_negotiation(struct hv_device 
*hdev)
version_req = (struct pci_version_request *)&pkt->message;
version_req->message_type.type = PCI_QUERY_PROTOCOL_VERSION;
 
-   for (i = 0; i < ARRAY_SIZE(pci_protocol_versions); i++) {
-   version_req->protocol_version = pci_protocol_versions[i];
+   for (i = 0; i < num_version; i++) {
+   version_req->protocol_version = version[i];
ret = vmbus_sendpacket(hdev->channel, version_req,
sizeof(struct pci_version_request),
(unsigned long)pkt, VM_PKT_DATA_INBAND,
@@ -2121,7 +2123,7 @@ static int hv_pci_protocol_negotiation(struct hv_device 
*hdev)
}
 
if (comp_pkt.completion_status >= 0) {
-   pci_protocol_version = pci_protocol_versions[i];
+   pci_protocol_version = version[i];
dev_info(&hdev->device,
"PCI VMBus probing: Using version %#x\n",
pci_protocol_version);
@@ -2572,7 +2574,8 @@ static int hv_pci_probe(struct hv_device *hdev,
 
hv_set_drvdata(hdev, hbus);
 
-   ret = hv_pci_protocol_negotiation(hdev);
+   ret = hv_pci_protocol_negotiation(hdev, pci_protocol_versions,
+ ARRAY_SIZE(pci_protocol_versions));
if (ret)
goto close;
 
@@ -2644,7 +2647,7 @@ static int hv_pci_probe(struct hv_device *hdev,
return ret;
 }
 
-static void hv_pci_bus_exit(struct hv_device *hdev)
+static int hv_pci_bus_exit(struct hv_device *hdev, bool hibernating)
 {
struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
struct {
@@ -2660,16 +2663,20 @@ static void hv_pci_bus_exit(struct hv_device *hdev)
 * access the per-channel ringbuffer any longer.
 */
if (hdev->channel->rescind)
-   return;
+   return 0;
 
-   /* Delete any children which might still exist. */
-   memset(&relations, 0, sizeof(relations));
-   hv_pci_devices_present(hbus, &relations);
+   if (!hibernating) {
+   /* Delete any children which might still exist. */
+   memset(&relations, 0, sizeof(relations));
+   hv_pci_devices_present(hbus, &relations);
+   }
 
ret = hv_send_resources_released(hdev);
-   if (ret)
+   if (ret) {
dev_err(&hdev->device,
"Couldn't send resources released packet(s)\n");
+   return ret;
+   }
 
memset(&pkt.teardown_packet, 0, sizeof(pkt.teardown_packet));
init_completion(&comp_pkt.host_event);
@@ -2682,8 +2689,13 @@ static void hv_pci_bus_exit(struct hv_device *hdev)
   (unsigned long)&pkt.teardown_packet,
   VM_PKT_DATA_INBAND,
   VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
-   if (!ret)
-   wait_for_completion_timeout(&comp_pkt.host_event, 10 * HZ);
+   if (ret)
+   return ret;
+
+   if (wait_for_completion_timeout(&comp_pkt.host_event, 10 * HZ) == 0)
+   return -ETIMEDOUT;
+
+   return 0;
 }
 
 /**
@@ -2695,6 +2707,7 @@ static void hv_pci_bus_exit(struct hv_device *hdev)
 static int hv_pci_remove(struct hv_device *hdev)
 {
struct hv_pcibus_device *hbus;
+   int ret;
 
hbus = hv_get_drvdata(hdev);
if (hbus->state == hv_pcibus_installed) {
@@ -2707,7 +2720,7 @@ static int hv_pci_remove(struct hv_device *hdev)
hbus->state = hv_pcibus_removed;
}
 
-   hv_pci_bus_exit(hdev);
+   ret = hv_pci_bus_exit(hdev, false);
 
vmbus_close(hdev->channel);
 
@@ -2721,7 +2734,7 @@ static int hv_pci_remove(struct hv_device *hdev)
wait_for_completion(&hbus->remove_event);
destroy_workqueue(hbus->wq);
free_page((unsigned long)hbus);
-   return 0;
+   return ret;
 }
 
 static const 

[PATCH 3/4] PCI: hv: Do not queue new work items on hibernation

2019-09-11 Thread Dexuan Cui
We must make sure there are no pending work items before we call
vmbus_close().

Signed-off-by: Dexuan Cui 
---
 drivers/pci/controller/pci-hyperv.c | 33 ++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c 
b/drivers/pci/controller/pci-hyperv.c
index 3b77a3a..2655df2 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -422,6 +422,7 @@ enum hv_pcibus_state {
hv_pcibus_init = 0,
hv_pcibus_probed,
hv_pcibus_installed,
+   hv_pcibus_removing,
hv_pcibus_removed,
hv_pcibus_maximum
 };
@@ -1841,6 +1842,12 @@ static void hv_pci_devices_present(struct 
hv_pcibus_device *hbus,
unsigned long flags;
bool pending_dr;
 
+   if (hbus->state == hv_pcibus_removing) {
+   dev_info(&hbus->hdev->device,
+"PCI VMBus BUS_RELATIONS: ignored\n");
+   return;
+   }
+
dr_wrk = kzalloc(sizeof(*dr_wrk), GFP_NOWAIT);
if (!dr_wrk)
return;
@@ -1957,11 +1964,19 @@ static void hv_eject_device_work(struct work_struct 
*work)
  */
 static void hv_pci_eject_device(struct hv_pci_dev *hpdev)
 {
+   struct hv_pcibus_device *hbus = hpdev->hbus;
+   struct hv_device *hdev = hbus->hdev;
+
+   if (hbus->state == hv_pcibus_removing) {
+   dev_info(&hdev->device, "PCI VMBus EJECT: ignored\n");
+   return;
+   }
+
hpdev->state = hv_pcichild_ejecting;
get_pcichild(hpdev);
INIT_WORK(&hpdev->wrk, hv_eject_device_work);
-   get_hvpcibus(hpdev->hbus);
-   queue_work(hpdev->hbus->wq, &hpdev->wrk);
+   get_hvpcibus(hbus);
+   queue_work(hbus->wq, &hpdev->wrk);
 }
 
 /**
@@ -2757,9 +2772,21 @@ static int hv_pci_remove(struct hv_device *hdev)
 static int hv_pci_suspend(struct hv_device *hdev)
 {
struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
+   enum hv_pcibus_state old_state;
int ret;
 
-   /* XXX: Need to prevent any new work from being queued. */
+   tasklet_disable(&hdev->channel->callback_event);
+
+   /* Change the hbus state to prevent new work items. */
+   old_state = hbus->state;
+   if (hbus->state == hv_pcibus_installed)
+   hbus->state = hv_pcibus_removing;
+
+   tasklet_enable(&hdev->channel->callback_event);
+
+   if (old_state != hv_pcibus_installed)
+   return -EINVAL;
+
flush_workqueue(hbus->wq);
 
ret = hv_pci_bus_exit(hdev, true);
-- 
1.8.3.1



[PATCH 2/4] PCI: hv: Add the support of hibernation

2019-09-11 Thread Dexuan Cui
Implement the suspend/resume callbacks for hibernation.

hv_pci_suspend() needs to prevent any new work from being queued: a later
patch will address this issue.

Signed-off-by: Dexuan Cui 
---
 drivers/pci/controller/pci-hyperv.c | 76 +
 1 file changed, 76 insertions(+)

diff --git a/drivers/pci/controller/pci-hyperv.c 
b/drivers/pci/controller/pci-hyperv.c
index 03fa039..3b77a3a 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1398,6 +1398,23 @@ static void prepopulate_bars(struct hv_pcibus_device 
*hbus)
 
spin_lock_irqsave(&hbus->device_list_lock, flags);
 
+   /*
+* Clear the memory enable bit, in case it's already set. This occurs
+* in the suspend path of hibernation, where the device is suspended,
+* resumed and suspended again: see hibernation_snapshot() and
+* hibernation_platform_enter().
+*
+* If the memory enable bit is already set, Hyper-V silently ignores
+* the below BAR updates, and the related PCI device driver can not
+* work, because reading from the device register(s) always returns
+* 0xFFFFFFFF.
+*/
+   list_for_each_entry(hpdev, &hbus->children, list_entry) {
+   _hv_pcifront_read_config(hpdev, PCI_COMMAND, 2, &command);
+   command &= ~PCI_COMMAND_MEMORY;
+   _hv_pcifront_write_config(hpdev, PCI_COMMAND, 2, command);
+   }
+
/* Pick addresses for the BARs. */
do {
list_for_each_entry(hpdev, &hbus->children, list_entry) {
@@ -2737,6 +2754,63 @@ static int hv_pci_remove(struct hv_device *hdev)
return ret;
 }
 
+static int hv_pci_suspend(struct hv_device *hdev)
+{
+   struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
+   int ret;
+
+   /* XXX: Need to prevent any new work from being queued. */
+   flush_workqueue(hbus->wq);
+
+   ret = hv_pci_bus_exit(hdev, true);
+   if (ret)
+   return ret;
+
+   vmbus_close(hdev->channel);
+
+   return 0;
+}
+
+static int hv_pci_resume(struct hv_device *hdev)
+{
+   struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
+   enum pci_protocol_version_t version[1];
+   int ret;
+
+   hbus->state = hv_pcibus_init;
+
+   ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
+hv_pci_onchannelcallback, hbus);
+   if (ret)
+   return ret;
+
+   /* Only use the version that was in use before hibernation. */
+   version[0] = pci_protocol_version;
+   ret = hv_pci_protocol_negotiation(hdev, version, 1);
+   if (ret)
+   goto out;
+
+   ret = hv_pci_query_relations(hdev);
+   if (ret)
+   goto out;
+
+   ret = hv_pci_enter_d0(hdev);
+   if (ret)
+   goto out;
+
+   ret = hv_send_resources_allocated(hdev);
+   if (ret)
+   goto out;
+
+   prepopulate_bars(hbus);
+
+   hbus->state = hv_pcibus_installed;
+   return 0;
+out:
+   vmbus_close(hdev->channel);
+   return ret;
+}
+
 static const struct hv_vmbus_device_id hv_pci_id_table[] = {
/* PCI Pass-through Class ID */
/* 44C4F61D--4400-9D52-802E27EDE19F */
@@ -2751,6 +2825,8 @@ static int hv_pci_remove(struct hv_device *hdev)
.id_table   = hv_pci_id_table,
.probe  = hv_pci_probe,
.remove = hv_pci_remove,
+   .suspend= hv_pci_suspend,
+   .resume = hv_pci_resume,
 };
 
 static void __exit exit_hv_pci_drv(void)
-- 
1.8.3.1



[PATCH][PATCH net-next] hv_sock: Add the support of hibernation

2019-09-11 Thread Dexuan Cui
Add the necessary dummy callbacks for hibernation.

Signed-off-by: Dexuan Cui 
---
This patch is basically a pure Hyper-V specific change and it has a
build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
Hyper-V tree's hyperv-next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next

I request that this patch go through Sasha's tree rather than the
net-next tree.

 net/vmw_vsock/hyperv_transport.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index f2084e3..e91a884 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -930,6 +930,24 @@ static int hvs_remove(struct hv_device *hdev)
return 0;
 }
 
+/* hv_sock connections can not persist across hibernation, and all the hv_sock
+ * channels are forced to be rescinded before hibernation: see
+ * vmbus_bus_suspend(). Here the dummy hvs_suspend() and hvs_resume()
+ * are only needed because hibernation requires that every device's driver
+ * should have a .suspend and .resume callback: see vmbus_suspend().
+ */
+static int hvs_suspend(struct hv_device *hv_dev)
+{
+   /* Dummy */
+   return 0;
+}
+
+static int hvs_resume(struct hv_device *dev)
+{
+   /* Dummy */
+   return 0;
+}
+
 /* This isn't really used. See vmbus_match() and vmbus_probe() */
 static const struct hv_vmbus_device_id id_table[] = {
{},
@@ -941,6 +959,8 @@ static int hvs_remove(struct hv_device *hdev)
.id_table   = id_table,
.probe  = hvs_probe,
.remove = hvs_remove,
+   .suspend= hvs_suspend,
+   .resume = hvs_resume,
 };
 
 static int __init hvs_init(void)
-- 
1.8.3.1



[PATCH][PATCH net-next] hv_netvsc: Add the support of hibernation

2019-09-11 Thread Dexuan Cui
The existing netvsc_detach() and netvsc_attach() APIs make it easy to
implement the suspend/resume callbacks.

Signed-off-by: Dexuan Cui 
---

This patch is basically a pure Hyper-V specific change and it has a
build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
Hyper-V tree's hyperv-next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next

I request that this patch go through Sasha's tree rather than the
net-next tree.

 drivers/net/hyperv/hyperv_net.h |  3 +++
 drivers/net/hyperv/netvsc_drv.c | 59 +
 2 files changed, 62 insertions(+)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index ecc9af0..b8763ee 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -952,6 +952,9 @@ struct net_device_context {
u32 vf_alloc;
/* Serial number of the VF to team with */
u32 vf_serial;
+
+   /* Used to temporarily save the config info across hibernation */
+   struct netvsc_device_info *saved_netvsc_dev_info;
 };
 
 /* Per channel data */
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index afdcc56..f920959 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2392,6 +2392,63 @@ static int netvsc_remove(struct hv_device *dev)
return 0;
 }
 
+static int netvsc_suspend(struct hv_device *dev)
+{
+   struct net_device_context *ndev_ctx;
+   struct net_device *vf_netdev, *net;
+   struct netvsc_device *nvdev;
+   int ret;
+
+   net = hv_get_drvdata(dev);
+
+   ndev_ctx = netdev_priv(net);
+   cancel_delayed_work_sync(&ndev_ctx->dwork);
+
+   rtnl_lock();
+
+   nvdev = rtnl_dereference(ndev_ctx->nvdev);
+   if (nvdev == NULL) {
+   ret = -ENODEV;
+   goto out;
+   }
+
+   cancel_work_sync(&nvdev->subchan_work);
+
+   vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
+   if (vf_netdev)
+   netvsc_unregister_vf(vf_netdev);
+
+   /* Save the current config info */
+   ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev);
+
+   ret = netvsc_detach(net, nvdev);
+out:
+   rtnl_unlock();
+
+   return ret;
+}
+
+static int netvsc_resume(struct hv_device *dev)
+{
+   struct net_device *net = hv_get_drvdata(dev);
+   struct net_device_context *net_device_ctx;
+   struct netvsc_device_info *device_info;
+   int ret;
+
+   rtnl_lock();
+
+   net_device_ctx = netdev_priv(net);
+   device_info = net_device_ctx->saved_netvsc_dev_info;
+
+   ret = netvsc_attach(net, device_info);
+
+   rtnl_unlock();
+
+   kfree(device_info);
+   net_device_ctx->saved_netvsc_dev_info = NULL;
+
+   return ret;
+}
 static const struct hv_vmbus_device_id id_table[] = {
/* Network guid */
{ HV_NIC_GUID, },
@@ -2406,6 +2463,8 @@ static int netvsc_remove(struct hv_device *dev)
.id_table = id_table,
.probe = netvsc_probe,
.remove = netvsc_remove,
+   .suspend = netvsc_suspend,
+   .resume = netvsc_resume,
.driver = {
.probe_type = PROBE_FORCE_SYNCHRONOUS,
},
-- 
1.8.3.1



[PATCH] hv_balloon: Add the support of hibernation

2019-09-11 Thread Dexuan Cui
When hibernation is enabled, we must ignore the balloon up/down and
hot-add requests from the host, if any.

For now, if people want to test hibernation, please blacklist hv_balloon
or do not enable Dynamic Memory and Memory Resizing. See the comment in
balloon_probe() for more info.

Signed-off-by: Dexuan Cui 
---

This patch is basically a pure Hyper-V specific change and it has a
build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
Hyper-V tree's hyperv-next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next

I request that this patch go through Sasha's tree rather than the
other tree(s).

 drivers/hv/hv_balloon.c | 101 +++-
 1 file changed, 99 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 34bd735..7df0f67 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -24,6 +24,8 @@
 
 #include 
 
+#include 
+
 #define CREATE_TRACE_POINTS
 #include "hv_trace_balloon.h"
 
@@ -457,6 +459,7 @@ struct hot_add_wrk {
struct work_struct wrk;
 };
 
+static bool allow_hibernation;
 static bool hot_add = true;
 static bool do_hot_add;
 /*
@@ -1053,8 +1056,12 @@ static void hot_add_req(struct work_struct *dummy)
else
resp.result = 0;
 
-   if (!do_hot_add || (resp.page_count == 0))
-   pr_err("Memory hot add failed\n");
+   if (!do_hot_add || resp.page_count == 0) {
+   if (!allow_hibernation)
+   pr_err("Memory hot add failed\n");
+   else
+   pr_info("Ignore hot-add request!\n");
+   }
 
dm->state = DM_INITIALIZED;
resp.hdr.trans_id = atomic_inc_return(&trans_id);
@@ -1509,6 +1516,11 @@ static void balloon_onchannelcallback(void *context)
break;
 
case DM_BALLOON_REQUEST:
+   if (allow_hibernation) {
+   pr_info("Ignore balloon-up request!\n");
+   break;
+   }
+
if (dm->state == DM_BALLOON_UP)
pr_warn("Currently ballooning\n");
bal_msg = (struct dm_balloon *)recv_buffer;
@@ -1518,6 +1530,11 @@ static void balloon_onchannelcallback(void *context)
break;
 
case DM_UNBALLOON_REQUEST:
+   if (allow_hibernation) {
+   pr_info("Ignore balloon-down request!\n");
+   break;
+   }
+
dm->state = DM_BALLOON_DOWN;
balloon_down(dm,
 (struct dm_unballoon_request *)recv_buffer);
@@ -1623,6 +1640,11 @@ static int balloon_connect_vsp(struct hv_device *dev)
cap_msg.hdr.size = sizeof(struct dm_capabilities);
cap_msg.hdr.trans_id = atomic_inc_return(&trans_id);
 
+   /*
+* When hibernation (i.e. virtual ACPI S4 state) is enabled, the host
+* currently still requires the bits to be set, so we have to add code
+* to fail the host's hot-add and balloon up/down requests, if any.
+*/
cap_msg.caps.cap_bits.balloon = 1;
cap_msg.caps.cap_bits.hot_add = 1;
 
@@ -1672,6 +1694,24 @@ static int balloon_probe(struct hv_device *dev,
 {
int ret;
 
+#if 0
+   /*
+* The patch to implement hv_is_hibernation_supported() is going
+* through the tip tree. For now, let's hardcode allow_hibernation
+* to false to keep the current behavior of hv_balloon. If people
+* want to test hibernation, please blacklist hv_balloon for now
+* or do not enable Dynamic Memory and Memory Resizing.
+*
+* We'll remove the conditional compilation as soon as
+* hv_is_hibernation_supported() is available in the mainline tree.
+*/
+   allow_hibernation = hv_is_hibernation_supported();
+#else
+   allow_hibernation = false;
+#endif
+   if (allow_hibernation)
+   hot_add = false;
+
 #ifdef CONFIG_MEMORY_HOTPLUG
do_hot_add = hot_add;
 #else
@@ -1711,6 +1751,8 @@ static int balloon_probe(struct hv_device *dev,
return 0;
 
 probe_error:
+   dm_device.state = DM_INIT_ERROR;
+   dm_device.thread  = NULL;
vmbus_close(dev->channel);
 #ifdef CONFIG_MEMORY_HOTPLUG
unregister_memory_notifier(&hv_memory_nb);
@@ -1752,6 +1794,59 @@ static int balloon_remove(struct hv_device *dev)
return 0;
 }
 
+static int balloon_suspend(struct hv_device *hv_dev)
+{
+   struct hv_dynmem_device *dm = hv_get_drvdata(hv_dev);
+
+   tasklet_disable(&hv_dev->channel->callback_event);
+
+   cancel_work_sync(&dm->balloon_wrk.wrk);
+   cancel_work_sync(&dm->ha_wrk.wrk);
+
+   if (dm->thread) {
+ 

[PATCH] Input: hyperv-keyboard: Add the support of hibernation

2019-09-11 Thread Dexuan Cui
We need hv_kbd_pm_notify() to make sure the pm_wakeup_hard_event() call
does not prevent the system from entering hibernation: hibernation is a
relatively long process, and it can be aborted by pm_wakeup_hard_event(),
which is invoked upon keyboard events.

Signed-off-by: Dexuan Cui 
---

This patch is basically a pure Hyper-V specific change and it has a
build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
Hyper-V tree's hyperv-next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next

I request that this patch go through Sasha's tree rather than the
input subsystem's tree.

Hi Dmitry, can you please Ack?

 drivers/input/serio/hyperv-keyboard.c | 68 ---
 1 file changed, 63 insertions(+), 5 deletions(-)

diff --git a/drivers/input/serio/hyperv-keyboard.c 
b/drivers/input/serio/hyperv-keyboard.c
index 88ae7c2..277dc4c 100644
--- a/drivers/input/serio/hyperv-keyboard.c
+++ b/drivers/input/serio/hyperv-keyboard.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Current version 1.0
@@ -95,6 +96,9 @@ struct hv_kbd_dev {
struct completion wait_event;
spinlock_t lock; /* protects 'started' field */
bool started;
+
+   struct notifier_block pm_nb;
+   bool hibernation_in_progress;
 };
 
 static void hv_kbd_on_receive(struct hv_device *hv_dev,
@@ -168,7 +172,7 @@ static void hv_kbd_on_receive(struct hv_device *hv_dev,
 * "echo freeze > /sys/power/state" can't really enter the
 * state because the Enter-UP can trigger a wakeup at once.
 */
-   if (!(info & IS_BREAK))
+   if (!(info & IS_BREAK) && !kbd_dev->hibernation_in_progress)
pm_wakeup_hard_event(&hv_dev->device);
 
break;
@@ -179,10 +183,10 @@ static void hv_kbd_on_receive(struct hv_device *hv_dev,
}
 }
 
-static void hv_kbd_handle_received_packet(struct hv_device *hv_dev,
- struct vmpacket_descriptor *desc,
- u32 bytes_recvd,
- u64 req_id)
+static void
+hv_kbd_handle_received_packet(struct hv_device *hv_dev,
+ const struct vmpacket_descriptor *desc,
+ u32 bytes_recvd, u64 req_id)
 {
struct synth_kbd_msg *msg;
u32 msg_sz;
@@ -282,6 +286,8 @@ static int hv_kbd_connect_to_vsp(struct hv_device *hv_dev)
u32 proto_status;
int error;
 
+   reinit_completion(&kbd_dev->wait_event);
+
request = &kbd_dev->protocol_req;
memset(request, 0, sizeof(struct synth_kbd_protocol_request));
request->header.type = __cpu_to_le32(SYNTH_KBD_PROTOCOL_REQUEST);
@@ -332,6 +338,29 @@ static void hv_kbd_stop(struct serio *serio)
spin_unlock_irqrestore(&kbd_dev->lock, flags);
 }
 
+static int hv_kbd_pm_notify(struct notifier_block *nb,
+   unsigned long val, void *ign)
+{
+   struct hv_kbd_dev *kbd_dev;
+
+   kbd_dev = container_of(nb, struct hv_kbd_dev, pm_nb);
+
+   switch (val) {
+   case PM_HIBERNATION_PREPARE:
+   case PM_RESTORE_PREPARE:
+   kbd_dev->hibernation_in_progress = true;
+   return NOTIFY_OK;
+
+   case PM_POST_HIBERNATION:
+   case PM_POST_RESTORE:
+   kbd_dev->hibernation_in_progress = false;
+   return NOTIFY_OK;
+
+   default:
+   return NOTIFY_DONE;
+   }
+}
+
 static int hv_kbd_probe(struct hv_device *hv_dev,
const struct hv_vmbus_device_id *dev_id)
 {
@@ -380,6 +409,9 @@ static int hv_kbd_probe(struct hv_device *hv_dev,
 
device_init_wakeup(&hv_dev->device, true);

+   kbd_dev->pm_nb.notifier_call = hv_kbd_pm_notify;
+   register_pm_notifier(&kbd_dev->pm_nb);
+
return 0;
 
 err_close_vmbus:
@@ -394,6 +426,7 @@ static int hv_kbd_remove(struct hv_device *hv_dev)
 {
struct hv_kbd_dev *kbd_dev = hv_get_drvdata(hv_dev);
 
+   unregister_pm_notifier(&kbd_dev->pm_nb);
serio_unregister_port(kbd_dev->hv_serio);
vmbus_close(hv_dev->channel);
kfree(kbd_dev);
@@ -403,6 +436,29 @@ static int hv_kbd_remove(struct hv_device *hv_dev)
return 0;
 }
 
+static int hv_kbd_suspend(struct hv_device *hv_dev)
+{
+   vmbus_close(hv_dev->channel);
+
+   return 0;
+}
+
+static int hv_kbd_resume(struct hv_device *hv_dev)
+{
+   int ret;
+
+   ret = vmbus_open(hv_dev->channel,
+KBD_VSC_SEND_RING_BUFFER_SIZE,
+KBD_VSC_RECV_RING_BUFFER_SIZE,
+NULL, 0,
+hv_kbd_on_channel_callback,
+hv_dev);
+   if (ret == 0)
+   ret = hv_kbd_connect_to_vsp(hv_dev);
+
+   return ret;
+}

[PATCH] HID: hyperv: Add the support of hibernation

2019-09-11 Thread Dexuan Cui
We need mousevsc_pm_notify() to make sure the pm_wakeup_hard_event() call
does not prevent the system from entering hibernation: hibernation is a
relatively long process, and it can be aborted by pm_wakeup_hard_event(),
which is invoked upon mouse events.

Signed-off-by: Dexuan Cui 
---

This patch is basically a pure Hyper-V specific change and it has a
build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
Hyper-V tree's hyperv-next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next

I request that this patch go through Sasha's tree rather than the
input subsystem's tree.

Hi Jiri, Benjamin, can you please Ack?

 drivers/hid/hid-hyperv.c | 71 ++--
 1 file changed, 69 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c
index cc5b09b8..e798740 100644
--- a/drivers/hid/hid-hyperv.c
+++ b/drivers/hid/hid-hyperv.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 struct hv_input_dev_info {
@@ -150,6 +151,9 @@ struct mousevsc_dev {
struct hv_input_dev_info hid_dev_info;
struct hid_device   *hid_device;
u8  input_buf[HID_MAX_BUFFER_SIZE];
+
+   struct notifier_block   pm_nb;
+   boolhibernation_in_progress;
 };
 
 
@@ -192,6 +196,9 @@ static void mousevsc_on_receive_device_info(struct 
mousevsc_dev *input_device,
if (desc->bLength == 0)
goto cleanup;
 
+   /* The pointer is not NULL when we resume from hibernation */
+   if (input_device->hid_desc != NULL)
+   kfree(input_device->hid_desc);
input_device->hid_desc = kmemdup(desc, desc->bLength, GFP_ATOMIC);
 
if (!input_device->hid_desc)
@@ -203,6 +210,9 @@ static void mousevsc_on_receive_device_info(struct 
mousevsc_dev *input_device,
goto cleanup;
}
 
+   /* The pointer is not NULL when we resume from hibernation */
+   if (input_device->report_desc != NULL)
+   kfree(input_device->report_desc);
input_device->report_desc = kzalloc(input_device->report_desc_size,
  GFP_ATOMIC);
 
@@ -243,7 +253,7 @@ static void mousevsc_on_receive_device_info(struct 
mousevsc_dev *input_device,
 }
 
 static void mousevsc_on_receive(struct hv_device *device,
-   struct vmpacket_descriptor *packet)
+   const struct vmpacket_descriptor *packet)
 {
struct pipe_prt_msg *pipe_msg;
struct synthhid_msg *hid_msg;
@@ -301,7 +311,8 @@ static void mousevsc_on_receive(struct hv_device *device,
hid_input_report(input_dev->hid_device, HID_INPUT_REPORT,
 input_dev->input_buf, len, 1);
 
-   pm_wakeup_hard_event(&input_dev->device->device);
+   if (!input_dev->hibernation_in_progress)
+   pm_wakeup_hard_event(&input_dev->device->device);
 
break;
default:
@@ -378,6 +389,8 @@ static int mousevsc_connect_to_vsp(struct hv_device *device)
struct mousevsc_prt_msg *request;
struct mousevsc_prt_msg *response;
 
+   reinit_completion(&input_dev->wait_event);
+
request = &input_dev->protocol_req;
memset(request, 0, sizeof(struct mousevsc_prt_msg));
 
@@ -475,6 +488,29 @@ static int mousevsc_hid_raw_request(struct hid_device *hid,
 
 static struct hid_driver mousevsc_hid_driver;
 
+static int mousevsc_pm_notify(struct notifier_block *nb,
+ unsigned long val, void *ign)
+{
+   struct mousevsc_dev *input_dev;
+
+   input_dev = container_of(nb, struct mousevsc_dev, pm_nb);
+
+   switch (val) {
+   case PM_HIBERNATION_PREPARE:
+   case PM_RESTORE_PREPARE:
+   input_dev->hibernation_in_progress = true;
+   return NOTIFY_OK;
+
+   case PM_POST_HIBERNATION:
+   case PM_POST_RESTORE:
+   input_dev->hibernation_in_progress = false;
+   return NOTIFY_OK;
+
+   default:
+   return NOTIFY_DONE;
+   }
+}
+
 static int mousevsc_probe(struct hv_device *device,
const struct hv_vmbus_device_id *dev_id)
 {
@@ -549,6 +585,9 @@ static int mousevsc_probe(struct hv_device *device,
input_dev->connected = true;
input_dev->init_complete = true;
 
+   input_dev->pm_nb.notifier_call = mousevsc_pm_notify;
+   register_pm_notifier(&input_dev->pm_nb);
+
return ret;
 
 probe_err2:
@@ -568,6 +607,8 @@ static int mousevsc_remove(struct hv_device *dev)
 {
struct mousevsc_dev *input_dev = hv_get_drvdata(dev);
 
+   unregister_pm_notifier(&input_dev->pm_nb);
+
device_init_wakeup(&dev->device, false);
vmbus_close(dev->channel);
hid_hw_stop(input_dev->hid_device);
@@ -577,6 

[PATCH] scsi: storvsc: Add the support of hibernation

2019-09-11 Thread Dexuan Cui
When we're in storvsc_suspend(), we're sure the SCSI layer has already
quiesced the SCSI device via scsi_bus_suspend() -> ... -> scsi_device_quiesce(),
so the low-level SCSI adapter driver only needs to suspend/resume its own state.

Signed-off-by: Dexuan Cui 
---

This patch is basically a pure Hyper-V specific change and it has a
build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
Hyper-V tree's hyperv-next branch:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next

I request that this patch go through Sasha's tree rather than the
SCSI tree.

 drivers/scsi/storvsc_drv.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index ed8b9ac..9fbf604 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1727,6 +1727,13 @@ enum {
 
 MODULE_DEVICE_TABLE(vmbus, id_table);
 
+static const struct { guid_t guid; } fc_guid = { HV_SYNTHFC_GUID };
+
+static bool hv_dev_is_fc(struct hv_device *hv_dev)
+{
return guid_equal(&fc_guid.guid, &hv_dev->dev_type);
+}
+
 static int storvsc_probe(struct hv_device *device,
const struct hv_vmbus_device_id *dev_id)
 {
@@ -1935,11 +1942,45 @@ static int storvsc_remove(struct hv_device *dev)
return 0;
 }
 
+static int storvsc_suspend(struct hv_device *hv_dev)
+{
+   struct storvsc_device *stor_device = hv_get_drvdata(hv_dev);
+   struct Scsi_Host *host = stor_device->host;
+   struct hv_host_device *host_dev = shost_priv(host);
+
+   storvsc_wait_to_drain(stor_device);
+
+   drain_workqueue(host_dev->handle_error_wq);
+
+   vmbus_close(hv_dev->channel);
+
+   memset(stor_device->stor_chns, 0,
+  num_possible_cpus() * sizeof(void *));
+
+   kfree(stor_device->stor_chns);
+   stor_device->stor_chns = NULL;
+
+   cpumask_clear(&stor_device->alloced_cpus);
+
+   return 0;
+}
+
+static int storvsc_resume(struct hv_device *hv_dev)
+{
+   int ret;
+
+   ret = storvsc_connect_to_vsp(hv_dev, storvsc_ringbuffer_size,
+hv_dev_is_fc(hv_dev));
+   return ret;
+}
+
 static struct hv_driver storvsc_drv = {
.name = KBUILD_MODNAME,
.id_table = id_table,
.probe = storvsc_probe,
.remove = storvsc_remove,
+   .suspend = storvsc_suspend,
+   .resume = storvsc_resume,
.driver = {
.probe_type = PROBE_PREFER_ASYNCHRONOUS,
},
-- 
1.8.3.1



Re: [PATCH] gpio: remove explicit comparison with 0

2019-09-11 Thread Linus Walleij
On Sat, Sep 7, 2019 at 6:39 PM Saiyam Doshi  wrote:

> No need to compare return value with 0. In case of non-zero
> return value, the if condition will be true.
>
> This makes intent a bit more clear to the reader.
> "if (x) then", compared to "if (x is not zero) then".
>
> Signed-off-by: Saiyam Doshi 
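
For illustration, the kind of change this covers (hypothetical snippet, not
a specific hunk from the patch):

	/* before: explicit comparison with zero */
	ret = gpiod_direction_output(desc, 1);
	if (ret != 0)
		return ret;

	/* after: equivalent, the non-zero check is implied */
	ret = gpiod_direction_output(desc, 1);
	if (ret)
		return ret;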

Patch applied.

Yours,
Linus Walleij


[PATCH] module: Remove leftover '#undef' from export header

2019-09-11 Thread Will Deacon
Commit 7290d5809571 ("module: use relative references for __ksymtab
entries") converted the '__put' #define into an assembly macro in
asm-generic/export.h but forgot to remove the corresponding '#undef'.

Remove the leftover '#undef'.

Cc: Ard Biesheuvel 
Cc: Jessica Yu 
Signed-off-by: Will Deacon 
---

I spotted this trivial issue when debugging the namespace issue earlier
today, so here's the patch.

 include/asm-generic/export.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index 294d6ae785d4..153d9c2ee580 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -57,7 +57,6 @@ __kcrctab_\name:
 #endif
 #endif
 .endm
-#undef __put
 
 #if defined(CONFIG_TRIM_UNUSED_KSYMS)
 
-- 
2.23.0.237.gc6a4ce50a0-goog



Re: [PATCH net-next] net: stmmac: pci: Add HAPS support using GMAC5

2019-09-11 Thread David Miller
From: Jose Abreu 
Date: Mon,  9 Sep 2019 18:54:26 +0200

> Add the support for Synopsys HAPS board that uses GMAC5.
> 
> Signed-off-by: Jose Abreu 

Applied.


Re: [PATCH v4 2/2] net: phy: dp83867: Add SGMII mode type switching

2019-09-11 Thread David Miller
From: Vitaly Gaiduk 
Date: Mon,  9 Sep 2019 20:19:24 +0300

> This patch adds the ability to switch between two PHY SGMII modes.
> Some hardware, for example, FPGA IP designs may use 6-wire mode
> which enables differential SGMII clock to MAC.
> 
> Signed-off-by: Vitaly Gaiduk 

Applied.


  1   2   3   4   5   6   7   8   >