date:20190814

[PATCH] i2c: stm32f7: Make structure stm32f7_i2c_algo constant

2019-08-14 Thread Nishka Dasgupta

Static structure stm32f7_i2c_algo, of type i2c_algorithm, is used only
when it is assigned to constant field algo of a variable having type
i2c_adapter. As stm32f7_i2c_algo is therefore never modified, make it
const as well to protect it from unintended modification.
Issue found with Coccinelle.

Signed-off-by: Nishka Dasgupta 
---
 drivers/i2c/busses/i2c-stm32f7.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-stm32f7.c b/drivers/i2c/busses/i2c-stm32f7.c
index 266d1c269b83..d36cf08461f7 100644
--- a/drivers/i2c/busses/i2c-stm32f7.c
+++ b/drivers/i2c/busses/i2c-stm32f7.c
@@ -1809,7 +1809,7 @@ static u32 stm32f7_i2c_func(struct i2c_adapter *adap)
I2C_FUNC_SMBUS_I2C_BLOCK;
 }
 
-static struct i2c_algorithm stm32f7_i2c_algo = {
+static const struct i2c_algorithm stm32f7_i2c_algo = {
.master_xfer = stm32f7_i2c_xfer,
.smbus_xfer = stm32f7_i2c_smbus_xfer,
.functionality = stm32f7_i2c_func,
-- 
2.19.1
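
The reasoning in the commit message above (an ops table that is only ever read through a pointer-to-const field can itself be const) can be sketched in plain C. This is a user-space illustration only: the demo_* names are hypothetical stand-ins for i2c_algorithm and i2c_adapter, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct i2c_algorithm: a table of callbacks. */
struct demo_algorithm {
	int (*xfer)(int nmsgs);
	unsigned int (*functionality)(void);
};

/* Hypothetical stand-in for struct i2c_adapter. */
struct demo_adapter {
	/* Pointer-to-const: users may call through it but never write. */
	const struct demo_algorithm *algo;
};

static int demo_xfer(int nmsgs)
{
	return nmsgs; /* pretend every message was transferred */
}

static unsigned int demo_func(void)
{
	return 0x1u; /* arbitrary capability bit for the demo */
}

/* Because nothing writes to the table, it can be declared const. */
static const struct demo_algorithm demo_algo = {
	.xfer = demo_xfer,
	.functionality = demo_func,
};

/* Build an adapter whose const field points at the const table. */
struct demo_adapter demo_make_adapter(void)
{
	struct demo_adapter adap = { .algo = &demo_algo };

	return adap;
}

/* Call through the table, as a bus core would. */
int demo_adapter_xfer(const struct demo_adapter *adap, int nmsgs)
{
	assert(adap != NULL && adap->algo != NULL);
	return adap->algo->xfer(nmsgs);
}
```

With the const qualifier in place, an accidental write such as demo_algo.xfer = NULL is rejected at compile time instead of silently changing behaviour at run time.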

Re: [RFC PATCH 1/5] x86: tsc: add tsc to art helpers

2019-08-14 Thread Felipe Balbi



Hi,


Thomas Gleixner  writes:

> Felipe,
>
> On Tue, 16 Jul 2019, Felipe Balbi wrote:
>
> -ENOCHANGELOG
>
> As you said in the cover letter:
>
>>  (3) The change in arch/x86/kernel/tsc.c needs to be reviewed at length
>>  before going in.
>
> So some information what those interfaces are used for and why they are
> needed would be really helpful.

Okay, I have some more details about this. The TGPIO device itself uses
ART since TSC is not directly available to anything other than the
CPU. The 'problem' here is that reading ART incurs extra latency which
we would like to avoid. Therefore, we use TSC and scale it to
nanoseconds, which would be the same as ART to ns.

>> +void get_tsc_ns(struct system_counterval_t *tsc_counterval, u64 *tsc_ns)
>> +{
>> +u64 tmp, res, rem;
>> +u64 cycles;
>> +
>> +tsc_counterval->cycles = clocksource_tsc.read(NULL);
>> +cycles = tsc_counterval->cycles;
>> +tsc_counterval->cs = art_related_clocksource;
>> +
>> +rem = do_div(cycles, tsc_khz);
>> +
>> +res = cycles * USEC_PER_SEC;
>> +tmp = rem * USEC_PER_SEC;
>> +
>> +do_div(tmp, tsc_khz);
>> +res += tmp;
>> +
>> +*tsc_ns = res;
>> +}
>> +EXPORT_SYMBOL(get_tsc_ns);
>> +
>> +u64 get_art_ns_now(void)
>> +{
>> +struct system_counterval_t tsc_cycles;
>> +u64 tsc_ns;
>> +
>> +get_tsc_ns(&tsc_cycles, &tsc_ns);
>> +
>> +return tsc_ns;
>> +}
>> +EXPORT_SYMBOL(get_art_ns_now);
>
> While the changes look innocuous I'm missing the big picture why this needs
> to emulate ART instead of simply using TSC directly.

I don't think we're emulating ART here (other than the name in the
function). We're just reading TSC and converting to nanoseconds, right?

Cheers

-- 
balbi
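
The conversion in the quoted get_tsc_ns() can be sketched in user space as follows. It is a minimal illustration that assumes the frequency is given in kHz, as tsc_khz is; demo_cycles_to_ns() and DEMO_USEC_PER_SEC are hypothetical names, and plain division stands in for do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Same numeric value as the kernel's USEC_PER_SEC. */
#define DEMO_USEC_PER_SEC 1000000ULL

/*
 * Convert a cycle count to nanoseconds for a counter running at khz.
 * khz is cycles per millisecond, and one millisecond is 1,000,000 ns,
 * so ns = cycles * 1000000 / khz. Splitting cycles into quotient and
 * remainder first keeps the intermediate products small, as the quoted
 * patch does with do_div().
 */
uint64_t demo_cycles_to_ns(uint64_t cycles, uint32_t khz)
{
	uint64_t quot, rem;

	assert(khz != 0);
	quot = cycles / khz; /* whole milliseconds */
	rem = cycles % khz;  /* leftover cycles, less than one millisecond */

	return quot * DEMO_USEC_PER_SEC + (rem * DEMO_USEC_PER_SEC) / khz;
}
```

For a 3 GHz counter (khz = 3000000), 3000000 cycles come out as 1000000 ns and 3 cycles as 1 ns.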

[PATCH] Bluetooth: 6lowpan: Make variable header_ops constant

2019-08-14 Thread Nishka Dasgupta

Static variable header_ops, of type header_ops, is used only once, when
it is assigned to field header_ops of a variable having type net_device.
This corresponding field is declared as const in the definition of
net_device. Hence make header_ops constant as well to protect it from
unnecessary modification.
Issue found with Coccinelle.

Signed-off-by: Nishka Dasgupta 
---
 net/bluetooth/6lowpan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index 9d41de1ec90f..bb55d92691b0 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -583,7 +583,7 @@ static const struct net_device_ops netdev_ops = {
.ndo_start_xmit = bt_xmit,
 };
 
-static struct header_ops header_ops = {
+static const struct header_ops header_ops = {
.create = header_create,
 };
 
-- 
2.19.1

[PATCH] Bluetooth: hci_qca: Make structure qca_proto constant

2019-08-14 Thread Nishka Dasgupta

Static structure qca_proto, of type hci_uart_proto, is used four times:
as the last argument in function hci_uart_register_device(), and as the
only argument to functions hci_uart_register_proto() and
hci_uart_unregister_proto(). In all three of these functions, the
parameter corresponding to qca_proto is declared as constant. Therefore,
make qca_proto itself constant as well in order to protect it from
unintended modification.
Issue found with Coccinelle.

Signed-off-by: Nishka Dasgupta 
---
 drivers/bluetooth/hci_qca.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 82a0a3691a63..80923fc9418f 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1326,7 +1326,7 @@ static int qca_setup(struct hci_uart *hu)
return ret;
 }
 
-static struct hci_uart_proto qca_proto = {
+static const struct hci_uart_proto qca_proto = {
.id = HCI_UART_QCA,
.name   = "QCA",
.manufacturer   = 29,
-- 
2.19.1

Re: [PATCH] lsilogic mpt fusion: mptctl: Fixed race condition around mptctl_id variable using mutexes

2019-08-14 Thread kbuild test robot

Hi Mark,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[cannot apply to v5.3-rc4 next-20190814]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Mark-Balantzyan/lsilogic-mpt-fusion-mptctl-Fixed-race-condition-around-mptctl_id-variable-using-mutexes/20190815-115822
config: x86_64-lkp (attached as .config)
compiler: gcc-7 (Debian 7.4.0-10) 7.4.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All warnings (new ones prefixed by >>):

   drivers/message/fusion/mptctl.c: In function 'mptctl_do_fw_download':
>> drivers/message/fusion/mptctl.c:820:3: warning: this 'if' clause does not 
>> guard... [-Wmisleading-indentation]
  if ((mf = mpt_get_msg_frame(mptctl_id, iocp)) == NULL)
  ^~
   drivers/message/fusion/mptctl.c:822:4: note: ...this statement, but the 
latter is misleadingly indented as if it were guarded by the 'if'
   return -EAGAIN;
   ^~
   drivers/message/fusion/mptctl.c: In function 'mptctl_do_mpt_command':
   drivers/message/fusion/mptctl.c:1898:9: warning: this 'if' clause does not 
guard... [-Wmisleading-indentation]
if ((mf = mpt_get_msg_frame(mptctl_id, ioc)) == NULL)
^~
   drivers/message/fusion/mptctl.c:1900:17: note: ...this statement, but the 
latter is misleadingly indented as if it were guarded by the 'if'
return -EAGAIN;
^~
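
The warning above typically appears when a second statement is added under an 'if' that has no braces. The patch itself is not shown in this report, so the following is only a sketch under the assumption that an unlock call was added before the existing return; every demo_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

static int demo_lock_depth;

static void demo_lock(void)
{
	demo_lock_depth++;
}

static void demo_unlock(void)
{
	assert(demo_lock_depth > 0);
	demo_lock_depth--;
}

/* Hypothetical allocator: returns NULL when no frame is available. */
static const char *demo_get_frame(int available)
{
	return available ? "frame" : NULL;
}

/*
 * Flagged shape (shown only as a comment):
 *
 *	if ((mf = demo_get_frame(available)) == NULL)
 *		demo_unlock();
 *		return -EAGAIN;
 *
 * Only demo_unlock() is guarded there; the return always runs.
 * Braces make both statements conditional.
 */
int demo_send(int available)
{
	const char *mf;

	demo_lock();
	mf = demo_get_frame(available);
	if (mf == NULL) {
		demo_unlock();
		return -EAGAIN;
	}

	demo_unlock();
	return 0;
}

/* Lets a caller check that every lock was paired with an unlock. */
int demo_current_lock_depth(void)
{
	return demo_lock_depth;
}
```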

vim +/if +820 drivers/message/fusion/mptctl.c

^1da177e4c3f41 Linus Torvalds  2005-04-16   771  
^1da177e4c3f41 Linus Torvalds  2005-04-16   772  /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/
^1da177e4c3f41 Linus Torvalds  2005-04-16   773  /*
^1da177e4c3f41 Linus Torvalds  2005-04-16   774   * FW Download engine.
^1da177e4c3f41 Linus Torvalds  2005-04-16   775   * Outputs:None.
^1da177e4c3f41 Linus Torvalds  2005-04-16   776   * Return: 0 if successful
^1da177e4c3f41 Linus Torvalds  2005-04-16   777   * -EFAULT if data unavailable
^1da177e4c3f41 Linus Torvalds  2005-04-16   778   * -ENXIO  if no such device
^1da177e4c3f41 Linus Torvalds  2005-04-16   779   * -EAGAIN if resource problem
^1da177e4c3f41 Linus Torvalds  2005-04-16   780   * -ENOMEM if no memory for SGE
^1da177e4c3f41 Linus Torvalds  2005-04-16   781   * -EMLINK if too many chain buffers required
^1da177e4c3f41 Linus Torvalds  2005-04-16   782   * -EBADRQC if adapter does not support FW download
^1da177e4c3f41 Linus Torvalds  2005-04-16   783   * -EBUSY if adapter is busy
^1da177e4c3f41 Linus Torvalds  2005-04-16   784   * -ENOMSG if FW upload returned bad status
^1da177e4c3f41 Linus Torvalds  2005-04-16   785   */
^1da177e4c3f41 Linus Torvalds  2005-04-16   786  static int
^1da177e4c3f41 Linus Torvalds  2005-04-16   787  mptctl_do_fw_download(int ioc, char __user *ufwbuf, size_t fwlen)
^1da177e4c3f41 Linus Torvalds  2005-04-16   788  {
^1da177e4c3f41 Linus Torvalds  2005-04-16   789     FWDownload_t        *dlmsg;
^1da177e4c3f41 Linus Torvalds  2005-04-16   790     MPT_FRAME_HDR       *mf;
^1da177e4c3f41 Linus Torvalds  2005-04-16   791     MPT_ADAPTER         *iocp;
^1da177e4c3f41 Linus Torvalds  2005-04-16   792     FWDownloadTCSGE_t   *ptsge;
^1da177e4c3f41 Linus Torvalds  2005-04-16   793     MptSge_t            *sgl, *sgIn;
^1da177e4c3f41 Linus Torvalds  2005-04-16   794     char                *sgOut;
^1da177e4c3f41 Linus Torvalds  2005-04-16   795     struct buflist      *buflist;
^1da177e4c3f41 Linus Torvalds  2005-04-16   796     struct buflist      *bl;
^1da177e4c3f41 Linus Torvalds  2005-04-16   797     dma_addr_t           sgl_dma;
^1da177e4c3f41 Linus Torvalds  2005-04-16   798     int                  ret;
^1da177e4c3f41 Linus Torvalds  2005-04-16   799     int                  numfrags = 0;
^1da177e4c3f41 Linus Torvalds  2005-04-16   800     int                  maxfrags;
^1da177e4c3f41 Linus Torvalds  2005-04-16   801     int                  n = 0;
^1da177e4c3f41 Linus Torvalds  2005-04-16   802     u32                  sgdir;
^1da177e4c3f41 Linus Torvalds  2005-04-16   803     u32                  nib;
^1da177e4c3f41 Linus Torvalds  2005-04-16   804     int                  fw_bytes_copied = 0;
^1da177e4c3f41 Linus Torvalds  2005-04-16   805     int                  i;
^1da177e4c3f41 Linus Torvalds  2005-04-16   806     int                  sge_offset = 0;
^1da177e4c3f41 Linus Torvalds  2005-04-16   807     u16                  iocstat;
^1da177e4c3f41 Linus Torvalds  2005-04-16   808

[PATCH v4] bus: ti-sysc: Change return types of functions

2019-08-14 Thread Nishka Dasgupta

Change return type of functions sysc_check_one_child() and
sysc_check_children() from int to void as neither ever returns an error.
Modify call sites of both functions accordingly.

Signed-off-by: Nishka Dasgupta 
---
Changes in v4:
- Merge into a single patch the two patches for sysc_check_one_child()
  and sysc_check_children().
Changes in v3:
- Add patch for sysc_check_children().
- Remove return statement in sysc_check_one_child().
- Remove braces at call site.
Changes in v2:
- Remove error variable entirely.
- Change return type of sysc_check_one_child().

 drivers/bus/ti-sysc.c | 22 ++
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
index e6deabd8305d..a2eae8f36ef8 100644
--- a/drivers/bus/ti-sysc.c
+++ b/drivers/bus/ti-sysc.c
@@ -615,8 +615,8 @@ static void sysc_check_quirk_stdout(struct sysc *ddata,
  * node but children have "ti,hwmods". These belong to the interconnect
  * target node and are managed by this driver.
  */
-static int sysc_check_one_child(struct sysc *ddata,
-   struct device_node *np)
+static void sysc_check_one_child(struct sysc *ddata,
+struct device_node *np)
 {
const char *name;
 
@@ -626,22 +626,14 @@ static int sysc_check_one_child(struct sysc *ddata,
 
sysc_check_quirk_stdout(ddata, np);
sysc_parse_dts_quirks(ddata, np, true);
-
-   return 0;
 }
 
-static int sysc_check_children(struct sysc *ddata)
+static void sysc_check_children(struct sysc *ddata)
 {
struct device_node *child;
-   int error;
-
-   for_each_child_of_node(ddata->dev->of_node, child) {
-   error = sysc_check_one_child(ddata, child);
-   if (error)
-   return error;
-   }
 
-   return 0;
+   for_each_child_of_node(ddata->dev->of_node, child)
+   sysc_check_one_child(ddata, child);
 }
 
 /*
@@ -794,9 +786,7 @@ static int sysc_map_and_check_registers(struct sysc *ddata)
if (error)
return error;
 
-   error = sysc_check_children(ddata);
-   if (error)
-   return error;
+   sysc_check_children(ddata);
 
error = sysc_parse_registers(ddata);
if (error)
-- 
2.19.1

[PATCH v2 2/2] iommu/arm-smmu-v3: add nr_ats_masters for quickly check

2019-08-14 Thread Zhen Lei

When (smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS) is true, the
arm_smmu_atc_inv_to_cmd() call and the lock protection in
arm_smmu_atc_inv_domain() are always executed, even if the smmu domain
does not contain any ATS master. This hurts performance, especially in
multi-core and stress scenarios. In my FIO test scenario, performance
dropped by about 8%.

Instead, we can use a struct member to record how many ATS masters the
smmu domain contains, and check that member rather than traversing the
list and checking all masters one by one under the lock.

Fixes: 9ce27afc0830 ("iommu/arm-smmu-v3: Add support for PCI ATS")
Signed-off-by: Zhen Lei 
---
 drivers/iommu/arm-smmu-v3.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 29056d9bb12aa01..154334d3310c9b8 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -631,6 +631,7 @@ struct arm_smmu_domain {
 
struct io_pgtable_ops   *pgtbl_ops;
boolnon_strict;
+   int nr_ats_masters;
 
enum arm_smmu_domain_stage  stage;
union {
@@ -1531,7 +1532,16 @@ static int arm_smmu_atc_inv_domain(struct 
arm_smmu_domain *smmu_domain,
struct arm_smmu_cmdq_ent cmd;
struct arm_smmu_master *master;
 
-   if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
+   /*
+* The protection of spinlock(&smmu_domain->devices_lock) is omitted,
+* because for a given master, its map/unmap operations should only
+* happen after it has been attached and before it has been detached.
+* So if at least one master needs to be ATC invalidated, the value of
+* smmu_domain->nr_ats_masters cannot be zero.
+*
+* This can alleviate performance loss in multi-core scenarios.
+*/
+   if (!smmu_domain->nr_ats_masters)
return 0;
 
arm_smmu_atc_inv_to_cmd(ssid, iova, size, &cmd);
@@ -1913,6 +1923,7 @@ static void arm_smmu_detach_dev(struct arm_smmu_master 
*master)
 
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_del(&master->domain_head);
+   smmu_domain->nr_ats_masters--;
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
master->domain = NULL;
@@ -1968,6 +1979,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
 
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_add(&master->domain_head, &smmu_domain->devices);
+   smmu_domain->nr_ats_masters++;
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 out_unlock:
mutex_unlock(&smmu_domain->init_mutex);
-- 
1.8.3
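
The fast path this patch describes can be sketched in user space as a counter that is updated on attach and detach and tested before any lock is taken. The sketch counts only masters flagged as ATS-enabled, which is an assumption; the diff as posted increments and decrements on every attach and detach. All demo_* names are hypothetical, and a plain counter stands in for the spinlock.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_domain {
	int nr_ats_masters;    /* updated only on attach/detach */
	int lock_acquisitions; /* counts how often the "lock" was taken */
};

/* Attach a master; the increment stands in for spin_lock_irqsave(). */
void demo_attach(struct demo_domain *d, bool ats_enabled)
{
	d->lock_acquisitions++;
	if (ats_enabled)
		d->nr_ats_masters++;
}

/* Detach a master, undoing what demo_attach() recorded. */
void demo_detach(struct demo_domain *d, bool ats_enabled)
{
	d->lock_acquisitions++;
	if (ats_enabled) {
		assert(d->nr_ats_masters > 0);
		d->nr_ats_masters--;
	}
}

/*
 * Invalidate for every ATS master in the domain and return how many
 * were handled. When the counter is zero, return before building a
 * command or taking the lock.
 */
int demo_atc_inv_domain(struct demo_domain *d)
{
	if (!d->nr_ats_masters)
		return 0; /* fast path: no lock, no list walk */

	d->lock_acquisitions++; /* slow path: lock and walk the list */
	return d->nr_ats_masters;
}
```

The design trade-off is that the unlocked read is only safe if map and unmap cannot race with attach and detach for the same master, which is the argument made in the comment added by the patch.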

[PATCH v4 0/2] mm,thp: Add filemap_huge_fault() for THP

2019-08-14 Thread William Kucharski

This set of patches is the first step towards a mechanism for automatically
mapping read-only text areas of appropriate size and alignment to THPs
whenever possible.

For now, the central routine, filemap_huge_fault(), and various support
routines are only included if the experimental kernel configuration option

RO_EXEC_FILEMAP_HUGE_FAULT_THP

is enabled.

This is because filemap_huge_fault() is dependent upon the
address_space_operations vector readpage() pointing to a routine that will
read and fill an entire large page at a time without polluting the page
cache with PAGESIZE entries for the large page being mapped or performing
readahead that would pollute the page cache entries for succeeding large
pages. Unfortunately, there is no good way to determine how many bytes
were read by readpage(). At present, if filemap_huge_fault() were to call
a conventional readpage() routine, it would only fill the first PAGESIZE
bytes of the large page, which is definitely NOT the desired behavior.

However, by making the code available now it is hoped that filesystem
maintainers who have pledged to provide such a mechanism will do so more
rapidly.

The first part of the patch adds an order field to __page_cache_alloc(),
allowing callers to directly request page cache pages of various sizes.
This code was provided by Matthew Wilcox.

The second part of the patch implements the filemap_huge_fault() mechanism
as described above.

As this code is only run when the experimental config option is set,
there are some issues that need to be resolved, but this is a good
step that will enable further development.

Changes since v3:
1. Multiple code review comments addressed
2. filemap_huge_fault() now does rcu locking when possible
3. filemap_huge_fault() now properly adds the THP to the page cache before
   calling readpage()

Changes since v2:
1. FGP changes were pulled out to enable submission as an independent
   patch
2. Inadvertent tab spacing and comment changes were reverted

Changes since v1:
1. Fix improperly generated patch for v1 PATCH 1/2

Matthew Wilcox (1):
  Add an 'order' argument to __page_cache_alloc() and
do_read_cache_page(). Ensure the allocated pages are compound pages.

William Kucharski (1):
  Add filemap_huge_fault() to attempt to satisfy page faults on
memory-mapped read-only text pages using THP when possible.

 fs/afs/dir.c|   2 +-
 fs/btrfs/compression.c  |   2 +-
 fs/cachefiles/rdwr.c|   4 +-
 fs/ceph/addr.c  |   2 +-
 fs/ceph/file.c  |   2 +-
 include/linux/mm.h  |   2 +
 include/linux/pagemap.h |  10 +-
 mm/Kconfig  |  15 ++
 mm/filemap.c| 357 ++--
 mm/huge_memory.c|   3 +
 mm/mmap.c   |  38 -
 mm/readahead.c  |   2 +-
 mm/rmap.c   |   4 +-
 net/ceph/pagelist.c |   4 +-
 net/ceph/pagevec.c  |   2 +-
 15 files changed, 413 insertions(+), 36 deletions(-)

-- 
2.21.0
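
The 'order' argument mentioned in the cover letter above follows the usual page allocator convention: an order-n request covers 2^n contiguous base pages, and requests with order > 0 are marked as compound. A minimal user-space sketch of that bookkeeping, with hypothetical demo_* names and an arbitrary flag value rather than the kernel's __GFP_COMP:

```c
#include <assert.h>

#define DEMO_GFP_COMP 0x1u /* hypothetical flag bit for the demo */

struct demo_alloc_request {
	unsigned int gfp;       /* flags after adjustment */
	unsigned int order;     /* log2 of the number of base pages */
	unsigned long nr_pages; /* 2^order */
};

/*
 * Describe what an order-aware page cache allocation would request:
 * order 0 is a single page with flags untouched, anything larger is
 * flagged as a compound page.
 */
struct demo_alloc_request demo_page_cache_alloc(unsigned int gfp,
						unsigned int order)
{
	struct demo_alloc_request req;

	assert(order < 8 * sizeof(unsigned long));
	if (order > 0)
		gfp |= DEMO_GFP_COMP;

	req.gfp = gfp;
	req.order = order;
	req.nr_pages = 1UL << order;
	return req;
}
```

For example, order 9 describes 512 base pages, which is 2 MiB when the base page size is 4 KiB.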

[PATCH v4 1/2] mm: Allow the page cache to allocate large pages

2019-08-14 Thread William Kucharski

Add an 'order' argument to __page_cache_alloc() and
do_read_cache_page(). Ensure the allocated pages are compound pages.

Signed-off-by: Matthew Wilcox (Oracle) 
Signed-off-by: William Kucharski 
Reported-by: kbuild test robot 
---
 fs/afs/dir.c|  2 +-
 fs/btrfs/compression.c  |  2 +-
 fs/cachefiles/rdwr.c|  4 ++--
 fs/ceph/addr.c  |  2 +-
 fs/ceph/file.c  |  2 +-
 include/linux/pagemap.h | 10 ++
 mm/filemap.c| 20 +++-
 mm/readahead.c  |  2 +-
 net/ceph/pagelist.c |  4 ++--
 net/ceph/pagevec.c  |  2 +-
 10 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index e640d67274be..0a392214f71e 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -274,7 +274,7 @@ static struct afs_read *afs_read_dir(struct afs_vnode 
*dvnode, struct key *key)
afs_stat_v(dvnode, n_inval);
 
ret = -ENOMEM;
-   req->pages[i] = __page_cache_alloc(gfp);
+   req->pages[i] = __page_cache_alloc(gfp, 0);
if (!req->pages[i])
goto error;
ret = add_to_page_cache_lru(req->pages[i],
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 60c47b417a4b..5280e7477b7e 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -466,7 +466,7 @@ static noinline int add_ra_bio_pages(struct inode *inode,
}
 
page = __page_cache_alloc(mapping_gfp_constraint(mapping,
-~__GFP_FS));
+~__GFP_FS), 0);
if (!page)
break;
 
diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 44a3ce1e4ce4..11d30212745f 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -259,7 +259,7 @@ static int cachefiles_read_backing_file_one(struct 
cachefiles_object *object,
goto backing_page_already_present;
 
if (!newpage) {
-   newpage = __page_cache_alloc(cachefiles_gfp);
+   newpage = __page_cache_alloc(cachefiles_gfp, 0);
if (!newpage)
goto nomem_monitor;
}
@@ -495,7 +495,7 @@ static int cachefiles_read_backing_file(struct 
cachefiles_object *object,
goto backing_page_already_present;
 
if (!newpage) {
-   newpage = __page_cache_alloc(cachefiles_gfp);
+   newpage = __page_cache_alloc(cachefiles_gfp, 0);
if (!newpage)
goto nomem;
}
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index e078cc55b989..bcb41fbee533 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1707,7 +1707,7 @@ int ceph_uninline_data(struct file *filp, struct page 
*locked_page)
if (len > PAGE_SIZE)
len = PAGE_SIZE;
} else {
-   page = __page_cache_alloc(GFP_NOFS);
+   page = __page_cache_alloc(GFP_NOFS, 0);
if (!page) {
err = -ENOMEM;
goto out;
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 685a03cc4b77..ae58d7c31aa4 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1305,7 +1305,7 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct 
iov_iter *to)
struct page *page = NULL;
loff_t i_size;
if (retry_op == READ_INLINE) {
-   page = __page_cache_alloc(GFP_KERNEL);
+   page = __page_cache_alloc(GFP_KERNEL, 0);
if (!page)
return -ENOMEM;
}
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c7552459a15f..92e026d9a6b7 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -208,17 +208,19 @@ static inline int page_cache_add_speculative(struct page 
*page, int count)
 }
 
 #ifdef CONFIG_NUMA
-extern struct page *__page_cache_alloc(gfp_t gfp);
+extern struct page *__page_cache_alloc(gfp_t gfp, unsigned int order);
 #else
-static inline struct page *__page_cache_alloc(gfp_t gfp)
+static inline struct page *__page_cache_alloc(gfp_t gfp, unsigned int order)
 {
-   return alloc_pages(gfp, 0);
+   if (order > 0)
+   gfp |= __GFP_COMP;
+   return alloc_pages(gfp, order);
 }
 #endif
 
 static inline struct page *page_cache_alloc(struct address_space *x)
 {
-   return __page_cache_alloc(mapping_gfp_mask(x));
+   return __page_cache_alloc(mapping_gfp_mask(x), 0);
 }
 
 static inline gfp_t readahead_gfp_mask(struct address_space *x)
diff --git a/mm/filemap.c b/mm/filemap.c
index

[PATCH v4 2/2] mm,thp: Add experimental config option RO_EXEC_FILEMAP_HUGE_FAULT_THP

2019-08-14 Thread William Kucharski

Add filemap_huge_fault() to attempt to satisfy page
faults on memory-mapped read-only text pages using THP when possible.

Signed-off-by: William Kucharski 
---
 include/linux/mm.h |   2 +
 mm/Kconfig |  15 ++
 mm/filemap.c   | 337 +++--
 mm/huge_memory.c   |   3 +
 mm/mmap.c  |  38 -
 mm/rmap.c  |   4 +-
 6 files changed, 386 insertions(+), 13 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0334ca97c584..2a5311721739 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2433,6 +2433,8 @@ extern void truncate_inode_pages_final(struct 
address_space *);
 
 /* generic vm_area_ops exported for stackable file systems */
 extern vm_fault_t filemap_fault(struct vm_fault *vmf);
+extern vm_fault_t filemap_huge_fault(struct vm_fault *vmf,
+   enum page_entry_size pe_size);
 extern void filemap_map_pages(struct vm_fault *vmf,
pgoff_t start_pgoff, pgoff_t end_pgoff);
 extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf);
diff --git a/mm/Kconfig b/mm/Kconfig
index 56cec636a1fc..2debaded0e4d 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -736,4 +736,19 @@ config ARCH_HAS_PTE_SPECIAL
 config ARCH_HAS_HUGEPD
bool
 
+config RO_EXEC_FILEMAP_HUGE_FAULT_THP
+   bool "read-only exec filemap_huge_fault THP support (EXPERIMENTAL)"
+   depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
+
+   help
+   Introduce filemap_huge_fault() to automatically map executable
+   read-only pages of mapped files of suitable size and alignment
+   using THP if possible.
+
+   This is marked experimental because it is a new feature and is
+   dependent upon filesystems implementing readpages() in a way
+   that will recognize large THP pages and read file content to
+   them without polluting the pagecache with PAGESIZE pages due
+   to readahead.
+
 endmenu
diff --git a/mm/filemap.c b/mm/filemap.c
index 38b46fc00855..aebf2f54f52e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -199,13 +199,12 @@ static void unaccount_page_cache_page(struct 
address_space *mapping,
nr = hpage_nr_pages(page);
 
__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, -nr);
-   if (PageSwapBacked(page)) {
+
+   if (PageSwapBacked(page))
__mod_node_page_state(page_pgdat(page), NR_SHMEM, -nr);
-   if (PageTransHuge(page))
-   __dec_node_page_state(page, NR_SHMEM_THPS);
-   } else {
-   VM_BUG_ON_PAGE(PageTransHuge(page), page);
-   }
+
+   if (PageTransHuge(page))
+   __dec_node_page_state(page, NR_SHMEM_THPS);
 
/*
 * At this point page must be either written or cleaned by
@@ -1663,7 +1662,8 @@ struct page *pagecache_get_page(struct address_space 
*mapping, pgoff_t offset,
 no_page:
if (!page && (fgp_flags & FGP_CREAT)) {
int err;
-   if ((fgp_flags & FGP_WRITE) && 
mapping_cap_account_dirty(mapping))
+   if ((fgp_flags & FGP_WRITE) &&
+   mapping_cap_account_dirty(mapping))
gfp_mask |= __GFP_WRITE;
if (fgp_flags & FGP_NOFS)
gfp_mask &= ~__GFP_FS;
@@ -2643,6 +2643,326 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
 }
 EXPORT_SYMBOL(filemap_fault);
 
+#ifdef CONFIG_RO_EXEC_FILEMAP_HUGE_FAULT_THP
+/*
+ * Check for an entry in the page cache which would conflict with the address
+ * range we wish to map using a THP or is otherwise unusable to map a large
+ * cached page.
+ *
+ * The routine will return true if a usable page is found in the page cache
+ * (and *pagep will be set to the address of the cached page), or if no
+ * cached page is found (and *pagep will be set to NULL).
+ */
+static bool
+filemap_huge_check_pagecache_usable(struct xa_state *xas,
+   struct page **pagep, pgoff_t hindex, pgoff_t hindex_max)
+{
+   struct page *page;
+
+   while (1) {
+   page = xas_find(xas, hindex_max);
+
+   if (xas_retry(xas, page)) {
+   xas_set(xas, hindex);
+   continue;
+   }
+
+   /*
+* A found entry is unusable if:
+*  + the entry is an Xarray value, not a pointer
+*  + the entry is an internal Xarray node
+*  + the entry is not a Transparent Huge Page
+*  + the entry is not a compound page
+*  + the entry is not the head of a compound page
+*  + the entry is a page with an order other than
+*HPAGE_PMD_ORDER
+*  + the page's index is not what we expect it to be
+*  + the page is not up-to-date
+*/
+   if (!page)
+   break;
+
+   if

Re: [PATCH] PCI: dwc: Add map irq callback

2019-08-14 Thread Dilip Kota




On 8/14/2019 6:59 PM, Christoph Hellwig wrote:
> On Wed, Aug 14, 2019 at 04:31:14PM +0800, Dilip Kota wrote:
> > callback.
> >
> > pp->map_irq() must assign the callback along with the platform specific
> > configuration.
> > In Intel PCIe driver pp->map_irq() does the same. (Driver is not yet present
> > in mainline, I will submit for review once this change is approved).
>
> And that's what I meant.  The standard procedure is to submit your
> core changes together with the user, not separately.

Sure, will submit the driver change along with this change. Sorry for
missing it.


Thanks,
Dilip

Re: [PATCH v5 1/2] dt-bindings: mmc: Document Aspeed SD controller

2019-08-14 Thread Andrew Jeffery




On Thu, 15 Aug 2019, at 15:06, Joel Stanley wrote:
> On Wed, 7 Aug 2019 at 00:38, Andrew Jeffery  wrote:
> >
> > The ASPEED SD/SDIO/MMC controller exposes two slots implementing the
> > SDIO Host Specification v2.00, with 1 or 4 bit data buses, or an 8 bit
> > data bus if only a single slot is enabled.
> >
> > Signed-off-by: Andrew Jeffery 
> 
> Reviewed-by: Joel Stanley 
> 
> Two minor comments below.
> 
> > +++ b/Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml
> > @@ -0,0 +1,105 @@
> > +# SPDX-License-Identifier: GPL-2.0-or-later
> 
> No "Copyright IBM" ?

I'm going rogue.

That reminds me I should chase up where we got to with the binding
licensing.

> 
> > +%YAML 1.2
> > +---
> 
> > +
> > +examples:
> > +  - |
> > +#include 
> > +sdc@1e74 {
> > +compatible = "aspeed,ast2500-sd-controller";
> > +reg = <0x1e74 0x100>;
> > +#address-cells = <1>;
> > +#size-cells = <1>;
> > +ranges = <0 0x1e74 0x1>;
> 
> According to the datasheet this could be 0x2. It does not matter
> though, as there's nothing in it past 0x300.

Good catch.

Andrew

[PATCH v2] regulator: core: Add label to collate of_node_put() statements

2019-08-14 Thread Nishka Dasgupta

In function of_get_child_regulator(), the loop for_each_child_of_node()
contains two mid-loop return statements, each preceded by a statement
putting child. In order to reduce this repetition, create a new label,
err_node_put, that puts child and then returns the required value;
edit the mid-loop return blocks to instead go to this new label.

Signed-off-by: Nishka Dasgupta 
---
Changes in v2:
- Submit this as a separate patch instead of updating a previous patch.

 drivers/regulator/core.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 7a5d52948703..4a27a46ec6e7 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -380,16 +380,17 @@ static struct device_node *of_get_child_regulator(struct 
device_node *parent,
 
if (!regnode) {
regnode = of_get_child_regulator(child, prop_name);
-   if (regnode) {
-   of_node_put(child);
-   return regnode;
-   }
+   if (regnode)
+   goto err_node_put;
} else {
-   of_node_put(child);
-   return regnode;
+   goto err_node_put;
}
}
return NULL;
+
+err_node_put:
+   of_node_put(child);
+   return regnode;
 }
 
 /**
-- 
2.19.1
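
The single-exit-label pattern used in the patch above can be sketched in user space with a simple reference counter. The demo_* names are hypothetical, the label is called out_node_put here purely for illustration, and the get/put helpers only mimic the reference that for_each_child_of_node() holds on the current child.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
	int refcount;
	struct demo_node *supply; /* node this child points at, or NULL */
};

static void demo_node_get(struct demo_node *n)
{
	n->refcount++;
}

static void demo_node_put(struct demo_node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

/*
 * Walk the children and return the first supply found. Leaving the
 * loop early must drop the reference held on the current child, and
 * every early exit funnels through one label to do so.
 */
struct demo_node *demo_find_supply(struct demo_node *children, size_t n)
{
	struct demo_node *child = NULL;
	struct demo_node *found = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		child = &children[i];
		demo_node_get(child); /* iterator holds a reference */

		found = child->supply;
		if (found) {
			demo_node_get(found); /* caller owns this one */
			goto out_node_put;
		}

		demo_node_put(child); /* dropped when the loop advances */
	}
	return NULL;

out_node_put:
	demo_node_put(child);
	return found;
}
```

Keeping the put in one place means a later early exit added to the loop only needs a goto, not another copy of the cleanup.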

Re: [PATCH v5 1/2] dt-bindings: mmc: Document Aspeed SD controller

2019-08-14 Thread Joel Stanley

On Wed, 7 Aug 2019 at 00:38, Andrew Jeffery  wrote:
>
> The ASPEED SD/SDIO/MMC controller exposes two slots implementing the
> SDIO Host Specification v2.00, with 1 or 4 bit data buses, or an 8 bit
> data bus if only a single slot is enabled.
>
> Signed-off-by: Andrew Jeffery 

Reviewed-by: Joel Stanley 

Two minor comments below.

> +++ b/Documentation/devicetree/bindings/mmc/aspeed,sdhci.yaml
> @@ -0,0 +1,105 @@
> +# SPDX-License-Identifier: GPL-2.0-or-later

No "Copyright IBM" ?

> +%YAML 1.2
> +---

> +
> +examples:
> +  - |
> +#include 
> +sdc@1e74 {
> +compatible = "aspeed,ast2500-sd-controller";
> +reg = <0x1e74 0x100>;
> +#address-cells = <1>;
> +#size-cells = <1>;
> +ranges = <0 0x1e74 0x1>;

According to the datasheet this could be 0x2. It does not matter
though, as there's nothing in it past 0x300.

Cheers,

Joel

Re: [PATCH v5 2/2] mmc: Add support for the ASPEED SD controller

2019-08-14 Thread Joel Stanley

On Wed, 7 Aug 2019 at 00:38, Andrew Jeffery  wrote:
>
> Add a minimal driver for ASPEED's SD controller, which exposes two
> SDHCIs.
>
> The ASPEED design implements a common register set for the SDHCIs, and
> moves some of the standard configuration elements out to this common
> area (e.g. 8-bit mode, and card detect configuration which is not
> currently supported).
>
> The SD controller has a dedicated hardware interrupt that is shared
> between the slots. The common register set exposes information on which
> slot triggered the interrupt; early revisions of the patch introduced an
> irqchip for the register, but reality is it doesn't behave as an
> irqchip, and the result fits awkwardly into the irqchip APIs. Instead
> I've taken the simple approach of using the IRQ as a shared IRQ with
> some minor performance impact for the second slot.
>
> Ryan was the original author of the patch - I've taken his work and
> massaged it to drop the irqchip support and rework the devicetree
> integration. The driver has been smoke tested under qemu against a
> minimal SD controller model and lightly tested on an ast2500-evb.
>
> Signed-off-by: Ryan Chen 
> Signed-off-by: Andrew Jeffery 
> Acked-by: Adrian Hunter 

Reviewed-by: Joel Stanley 


>
> ---
>
> v5:
> * Cleanup sdhci driver on registration failure
>
> v4: No change
>
> v2:
> * Add AST2600 compatible
> * Drop SDHCI_QUIRK2_CLOCK_DIV_ZERO_BROKEN
> * Ensure slot number is valid
> * Fix build with CONFIG_MODULES
> * Fix module license string
> * Non-PCI devices won't die
> * Rename aspeed_sdc_configure_8bit_mode()
> * Rename aspeed_sdhci_pdata
> * Switch to sdhci_enable_clk()
> * Use PTR_ERR() on the right `struct platform_device *`
> ---
>  drivers/mmc/host/Kconfig   |  12 ++
>  drivers/mmc/host/Makefile  |   1 +
>  drivers/mmc/host/sdhci-of-aspeed.c | 332 +
>  3 files changed, 345 insertions(+)
>  create mode 100644 drivers/mmc/host/sdhci-of-aspeed.c
>
> diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> index 14d89a108edd..0f8a230de2f3 100644
> --- a/drivers/mmc/host/Kconfig
> +++ b/drivers/mmc/host/Kconfig
> @@ -154,6 +154,18 @@ config MMC_SDHCI_OF_ARASAN
>
>   If unsure, say N.
>
> +config MMC_SDHCI_OF_ASPEED
> +   tristate "SDHCI OF support for the ASPEED SDHCI controller"
> +   depends on MMC_SDHCI_PLTFM
> +   depends on OF
> +   help
> + This selects the ASPEED Secure Digital Host Controller Interface.
> +
> + If you have a controller with this interface, say Y or M here. You
> + also need to enable an appropriate bus interface.
> +
> + If unsure, say N.
> +
>  config MMC_SDHCI_OF_AT91
> tristate "SDHCI OF support for the Atmel SDMMC controller"
> depends on MMC_SDHCI_PLTFM
> diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
> index 73578718f119..390ee162fe71 100644
> --- a/drivers/mmc/host/Makefile
> +++ b/drivers/mmc/host/Makefile
> @@ -84,6 +84,7 @@ obj-$(CONFIG_MMC_SDHCI_ESDHC_IMX) += sdhci-esdhc-imx.o
>  obj-$(CONFIG_MMC_SDHCI_DOVE)   += sdhci-dove.o
>  obj-$(CONFIG_MMC_SDHCI_TEGRA)  += sdhci-tegra.o
>  obj-$(CONFIG_MMC_SDHCI_OF_ARASAN)  += sdhci-of-arasan.o
> +obj-$(CONFIG_MMC_SDHCI_OF_ASPEED)  += sdhci-of-aspeed.o
>  obj-$(CONFIG_MMC_SDHCI_OF_AT91)+= sdhci-of-at91.o
>  obj-$(CONFIG_MMC_SDHCI_OF_ESDHC)   += sdhci-of-esdhc.o
>  obj-$(CONFIG_MMC_SDHCI_OF_HLWD)+= sdhci-of-hlwd.o
> diff --git a/drivers/mmc/host/sdhci-of-aspeed.c 
> b/drivers/mmc/host/sdhci-of-aspeed.c
> new file mode 100644
> index ..8bb095ca2fa9
> --- /dev/null
> +++ b/drivers/mmc/host/sdhci-of-aspeed.c
> @@ -0,0 +1,332 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/* Copyright (C) 2019 ASPEED Technology Inc. */
> +/* Copyright (C) 2019 IBM Corp. */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "sdhci-pltfm.h"
> +
> +#define ASPEED_SDC_INFO         0x00
> +#define   ASPEED_SDC_S1MMC8     BIT(25)
> +#define   ASPEED_SDC_S0MMC8     BIT(24)
> +
> +struct aspeed_sdc {
> +   struct clk *clk;
> +   struct resource *res;
> +
> +   spinlock_t lock;
> +   void __iomem *regs;
> +};
> +
> +struct aspeed_sdhci {
> +   struct aspeed_sdc *parent;
> +   u32 width_mask;
> +};
> +
> +static void aspeed_sdc_configure_8bit_mode(struct aspeed_sdc *sdc,
> +  struct aspeed_sdhci *sdhci,
> +  bool bus8)
> +{
> +   u32 info;
> +
> +   /* Set/clear 8 bit mode */
> +   spin_lock(&sdc->lock);
> +   info = readl(sdc->regs + ASPEED_SDC_INFO);
> +   if (bus8)
> +   info |= sdhci->width_mask;
> +   else
> +   info &= ~sdhci->width_mask;
> +   writel(info, sdc->regs + ASPEED_SDC_INFO);
> +   spin_unlock(&sdc->lock);
> +}
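
The read-modify-write sequence in aspeed_sdc_configure_8bit_mode() can be looked at in isolation. What follows is only a minimal userspace sketch, not driver code: the shared register is modelled by a plain variable, the lock is left out because the model is single-threaded, and `fake_info_reg` and `update_width_bit` are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the patch: slot 1 uses bit 25, slot 0 bit 24. */
#define SDC_S1MMC8	(UINT32_C(1) << 25)
#define SDC_S0MMC8	(UINT32_C(1) << 24)

/* Hypothetical stand-in for the shared ASPEED_SDC_INFO register. */
static uint32_t fake_info_reg;

/*
 * Set or clear one slot's 8-bit-mode flag without disturbing the other
 * slot's flag. In the driver this sequence runs under sdc->lock because
 * both slots share the register; the lock is omitted in this model.
 */
static uint32_t update_width_bit(uint32_t width_mask, bool bus8)
{
	uint32_t info = fake_info_reg;	/* models readl() */

	assert(width_mask == SDC_S0MMC8 || width_mask == SDC_S1MMC8);
	if (bus8)
		info |= width_mask;
	else
		info &= ~width_mask;
	fake_info_reg = info;		/* models writel() */
	return info;
}
```

Enabling 8-bit mode on slot 0 and then on slot 1 leaves both bits set, and clearing slot 0 afterwards leaves only bit 25. Without a lock around the read and the write, two slots updating concurrently could overwrite each other's bit.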

Re: [PATCH v5 5/7] PCI/ATS: Add PASID support for PCIe VF devices

2019-08-14 Thread Bjorn Helgaas

On Tue, Aug 13, 2019 at 03:19:58PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On Mon, Aug 12, 2019 at 03:05:08PM -0500, Bjorn Helgaas wrote:
> > On Thu, Aug 01, 2019 at 05:06:02PM -0700, 
> > sathyanarayanan.kuppusw...@linux.intel.com wrote:
> > > From: Kuppuswamy Sathyanarayanan 
> > > 
> > > 
> > > When IOMMU tries to enable PASID for VF device in
> > > iommu_enable_dev_iotlb(), it always fails because PASID support for PCIe
> > > VF device is currently broken in PCIE driver. Current implementation
> > > expects the given PCIe device (PF & VF) to implement PASID capability
> > > before enabling the PASID support. But this assumption is incorrect. As
> > > per PCIe spec r4.0, sec 9.3.7.14, all VFs associated with PF can only
> > > use the PASID of the PF and not implement it.
> > > 
> > > Also, since PASID is a shared resource between PF/VF, following rules
> > > should apply.
> > > 
> > > 1. Use proper locking before accessing/modifying PF resources in VF
> > >PASID enable/disable call.
> > > 2. Use reference count logic to track the usage of PASID resource.
> > > 3. Disable PASID only if the PASID reference count (pasid_ref_cnt) is 
> > > zero.
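
Rules 2 and 3 above describe ordinary reference counting on a resource owned by the PF. The following is a hedged userspace sketch of that idea only; it is not the patch's code. `struct shared_cap`, `shared_cap_get()` and `shared_cap_put()` are hypothetical names, and the locking from rule 1 is left out because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a capability that lives in the PF only. */
struct shared_cap {
	int ref_cnt;		/* number of functions (PF or VF) using it */
	bool hw_enabled;	/* models the enable bit in the PF's config space */
};

/* Take one reference; switch the hardware on for the first user. */
static void shared_cap_get(struct shared_cap *cap)
{
	if (cap->ref_cnt++ == 0)
		cap->hw_enabled = true;
}

/* Drop one reference; switch the hardware off only when nobody is left. */
static void shared_cap_put(struct shared_cap *cap)
{
	assert(cap->ref_cnt > 0);
	if (--cap->ref_cnt == 0)
		cap->hw_enabled = false;
}
```

With two users, the first put leaves the capability enabled and only the second put disables it. In kernel code both helpers would additionally need to run under the PF's lock so that the counter and the config-space write stay consistent.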
> > > 
> > > Cc: Ashok Raj 
> > > Cc: Keith Busch 
> > > Suggested-by: Ashok Raj 
> > > Signed-off-by: Kuppuswamy Sathyanarayanan 
> > > 
> > > ---
> > >  drivers/pci/ats.c   | 113 ++--
> > >  include/linux/pci.h |   2 +
> > >  2 files changed, 90 insertions(+), 25 deletions(-)
> > > 
> > > diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> > > index 079dc544..9384afd7d00e 100644
> > > --- a/drivers/pci/ats.c
> > > +++ b/drivers/pci/ats.c
> > > @@ -402,6 +402,8 @@ void pci_pasid_init(struct pci_dev *pdev)
> > >   if (pdev->is_virtfn)
> > >   return;
> > >  
> > > > + mutex_init(&pdev->pasid_lock);
> > > +
> > >   pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PASID);
> > >   if (!pos)
> > >   return;
> > > @@ -436,32 +438,57 @@ void pci_pasid_init(struct pci_dev *pdev)
> > >  int pci_enable_pasid(struct pci_dev *pdev, int features)
> > >  {
> > >   u16 control, supported;
> > > + int ret = 0;
> > > + struct pci_dev *pf = pci_physfn(pdev);
> > >  
> > > - if (WARN_ON(pdev->pasid_enabled))
> > > - return -EBUSY;
> > > + mutex_lock(>pasid_lock);
> > >  
> > > - if (!pdev->eetlp_prefix_path)
> > > - return -EINVAL;
> > > + if (WARN_ON(pdev->pasid_enabled)) {
> > > + ret = -EBUSY;
> > > + goto pasid_unlock;
> > > + }
> > >  
> > > - if (!pdev->pasid_cap)
> > > - return -EINVAL;
> > > + if (!pdev->eetlp_prefix_path) {
> > > + ret = -EINVAL;
> > > + goto pasid_unlock;
> > > + }
> > >  
> > > - pci_read_config_word(pdev, pdev->pasid_cap + PCI_PASID_CAP,
> > > -  );
> > > + if (!pf->pasid_cap) {
> > > + ret = -EINVAL;
> > > + goto pasid_unlock;
> > > + }
> > > +
> > > + if (pdev->is_virtfn && pf->pasid_enabled)
> > > + goto update_status;
> > > +
> > > > + pci_read_config_word(pf, pf->pasid_cap + PCI_PASID_CAP, &supported);
> > >   supported &= PCI_PASID_CAP_EXEC | PCI_PASID_CAP_PRIV;
> > >  
> > >   /* User wants to enable anything unsupported? */
> > > - if ((supported & features) != features)
> > > - return -EINVAL;
> > > + if ((supported & features) != features) {
> > > + ret = -EINVAL;
> > > + goto pasid_unlock;
> > > + }
> > >  
> > >   control = PCI_PASID_CTRL_ENABLE | features;
> > > - pdev->pasid_features = features;
> > > -
> > > + pf->pasid_features = features;
> > >   pci_write_config_word(pdev, pdev->pasid_cap + PCI_PASID_CTRL, control);
> > >  
> > > - pdev->pasid_enabled = 1;
> > > + /*
> > > +  * If PASID is not already enabled in PF, increment pasid_ref_cnt
> > > +  * to count PF PASID usage.
> > > +  */
> > > + if (pdev->is_virtfn && !pf->pasid_enabled) {
> > > + atomic_inc(>pasid_ref_cnt);
> > > + pf->pasid_enabled = 1;
> > > + }
> > >  
> > > - return 0;
> > > +update_status:
> > > + atomic_inc(>pasid_ref_cnt);
> > > + pdev->pasid_enabled = 1;
> > > +pasid_unlock:
> > > + mutex_unlock(>pasid_lock);
> > > + return ret;
> > >  }
> > >  EXPORT_SYMBOL_GPL(pci_enable_pasid);
> > >  
> > > @@ -472,16 +499,29 @@ EXPORT_SYMBOL_GPL(pci_enable_pasid);
> > >  void pci_disable_pasid(struct pci_dev *pdev)
> > >  {
> > >   u16 control = 0;
> > > + struct pci_dev *pf = pci_physfn(pdev);
> > > +
> > > + mutex_lock(>pasid_lock);
> > >  
> > >   if (WARN_ON(!pdev->pasid_enabled))
> > > - return;
> > > + goto pasid_unlock;
> > >  
> > > - if (!pdev->pasid_cap)
> > > - return;
> > > + if (!pf->pasid_cap)
> > > + goto pasid_unlock;
> > >  
> > > - pci_write_config_word(pdev, pdev->pasid_cap + PCI_PASID_CTRL, control);
> > > + atomic_dec(>pasid_ref_cnt);
> > >  
> > > + if (atomic_read(>pasid_ref_cnt))
> > > + goto done;
> > > +
> > > + /* Disable PASID only if pasid_ref_cnt is zero */
> > > + pci_write_config_word(pf, pf->pasid_cap +

Re: [PATCH] x86/apic: Handle missing global clockevent gracefully

2019-08-14 Thread Daniel Drake

On Mon, Aug 12, 2019 at 2:16 PM Daniel Drake  wrote:
> I can do a bit of testing on other platforms too. Are there any
> specific tests I should run, other than checking that the system boots
> and doesn't have any timer watchdog complaints in the log?

Tested this on 2 AMD platforms that were not affected by the crash here.
In addition to confirming that they boot fine without timer complaints
in the logs, I checked the calibrate_APIC_clock() result before and
after this patch. I repeated each test twice.

Asus E402YA (AMD E2-7015)
Before: 99811, 99811
After: 99812, 99812

Acer Aspire A315-21G (AMD A9-9420e)
Before: 99811, 99811
After: 99807, 99820

Those new numbers seem very close to the previous ones and I didn't
observe any problems.

Thanks
Daniel

Re: [PATCH v5 3/7] PCI/ATS: Initialize PASID in pci_ats_init()

2019-08-14 Thread Bjorn Helgaas

On Thu, Aug 01, 2019 at 05:06:00PM -0700, 
sathyanarayanan.kuppusw...@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan 
> 
> Currently, PASID Capability checks are repeated across all PASID API's.
> Instead, cache the capability check result in pci_pasid_init() and use
> it in other PASID API's. Also, since PASID is a shared resource between
> PF/VF, initialize PASID features with default values in pci_pasid_init().
> 
> Signed-off-by: Kuppuswamy Sathyanarayanan 
> 

> + * TODO: Since PASID is a shared resource between PF/VF, don't update
> + * PASID features in the same API as a per device feature.

This comment is slightly misleading (at least, it misled *me* :))
because it hints that PASID might be specific to SR-IOV.  But I don't
think that's true, so if you keep a comment like this, please reword
it along the lines of "for SR-IOV devices, the PF's PASID is shared
between the PF and all VFs" so it leaves open the possibility of
non-SR-IOV devices using PASID as well.

Bjorn

[PATCH] powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GB

2019-08-14 Thread Alastair D'Silva

From: Alastair D'Silva 

Heads Up: This patch cannot be submitted to Linus's tree, as the affected
assembler functions have already been converted to C.

When calling flush_(inval_)dcache_range with a size >4GB, we were masking
off the upper 32 bits, so we would incorrectly flush a range smaller
than intended.

This patch replaces the 32 bit shifts with 64 bit ones, so that
the full size is accounted for.
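
The effect of the narrower shift can be reproduced in plain C. This is only an illustrative model of the line-count computation, assuming a shift amount below 32; `line_count_32()` and `line_count_64()` are hypothetical names, and the rounding the assembler performs before the shift is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the old code: a 32-bit shift (srw) only looks at the low
 * 32 bits of the length, so anything at or above 4GB is lost.
 */
static uint64_t line_count_32(uint64_t len, unsigned int log_block_size)
{
	assert(log_block_size < 32);
	return (uint32_t)len >> log_block_size;
}

/* Model of the fixed code: a 64-bit shift (srd) keeps the full length. */
static uint64_t line_count_64(uint64_t len, unsigned int log_block_size)
{
	assert(log_block_size < 64);
	return len >> log_block_size;
}
```

For a length of 4GB + 128 bytes and 128-byte cache blocks (log-2 of 7), the 32-bit model yields 1 line while the 64-bit model yields 33554433 lines. For lengths below 4GB both models agree.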

Signed-off-by: Alastair D'Silva 
---
 arch/powerpc/kernel/misc_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 1ad4089dd110..d4d096f80f4b 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -130,7 +130,7 @@ _GLOBAL_TOC(flush_dcache_range)
        subf    r8,r6,r4                /* compute length */
        add     r8,r8,r5                /* ensure we get enough */
        lwz     r9,DCACHEL1LOGBLOCKSIZE(r10)    /* Get log-2 of dcache block size */
-       srw.    r8,r8,r9                /* compute line count */
+       srd.    r8,r8,r9                /* compute line count */
        beqlr                           /* nothing to do? */
        mtctr   r8
 0:     dcbst   0,r6
@@ -148,7 +148,7 @@ _GLOBAL(flush_inval_dcache_range)
        subf    r8,r6,r4                /* compute length */
        add     r8,r8,r5                /* ensure we get enough */
        lwz     r9,DCACHEL1LOGBLOCKSIZE(r10)    /* Get log-2 of dcache block size */
-       srw.    r8,r8,r9                /* compute line count */
+       srd.    r8,r8,r9                /* compute line count */
        beqlr                           /* nothing to do? */
        sync
        isync
-- 
2.21.0

Re: [RESEND PATCH 1/2 -mm] mm: account lazy free pages separately

2019-08-14 Thread Yang Shi





On 8/14/19 5:55 AM, Vlastimil Babka wrote:

On 8/12/19 7:00 PM, Yang Shi wrote:

I can see that memcg rss size was the primary problem David was looking
at. But MemAvailable will not help with that, right? Moreover is

Yes, but David actually would like to have memcg MemAvailable (the
accounter like the global one), which should be counted like the global
one and should account per memcg deferred split THP properly.


accounting the full THP correct? What if subpages are still mapped?

"Deferred split" definitely doesn't mean they are free. When memory
pressure is hit, they would be split, then the unmapped normal pages
would be freed. So, when calculating MemAvailable, they are not
accounted 100%, but like "available += lazyfree - min(lazyfree / 2,
wmark_low)", just like how page cache is accounted.
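
The quoted formula can be written out as a small helper. This is a sketch of the arithmetic only, under the assumption that both quantities are page counts; `lazyfree_contribution()` is a hypothetical name and is not taken from the patch.

```c
#include <assert.h>

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Pages that lazy-free memory would add to the estimate:
 *   lazyfree - min(lazyfree / 2, wmark_low)
 * At most half of the lazy-free pages are held back, and never more
 * than the low watermark.
 */
static unsigned long lazyfree_contribution(unsigned long lazyfree,
					   unsigned long wmark_low)
{
	unsigned long held_back = min_ul(lazyfree / 2, wmark_low);

	assert(held_back <= lazyfree);
	return lazyfree - held_back;
}
```

With 1000 lazy-free pages and a low watermark of 100 pages, 900 pages are counted as available. With 100 lazy-free pages and a low watermark of 1000, only half (50 pages) is counted.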

We could get more accurate account, i.e. checking each sub page's
mapcount when accounting, but it may change before shrinker start
scanning. So, just use the ballpark estimation to trade off the
complexity for accurate accounting.

If we know the mapcounts in the moment the deferred split is initiated (I
suppose there has to be an iteration over all subpages already?), we could get
the exact number to adjust the counter with, and also store the number somewhere
(e.g. a unused field in first/second tail page, I think we already do that for
something). Then in the shrinker we just read that number to adjust the counter
back. Then we can ignore the subpage mapping changes before shrinking happens,
they shouldn't change the situation significantly, and importantly we will be
safe from counter imbalance thanks to the stored number.


Thanks, I'm going to look into this approach. Thanks for the suggestion 
again.

Re: [RESEND PATCH 1/2 -mm] mm: account lazy free pages separately

2019-08-14 Thread Yang Shi





On 8/14/19 5:49 AM, Vlastimil Babka wrote:

On 8/9/19 8:26 PM, Yang Shi wrote:

Here the new counter is introduced for patch 2/2 to account deferred
split THPs into available memory since NR_ANON_THPS may contain
non-deferred split THPs.

I could use an internal counter for deferred split THPs, but if it is
accounted by mod_node_page_state, why not just show it in /proc/meminfo?

The answer to "Why not" is that it becomes part of userspace API (btw this
patchset should have CC'd linux-api@ - please do for further iterations) and
even if the implementation detail of deferred splitting might change in the
future, we'll basically have to keep the counter (even with 0 value) in
/proc/meminfo forever.

Also, quite recently we have added the following counter:

KReclaimable: Kernel allocations that the kernel will attempt to reclaim
   under memory pressure. Includes SReclaimable (below), and other
   direct allocations with a shrinker.

Although THP allocations are not exactly "kernel allocations", once they are
unmapped, they are in fact kernel-only, so IMHO it wouldn't be a big stretch to
add the lazy THP pages there?


Thanks a lot for the suggestion. I agree it may be a good fit. I hope the
term "kernel allocations" does not cause confusion, but we can explain it
in the documentation.





Or we fix NR_ANON_THPS and show deferred split THPs in /proc/meminfo?

Re: [RESEND PATCH 1/2 -mm] mm: account lazy free pages separately

2019-08-14 Thread Yang Shi





On 8/14/19 4:08 AM, Michal Hocko wrote:

On Mon 12-08-19 10:00:17, Yang Shi wrote:


On 8/12/19 2:34 AM, Michal Hocko wrote:

On Fri 09-08-19 16:54:43, Yang Shi wrote:

On 8/9/19 11:26 AM, Yang Shi wrote:

On 8/9/19 11:02 AM, Michal Hocko wrote:

[...]

I have to study the code some more but is there any reason why those
pages are not accounted as proper THPs anymore? Sure they are partially
unmapped but they are still THPs so why cannot we keep them accounted
like that. Having a new counter to reflect that sounds like papering
over the problem to me. But as I've said I might be missing something
important here.

I think we could keep those pages accounted for NR_ANON_THPS since they
are still THP although they are unmapped as you mentioned if we just
want to fix the improper accounting.

By double checking what NR_ANON_THPS really means,
Documentation/filesystems/proc.txt says "Non-file backed huge pages mapped
into userspace page tables". Then it makes some sense to dec NR_ANON_THPS
when removing rmap even though they are still THPs.

I don't think we would like to change the definition, if so a new counter
may make more sense.

Yes, changing NR_ANON_THPS semantic sounds like a bad idea. Let
me try whether I understand the problem. So we have some THP in
limbo waiting for them to be split and unmapped parts to be freed,
right? I can see that page_remove_anon_compound_rmap does correctly
decrement NR_ANON_MAPPED for sub pages that are no longer mapped by
anybody. LRU pages seem to be accounted properly as well.  As you've
said NR_ANON_THPS reflects the number of THPs mapped and that should be
reflecting the reality already IIUC.

So the only problem seems to be that deferred THP might aggregate a lot
of immediately freeable memory (if none of the subpages are mapped) and
that can confuse MemAvailable because it doesn't know about the fact.
Has a skewed counter resulted in user-observable behavior/failures?

No. But the skewed counter may make a big difference for a large-scale cluster.
MemAvailable is an important factor for a cluster scheduler when determining
capacity.

But MemAvailable is a very rough estimation. Is relying on it really a
good measure? I mean there is a lot of reclaimable memory that is not
reflected there (some fs. internal data structures, networking buffers
etc.)


Yes, I agree there are other freeable objects not accounted for in
MemAvailable. Their size depends on the workload. But deferred split
THPs seem more common with common workloads. A simple run of the
MariaDB test of mmtest shows it could generate over fifteen thousand
deferred split THPs (accumulating around 30G in a one-hour run, 75% of the
40G of memory in my VM). So it may be worth accounting deferred split THPs in
MemAvailable.




[...]


accounting the full THP correct? What if subpages are still mapped?

"Deferred split" definitely doesn't mean they are free. When memory pressure
is hit, they would be split, then the unmapped normal pages would be freed.
So, when calculating MemAvailable, they are not accounted 100%, but like
"available += lazyfree - min(lazyfree / 2, wmark_low)", just like how page
cache is accounted.

Then this is even more dubious IMHO.


We could get more accurate account, i.e. checking each sub page's mapcount
when accounting, but it may change before shrinker start scanning. So, just
use the ballpark estimation to trade off the complexity for accurate
accounting.

I do not see much point in fixing up one particular counter when there
is a whole lot that is not even considered. I would rather live with the
fact that MemAvailable is only a very rough estimate than play whack-a-mole
with any memory consumer that is freeable directly or indirectly via memory
reclaim. This is likely to be always subtly broken and only
visible under very specific workloads, so there is no way to test for it.


I saw that Vlastimil suggested KReclaimable; it seems a good fit. If so, we
don't need to create a new counter anymore.

Re: [PATCH v5 3/7] PCI/ATS: Initialize PASID in pci_ats_init()

2019-08-14 Thread Bjorn Helgaas

On Thu, Aug 01, 2019 at 05:06:00PM -0700, 
sathyanarayanan.kuppusw...@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan 
> 
> Currently, PASID Capability checks are repeated across all PASID API's.
> Instead, cache the capability check result in pci_pasid_init() and use
> it in other PASID API's. Also, since PASID is a shared resource between
> PF/VF, initialize PASID features with default values in pci_pasid_init().
> 
> Signed-off-by: Kuppuswamy Sathyanarayanan 
> 
> ---
>  drivers/pci/ats.c   | 74 +
>  include/linux/pci-ats.h |  5 +++
>  include/linux/pci.h |  1 +
>  3 files changed, 59 insertions(+), 21 deletions(-)
> 

> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index 280be911f190..1f4be27a071d 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -30,6 +30,8 @@ void pci_ats_init(struct pci_dev *dev)
>   dev->ats_cap = pos;
>  
>   pci_pri_init(dev);
> +
> + pci_pasid_init(dev);
>  }
>  
>  /**
> @@ -315,6 +317,40 @@ EXPORT_SYMBOL_GPL(pci_reset_pri);
>  #endif /* CONFIG_PCI_PRI */
>  
>  #ifdef CONFIG_PCI_PASID
> +
> +void pci_pasid_init(struct pci_dev *pdev)
> +{
> ...
> +}

> diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
> index 33653d4ca94f..bc7f815d38ff 100644
> --- a/include/linux/pci-ats.h
> +++ b/include/linux/pci-ats.h
> @@ -40,6 +40,7 @@ static inline int pci_reset_pri(struct pci_dev *pdev)
>  
>  #ifdef CONFIG_PCI_PASID
>  
> +void pci_pasid_init(struct pci_dev *pdev);

This also looks like it should be static in ats.c.

>  int pci_enable_pasid(struct pci_dev *pdev, int features);
>  void pci_disable_pasid(struct pci_dev *pdev);
>  void pci_restore_pasid_state(struct pci_dev *pdev);
> @@ -48,6 +49,10 @@ int pci_max_pasids(struct pci_dev *pdev);
>  
>  #else  /* CONFIG_PCI_PASID */
>  
> +static inline void pci_pasid_init(struct pci_dev *pdev)
> +{
> +}
> +
>  static inline int pci_enable_pasid(struct pci_dev *pdev, int features)
>  {
>   return -EINVAL;

Re: [PATCH] Fix a stack buffer overflow bug check_input_term

2019-08-14 Thread Hui Peng

The stack trace differs from test to test; the attached trace1 file is
taken from one of the tests.

The bug is confirmed by adding some printk statements to
`check_input_term`; the trace with the printk output is attached as the
trace2 file.

This patch is a tentative fix for the bug; please give me feedback.

On 8/15/19 12:35 AM, Hui Peng wrote:
> `check_input_term` recursively calls itself with input
> from the device side (e.g., uac_input_terminal_descriptor.bCSourceID)
> as its argument (id). If `check_input_term` is called with the same
> `id` argument as its caller, it triggers endless recursion,
> resulting in a kernel-space stack overflow.
>
> This patch fixes the bug by adding a bitmap to `struct mixer_build`
> to keep track of the checked ids by `check_input_term` and stop
> the execution if some id has been checked (similar to how
> parse_audio_unit handles unitid argument).
>
> Reported-by: Hui Peng 
> Reported-by: Mathias Payer 
> Signed-off-by: Hui Peng 
> ---
>  sound/usb/mixer.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c
> index ea487378be17..1f6c8213df82 100644
> --- a/sound/usb/mixer.c
> +++ b/sound/usb/mixer.c
> @@ -68,6 +68,7 @@ struct mixer_build {
>   unsigned char *buffer;
>   unsigned int buflen;
>   DECLARE_BITMAP(unitbitmap, MAX_ID_ELEMS);
> + DECLARE_BITMAP(termbitmap, MAX_ID_ELEMS);
>   struct usb_audio_term oterm;
>   const struct usbmix_name_map *map;
>   const struct usbmix_selector_map *selector_map;
> @@ -782,6 +783,8 @@ static int check_input_term(struct mixer_build *state, 
> int id,
>   int err;
>   void *p1;
>  
> + if (test_and_set_bit(id, state->termbitmap))
> + return 0;
>   memset(term, 0, sizeof(*term));
>   while ((p1 = find_audio_control_unit(state, id)) != NULL) {
>   unsigned char *hdr = p1;
[7.839002] usb 1-1: new high-speed USB device number 2 using xhci_hcd
[7.966787] usb 1-1: Using ep0 maxpacket: 16
[7.969898] usb 1-1: string descriptor 0 read error: -22
[7.971507] usb 1-1: New USB device found, idVendor=046d, idProduct=0a44, 
bcdDevice= 1.27
[7.973874] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[7.979864] usb 1-1: current rate 16732531 is different from the runtime 
rate 48000
[7.982029] usb 1-1: current rate 11254477 is different from the runtime 
rate 48000
[7.987190] [parse_audio_unit:2797] reached
[7.987741] [parse_audio_unit:2814] reached
[7.988310] [parse_audio_feature_unit:1855] reached
[7.988982] [parse_audio_feature_unit:1915] reached
[7.989605] [parse_audio_unit:2797] reached
[7.990173] [parse_audio_unit:2804] reached
[7.999615] [parse_audio_unit:2797] reached
[8.000166] [parse_audio_unit:2814] reached
[8.000734] [parse_audio_feature_unit:1855] reached
[8.001361] [parse_audio_feature_unit:1915] reached
[8.002011] [parse_audio_unit:2797] reached
[8.002590] [parse_audio_unit:2811] reached
[8.003180] [parse_audio_unit:2797] reached
[8.003754] [parse_audio_unit:2814] reached
[8.004342] [parse_audio_feature_unit:1855] reached
[8.004992] [parse_audio_feature_unit:1915] reached
[8.005656] [parse_audio_feature_unit:1921] reached
[8.006324] [check_input_term:804] reached
[8.006881] [check_input_term:860] reached
[8.007451] [check_input_term:804] reached
[8.008038] [check_input_term:860] reached
[8.008596] [check_input_term:804] reached
[8.009129] [check_input_term:860] reached
[8.009686] [check_input_term:804] reached
[8.010217] [check_input_term:860] reached
[8.010834] [check_input_term:804] reached
[8.011430] [check_input_term:860] reached
[8.011945] [check_input_term:804] reached
[8.012577] [check_input_term:860] reached
[8.013109] [check_input_term:804] reached
[8.013704] [check_input_term:860] reached
[8.014319] [check_input_term:804] reached
[8.014848] [check_input_term:860] reached
[8.015415] [check_input_term:804] reached
[8.015964] [check_input_term:860] reached
[8.016525] [check_input_term:804] reached
[8.017055] [check_input_term:860] reached
[8.017612] [check_input_term:804] reached
[8.018168] [check_input_term:860] reached
[8.018701] [check_input_term:804] reached
[8.019273] [check_input_term:860] reached
[8.019805] [check_input_term:804] reached
[8.020362] [check_input_term:860] reached
[8.020893] [check_input_term:804] reached
[8.021440] [check_input_term:860] reached
[8.021972] [check_input_term:804] reached
[8.022536] [check_input_term:860] reached
[8.023127] [check_input_term:804] reached
[8.023664] [check_input_term:860] reached
[8.024213] [check_input_term:804] reached
[8.024791] [check_input_term:860] reached
[8.025351] [check_input_term:804] reached
[8.025886] [check_input_term:860] reached
[8.026435] [check_input_term:804] reached
[

RE: [EXT] Re: [v1 1/3] clk: ls1028a: Add clock driver for Display output interface

2019-08-14 Thread Wen He



> -Original Message-
> From: Stephen Boyd 
> Sent: 2019年8月15日 1:27
> To: linux-...@vger.kernel.org; linux-de...@linux.nxdi.nxp.com;
> linux-kernel@vger.kernel.org; liviu.du...@arm.com; Leo Li
> ; Michael Turquette ; Wen
> He 
> Subject: RE: [EXT] Re: [v1 1/3] clk: ls1028a: Add clock driver for Display 
> output
> interface
> 
> 
> Quoting Wen He (2019-08-14 02:38:21)
> >
> >
> > > -Original Message-
> > > From: Stephen Boyd 
> > > Sent: 2019年8月14日 2:25
> > > To: Michael Turquette ; Wen He
> > > ; Leo Li ;
> > > linux-...@vger.kernel.org; linux-de...@linux.nxdi.nxp.com;
> > > linux-kernel@vger.kernel.org; liviu.du...@arm.com
> > > Cc: Wen He 
> > > Subject: [EXT] Re: [v1 1/3] clk: ls1028a: Add clock driver for
> > > Display output interface
> > >
> > >
> > > Quoting Wen He (2019-08-12 03:01:03)
> > > > diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig index
> > > > 801fa1cd0321..0e6c7027d637 100644
> > > > --- a/drivers/clk/Kconfig
> > > > +++ b/drivers/clk/Kconfig
> > > > @@ -223,6 +223,15 @@ config CLK_QORIQ
> > > >   This adds the clock driver support for Freescale QorIQ
> platforms
> > > >   using common clock framework.
> > > >
> > > > +config CLK_PLLDIG
> > > > +bool "Clock driver for LS1028A Display output"
> > > > +depends on ARCH_LAYERSCAPE && OF
> > >
> > > > Does it actually depend on either of these to build? Probably not, so
> > > maybe just default ARCH_LAYERSCAPE && OF? Also, can your Kconfig
> > > variable be named something more specific like CLK_LS1028A_PLLDIG?
> >
> > Actually it also depends Display modules, but we allow building
> > display drivers as modules, so is here whether need add Display
> > modules depend and also allow clock driver building to a module?
> > Would it be better to reduce the number of the modules insert, I think
> > the clock driver should be long available for the system.
> 
> I'm asking if it actually requires ARCH_LAYERSCAPE or OF to successfully
> compile the file. Is that true? I don't see any asm/ includes or anything 
> that's
> going to fail if either of these configs aren't enabled.
> So it seems safe to change this to
> 
> depends on ARCH_LAYERSCAPE || COMPILE_TEST
> default ARCH_LAYERSCAPE
> 
> so that it's compiled by default on this architecture and is available to be
> compile tested by various test builders.

Understood, I will send the next patch version.

Best Regards,
Wen

> 
> >
> > looks like great if named Kconfig variable to 'CLK_LS1028A_PLLDIG'.
> >
> > >

Re: [PATCH v5 2/7] PCI/ATS: Initialize PRI in pci_ats_init()

2019-08-14 Thread Bjorn Helgaas

On Thu, Aug 01, 2019 at 05:05:59PM -0700, 
sathyanarayanan.kuppusw...@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan 
> 
> Currently, PRI Capability checks are repeated across all PRI API's.
> Instead, cache the capability check result in pci_pri_init() and use it
> in other PRI API's. Also, since PRI is a shared resource between PF/VF,
> initialize default values for common PRI features in pci_pri_init().
> 
> Signed-off-by: Kuppuswamy Sathyanarayanan 
> 
> ---
>  drivers/pci/ats.c   | 80 -
>  include/linux/pci-ats.h |  5 +++
>  include/linux/pci.h |  1 +
>  3 files changed, 61 insertions(+), 25 deletions(-)
> 

> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index cdd936d10f68..280be911f190 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c

> @@ -28,6 +28,8 @@ void pci_ats_init(struct pci_dev *dev)
>   return;
>  
>   dev->ats_cap = pos;
> +
> + pci_pri_init(dev);
>  }
>  
>  /**
> @@ -170,36 +172,72 @@ int pci_ats_page_aligned(struct pci_dev *pdev)
>  EXPORT_SYMBOL_GPL(pci_ats_page_aligned);
>  
>  #ifdef CONFIG_PCI_PRI
> +
> +void pci_pri_init(struct pci_dev *pdev)
> +{
> ...
> +}

> diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
> index 1a0bdaee2f32..33653d4ca94f 100644
> --- a/include/linux/pci-ats.h
> +++ b/include/linux/pci-ats.h
> @@ -6,6 +6,7 @@
>  
>  #ifdef CONFIG_PCI_PRI
>  
> +void pci_pri_init(struct pci_dev *pdev);

pci_pri_init() is implemented and called in drivers/pci/ats.c.  Unless
there's a need to call this from outside ats.c, it should be static
and should not be declared here.

If you can make it static, please also reorder the code so you don't
need a forward declaration in ats.c.

>  int pci_enable_pri(struct pci_dev *pdev, u32 reqs);
>  void pci_disable_pri(struct pci_dev *pdev);
>  void pci_restore_pri_state(struct pci_dev *pdev);
> @@ -13,6 +14,10 @@ int pci_reset_pri(struct pci_dev *pdev);
>  
>  #else /* CONFIG_PCI_PRI */
>  
> +static inline void pci_pri_init(struct pci_dev *pdev)
> +{
> +}
> +
>  static inline int pci_enable_pri(struct pci_dev *pdev, u32 reqs)
>  {
>   return -ENODEV;

[PATCH] Fix a stack buffer overflow bug check_input_term

2019-08-14 Thread Hui Peng

`check_input_term` recursively calls itself with input
from the device side (e.g., uac_input_terminal_descriptor.bCSourceID)
as its argument (id). If `check_input_term` is called with the same
`id` argument as its caller, it triggers endless recursion,
resulting in a kernel-space stack overflow.

This patch fixes the bug by adding a bitmap to `struct mixer_build`
to keep track of the checked ids by `check_input_term` and stop
the execution if some id has been checked (similar to how
parse_audio_unit handles unitid argument).
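
The idea of the fix, a visited bitmap that cuts off cyclic recursion, can be sketched outside the kernel. The following is a simplified userspace model and not the mixer code: `struct walk_state`, `test_and_set()` and `walk()` are hypothetical names, the bit operation is not atomic, and each id is assumed to have at most one source.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define MAX_ID		256
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical model: source_of[id] names the unit feeding terminal id. */
struct walk_state {
	unsigned long visited[MAX_ID / BITS_PER_WORD];
	unsigned char source_of[MAX_ID];	/* 0 means "no further source" */
	int calls;				/* invocation counter, for illustration */
};

static void walk_state_init(struct walk_state *st)
{
	memset(st, 0, sizeof(*st));
}

/* Non-atomic stand-in for the kernel's test_and_set_bit(). */
static int test_and_set(unsigned long *map, unsigned int id)
{
	unsigned long bit = 1UL << (id % BITS_PER_WORD);
	unsigned long *word = &map[id / BITS_PER_WORD];
	int was_set = (*word & bit) != 0;

	*word |= bit;
	return was_set;
}

/* Follow the chain of sources, stopping as soon as an id repeats. */
static int walk(struct walk_state *st, unsigned int id)
{
	assert(id < MAX_ID);
	st->calls++;
	if (test_and_set(st->visited, id))
		return 0;	/* already checked: stop instead of recursing forever */
	if (st->source_of[id] == 0)
		return 0;	/* end of the chain */
	return walk(st, st->source_of[id]);
}
```

If id 5 names itself as its source, `walk()` is entered twice and then returns. A three-element cycle 1 -> 2 -> 3 -> 1 terminates after four calls. Without the bitmap check either input would recurse until the stack is exhausted.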

Reported-by: Hui Peng 
Reported-by: Mathias Payer 
Signed-off-by: Hui Peng 
---
 sound/usb/mixer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c
index ea487378be17..1f6c8213df82 100644
--- a/sound/usb/mixer.c
+++ b/sound/usb/mixer.c
@@ -68,6 +68,7 @@ struct mixer_build {
unsigned char *buffer;
unsigned int buflen;
DECLARE_BITMAP(unitbitmap, MAX_ID_ELEMS);
+   DECLARE_BITMAP(termbitmap, MAX_ID_ELEMS);
struct usb_audio_term oterm;
const struct usbmix_name_map *map;
const struct usbmix_selector_map *selector_map;
@@ -782,6 +783,8 @@ static int check_input_term(struct mixer_build *state, int id,
int err;
void *p1;
 
+   if (test_and_set_bit(id, state->termbitmap))
+   return 0;
memset(term, 0, sizeof(*term));
while ((p1 = find_audio_control_unit(state, id)) != NULL) {
unsigned char *hdr = p1;
-- 
2.22.1

[PATCH] clk: composite: Drop unused clk.h include

2019-08-14 Thread Stephen Boyd

This include isn't used. Drop it.

Signed-off-by: Stephen Boyd 
---
 drivers/clk/clk-composite.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c
index b06038b8f658..4f13a681ddfc 100644
--- a/drivers/clk/clk-composite.c
+++ b/drivers/clk/clk-composite.c
@@ -3,7 +3,6 @@
  * Copyright (c) 2013 NVIDIA CORPORATION.  All rights reserved.
  */
 
-#include 
 #include 
 #include 
 #include 
-- 
Sent by a computer through tubes

RE: [EXT] Re: rtc: pcf85363/pcf85263: fix error that failed to run hwclock -w

2019-08-14 Thread Biwen Li

> Hi,
> 
> On 14/08/2019 17:32:49+0800, Biwen Li wrote:
> > Issue:
> > # hwclock -w
> > hwclock: RTC_SET_TIME: Invalid argument
> >
> > The patch fixes error when run command hwclock -w with rtc
> > pcf85363/pcf85263
> >
> 
> Could you describe a bit more the issue and what causes it?
1. Related patch: https://lkml.org/lkml/2019/4/3/55. With this patch, regmap
always checks for unwritable registers by comparing reg with max_register in
regmap_writeable.

2. In drivers/rtc/rtc-pcf85363.c, CTRL_STOP_EN is 0x2e, DT_100THS is 0, and
max_register is 0x2f. The bulk write that starts at CTRL_STOP_EN therefore
reaches reg 0x30; because '0x30 < 0x2f' is false, regmap_writeable returns
false.
> 
> IIRC I wrote that code and it works on my pcf85363.
> 
> > Signed-off-by: Biwen Li 
> > ---
> >  drivers/rtc/rtc-pcf85363.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/rtc/rtc-pcf85363.c b/drivers/rtc/rtc-pcf85363.c
> > index a075e77617dc..3450d615974d 100644
> > --- a/drivers/rtc/rtc-pcf85363.c
> > +++ b/drivers/rtc/rtc-pcf85363.c
> > @@ -166,7 +166,12 @@ static int pcf85363_rtc_set_time(struct device *dev,
> struct rtc_time *tm)
> >   buf[DT_YEARS] = bin2bcd(tm->tm_year % 100);
> >
> >   ret = regmap_bulk_write(pcf85363->regmap, CTRL_STOP_EN,
> > - tmp, sizeof(tmp));
> > + tmp, 2);
> > + if (ret)
> > + return ret;
> > +
> > + ret = regmap_bulk_write(pcf85363->regmap, DT_100THS,
> > + buf, sizeof(tmp) - 2);
> >   if (ret)
> >   return ret;
> >
> > --
> > 2.17.1
> >
> 
> --
> Alexandre Belloni, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com

RE: rtc: pcf85363/pcf85263: fix error that failed to run hwclock -w

2019-08-14 Thread Biwen Li

> > Subject: rtc: pcf85363/pcf85263: fix error that failed to run hwclock
> > -w
> >
> > Issue:
> > # hwclock -w
> > hwclock: RTC_SET_TIME: Invalid argument
> >
> > The patch fixes error when run command hwclock -w with rtc
> > pcf85363/pcf85263
> 
> Can you explain a little bit more in the commit message on how the changes fix
> the above issue?   It is not that clear just from the code.
1. Related patch: https://lkml.org/lkml/2019/4/3/55. With this patch, regmap
always checks for unwritable registers: regmap_writeable compares reg with
max_register.

2. In drivers/rtc/rtc-pcf85363.c, CTRL_STOP_EN is 0x2e and DT_100THS is 0,
while max_register is 0x2f. The bulk write that starts at CTRL_STOP_EN
therefore reaches reg 0x30; '0x30 < 0x2f' is false, so regmap_writeable
returns false.
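
The bounds problem described above can be sketched with a toy model. This is only an illustration and not the regmap implementation: toy_writeable and toy_bulk_write are hypothetical names, -1 stands in for the error code, and the register counts in the usage note are assumptions chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy rule: a register is writeable only if it does not exceed max_register. */
bool toy_writeable(unsigned int max_register, unsigned int reg)
{
    return reg <= max_register;
}

/*
 * Toy bulk write: every register in [reg, reg + count) must be writeable,
 * otherwise the whole write is rejected with -1 (standing in for an error).
 */
int toy_bulk_write(unsigned int max_register, unsigned int reg, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (!toy_writeable(max_register, reg + (unsigned int)i))
            return -1;
    }
    return 0;
}
```

With max_register 0x2f, a write of two registers starting at 0x2e passes, while a third register would land on 0x30 and be rejected. Splitting the transfer into one write at 0x2e and a second one starting at 0, as the patch does, keeps both ranges inside the map.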

> 
> >
> > Signed-off-by: Biwen Li 
> > ---
> >  drivers/rtc/rtc-pcf85363.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/rtc/rtc-pcf85363.c b/drivers/rtc/rtc-pcf85363.c
> > index a075e77617dc..3450d615974d 100644
> > --- a/drivers/rtc/rtc-pcf85363.c
> > +++ b/drivers/rtc/rtc-pcf85363.c
> > @@ -166,7 +166,12 @@ static int pcf85363_rtc_set_time(struct device
> > *dev, struct rtc_time *tm)
> > buf[DT_YEARS] = bin2bcd(tm->tm_year % 100);
> >
> > ret = regmap_bulk_write(pcf85363->regmap, CTRL_STOP_EN,
> > -   tmp, sizeof(tmp));
> > +   tmp, 2);
> > +   if (ret)
> > +   return ret;
> > +
> > +   ret = regmap_bulk_write(pcf85363->regmap, DT_100THS,
> > +   buf, sizeof(tmp) - 2);
> > if (ret)
> > return ret;
> >
> > --
> > 2.17.1

[PATCH 0/6] powerpc: convert cache asm to C

2019-08-14 Thread Alastair D'Silva

From: Alastair D'Silva 

This series addresses a few issues discovered in how we flush caches:
1. Flushes were truncated at 4GB, so larger flushes were incorrect.
2. Flushing the dcache in arch_add_memory was unnecessary

This series also converts much of the cache assembler to C, with the
aim of making it easier to maintain.

Alastair D'Silva (6):
  powerpc: Allow flush_icache_range to work across ranges >4GB
  powerpc: define helpers to get L1 icache sizes
  powerpc: Convert flush_icache_range & friends to C
  powerpc: Chunk calls to flush_dcache_range in arch_*_memory
  powerpc: Remove 'extern' from func prototypes in cache headers
  powerpc: Don't flush caches when adding memory

 arch/powerpc/include/asm/cache.h  |  63 +-
 arch/powerpc/include/asm/cacheflush.h |  49 ++-
 arch/powerpc/kernel/misc_32.S | 117 --
 arch/powerpc/kernel/misc_64.S |  97 -
 arch/powerpc/mm/mem.c |  80 +-
 5 files changed, 146 insertions(+), 260 deletions(-)

-- 
2.21.0

[PATCH 6/6] powerpc: Don't flush caches when adding memory

2019-08-14 Thread Alastair D'Silva

From: Alastair D'Silva 

This operation takes a significant amount of time when hotplugging
large amounts of memory (~50 seconds with 890GB of persistent memory).

This was originally added in commit fb5924fddf9e
("powerpc/mm: Flush cache on memory hot(un)plug") to support memtrace,
but the flush on add is not needed as it is flushed on remove.

Signed-off-by: Alastair D'Silva 
---
 arch/powerpc/mm/mem.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index fb0d5e9aa11b..43be99de7c9a 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -111,7 +111,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 {
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
-   unsigned long i;
int rc;
 
resize_hpt_for_hotplug(memblock_phys_mem_size());
@@ -124,11 +123,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
return -EFAULT;
}
 
-   for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
-   flush_dcache_range(start + i, min(start + size, start + i + 
FLUSH_CHUNK_SIZE));
-   cond_resched();
-   }
-
return __add_pages(nid, start_pfn, nr_pages, restrictions);
 }
 
-- 
2.21.0

[PATCH 5/6] powerpc: Remove 'extern' from func prototypes in cache headers

2019-08-14 Thread Alastair D'Silva

From: Alastair D'Silva 

The 'extern' keyword adds no value to function prototypes.

Signed-off-by: Alastair D'Silva 
---
 arch/powerpc/include/asm/cache.h  | 8 
 arch/powerpc/include/asm/cacheflush.h | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 728f154204db..c5c096e968e0 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -102,10 +102,10 @@ static inline u32 l1_icache_bytes(void)
 #define __read_mostly __attribute__((__section__(".data..read_mostly")))
 
 #ifdef CONFIG_PPC_BOOK3S_32
-extern long _get_L2CR(void);
-extern long _get_L3CR(void);
-extern void _set_L2CR(unsigned long);
-extern void _set_L3CR(unsigned long);
+long _get_L2CR(void);
+long _get_L3CR(void);
+void _set_L2CR(unsigned long val);
+void _set_L3CR(unsigned long val);
 #else
 #define _get_L2CR()0L
 #define _get_L3CR()0L
diff --git a/arch/powerpc/include/asm/cacheflush.h 
b/arch/powerpc/include/asm/cacheflush.h
index 4c3377aff8ed..1826bf2cc137 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -38,15 +38,15 @@ static inline void flush_cache_vmap(unsigned long start, 
unsigned long end) { }
 #endif
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
-extern void flush_dcache_page(struct page *page);
+void flush_dcache_page(struct page *page);
 #define flush_dcache_mmap_lock(mapping)do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)  do { } while (0)
 
 void flush_icache_range(unsigned long start, unsigned long stop);
-extern void flush_icache_user_range(struct vm_area_struct *vma,
+void flush_icache_user_range(struct vm_area_struct *vma,
struct page *page, unsigned long addr,
int len);
-extern void flush_dcache_icache_page(struct page *page);
+void flush_dcache_icache_page(struct page *page);
 
 /**
  * flush_dcache_range(): Write any modified data cache blocks out to memory 
and invalidate them.
-- 
2.21.0

[PATCH 3/6] powerpc: Convert flush_icache_range & friends to C

2019-08-14 Thread Alastair D'Silva

From: Alastair D'Silva 

Similar to commit 22e9c88d486a
("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
this patch converts flush_icache_range() to C, and reimplements the
following functions as wrappers around it:
__flush_dcache_icache
__flush_dcache_icache_phys

This was done because we discovered a long-standing bug where the length of
the range was truncated by a 32-bit shift where a 64-bit one was needed.

Converting these functions to C also makes them easier to maintain.
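
As a rough sketch of the range arithmetic such a C conversion relies on, the helpers below round the start down to a cache block boundary and count the blocks covered by [start, stop). The formula follows the flush_dcache_range() context lines visible in this series; the function names are hypothetical, and 64-bit types are used throughout so that lengths above 4GB are not truncated.

```c
#include <stdint.h>

/* Round an address down to the start of its cache block. */
uint64_t first_block_addr(uint64_t start, unsigned int block_shift)
{
    uint64_t bytes = (uint64_t)1 << block_shift;

    return start & ~(bytes - 1);
}

/*
 * Count the cache blocks touched by [start, stop), where block_shift is
 * log2 of the block size. The length is rounded up so that a partially
 * covered final block is still counted.
 */
uint64_t blocks_in_range(uint64_t start, uint64_t stop,
                         unsigned int block_shift)
{
    uint64_t bytes = (uint64_t)1 << block_shift;
    uint64_t addr = first_block_addr(start, block_shift);
    uint64_t size = stop - addr + (bytes - 1);

    return size >> block_shift;
}
```

A flush loop would then issue one cache instruction per block, starting at first_block_addr() and advancing by the block size blocks_in_range() times.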

Signed-off-by: Alastair D'Silva 
---
 arch/powerpc/include/asm/cache.h  |  26 +++---
 arch/powerpc/include/asm/cacheflush.h |  32 ---
 arch/powerpc/kernel/misc_32.S | 117 --
 arch/powerpc/kernel/misc_64.S |  97 -
 arch/powerpc/mm/mem.c |  71 +++-
 5 files changed, 102 insertions(+), 241 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index f852d5cd746c..728f154204db 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -98,20 +98,7 @@ static inline u32 l1_icache_bytes(void)
 #endif
 #endif /* ! __ASSEMBLY__ */
 
-#if defined(__ASSEMBLY__)
-/*
- * For a snooping icache, we still need a dummy icbi to purge all the
- * prefetched instructions from the ifetch buffers. We also need a sync
- * before the icbi to order the the actual stores to memory that might
- * have modified instructions with the icbi.
- */
-#define PURGE_PREFETCHED_INS   \
-   sync;   \
-   icbi0,r3;   \
-   sync;   \
-   isync
-
-#else
+#if !defined(__ASSEMBLY__)
 #define __read_mostly __attribute__((__section__(".data..read_mostly")))
 
 #ifdef CONFIG_PPC_BOOK3S_32
@@ -145,6 +132,17 @@ static inline void dcbst(void *addr)
 {
__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
 }
+
+static inline void icbi(void *addr)
+{
+   __asm__ __volatile__ ("icbi 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void iccci(void)
+{
+   __asm__ __volatile__ ("iccci 0, r0");
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_CACHE_H */
diff --git a/arch/powerpc/include/asm/cacheflush.h 
b/arch/powerpc/include/asm/cacheflush.h
index ed57843ef452..4c3377aff8ed 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -42,24 +42,18 @@ extern void flush_dcache_page(struct page *page);
 #define flush_dcache_mmap_lock(mapping)do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)  do { } while (0)
 
-extern void flush_icache_range(unsigned long, unsigned long);
+void flush_icache_range(unsigned long start, unsigned long stop);
 extern void flush_icache_user_range(struct vm_area_struct *vma,
struct page *page, unsigned long addr,
int len);
-extern void __flush_dcache_icache(void *page_va);
 extern void flush_dcache_icache_page(struct page *page);
-#if defined(CONFIG_PPC32) && !defined(CONFIG_BOOKE)
-extern void __flush_dcache_icache_phys(unsigned long physaddr);
-#else
-static inline void __flush_dcache_icache_phys(unsigned long physaddr)
-{
-   BUG();
-}
-#endif
 
-/*
- * Write any modified data cache blocks out to memory and invalidate them.
+/**
+ * flush_dcache_range(): Write any modified data cache blocks out to memory 
and invalidate them.
  * Does not invalidate the corresponding instruction cache blocks.
+ *
+ * @start: the start address
+ * @stop: the stop address (exclusive)
  */
 static inline void flush_dcache_range(unsigned long start, unsigned long stop)
 {
@@ -82,6 +76,20 @@ static inline void flush_dcache_range(unsigned long start, 
unsigned long stop)
isync();
 }
 
+/**
+ * __flush_dcache_icache(): Flush a particular page from the data cache to RAM.
+ * Note: this is necessary because the instruction cache does *not*
+ * snoop from the data cache.
+ *
+ * @page: the address of the page to flush
+ */
+static inline void __flush_dcache_icache(void *page)
+{
+   unsigned long page_addr = (unsigned long)page;
+
+   flush_icache_range(page_addr, page_addr + PAGE_SIZE);
+}
+
 /*
  * Write any modified data cache blocks out to memory.
  * Does not invalidate the corresponding cache lines (especially for
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index fe4bd321730e..12b95e6799d4 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -318,123 +318,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
 EXPORT_SYMBOL(flush_instruction_cache)
 #endif /* CONFIG_PPC_8xx */
 
-/*
- * Write any modified data cache blocks out to memory
- * and invalidate the corresponding instruction cache blocks.
- * This is a no-op on the 601.
- *
- * flush_icache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(flush_icache_range)
-BEGIN_FTR_SECTION
-

[PATCH 4/6] powerpc: Chunk calls to flush_dcache_range in arch_*_memory

2019-08-14 Thread Alastair D'Silva

From: Alastair D'Silva 

When presented with large amounts of memory being hotplugged
(in my test case, ~890GB), the call to flush_dcache_range takes
a while (~50 seconds), triggering RCU stalls.

This patch breaks up the call into 16GB chunks, calling
cond_resched() in between to allow the scheduler to run.
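
The chunking idea can be sketched in plain C as follows. This is a userspace illustration under assumptions, not the kernel code: for_each_chunk and range_op are hypothetical names, the callback stands in for flush_dcache_range(), and the point where the kernel would call cond_resched() is only marked by a comment.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE (16ULL * 1024 * 1024 * 1024) /* 16GB, as in the patch */

/* Hypothetical callback type standing in for the per-range flush. */
typedef void (*range_op)(uint64_t start, uint64_t stop, void *ctx);

/*
 * Apply op() to [start, start + size) in pieces of at most CHUNK_SIZE
 * bytes and return the number of pieces processed. The end of the last
 * piece is clamped so it never runs past start + size.
 */
uint64_t for_each_chunk(uint64_t start, uint64_t size, range_op op, void *ctx)
{
    uint64_t end = start + size;
    uint64_t chunks = 0;

    for (uint64_t off = 0; off < size; off += CHUNK_SIZE) {
        uint64_t chunk_start = start + off;
        uint64_t chunk_end = chunk_start + CHUNK_SIZE;

        if (chunk_end > end)
            chunk_end = end;
        if (op)
            op(chunk_start, chunk_end, ctx);
        /* The kernel version yields to the scheduler here. */
        chunks++;
    }
    return chunks;
}
```

For a range of 890 * 2^30 bytes this yields 56 pieces: 55 full chunks and one shorter final chunk.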

Signed-off-by: Alastair D'Silva 
---
 arch/powerpc/mm/mem.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 5400da87a804..fb0d5e9aa11b 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -104,11 +104,14 @@ int __weak remove_section_mapping(unsigned long start, 
unsigned long end)
return -ENODEV;
 }
 
+#define FLUSH_CHUNK_SIZE (16ull * 1024ull * 1024ull * 1024ull)
+
 int __ref arch_add_memory(int nid, u64 start, u64 size,
struct mhp_restrictions *restrictions)
 {
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
+   unsigned long i;
int rc;
 
resize_hpt_for_hotplug(memblock_phys_mem_size());
@@ -120,7 +123,11 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
start, start + size, rc);
return -EFAULT;
}
-   flush_dcache_range(start, start + size);
+
+   for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
+   flush_dcache_range(start + i, min(start + size, start + i + 
FLUSH_CHUNK_SIZE));
+   cond_resched();
+   }
 
return __add_pages(nid, start_pfn, nr_pages, restrictions);
 }
@@ -131,13 +138,18 @@ void __ref arch_remove_memory(int nid, u64 start, u64 
size,
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap);
+   unsigned long i;
int ret;
 
__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
 
/* Remove htab bolted mappings for this section of memory */
start = (unsigned long)__va(start);
-   flush_dcache_range(start, start + size);
+   for (i = 0; i < size; i += FLUSH_CHUNK_SIZE) {
+   flush_dcache_range(start + i, min(start + size, start + i + 
FLUSH_CHUNK_SIZE));
+   cond_resched();
+   }
+
ret = remove_section_mapping(start, start + size);
WARN_ON_ONCE(ret);
 
-- 
2.21.0

[PATCH 2/6] powerpc: define helpers to get L1 icache sizes

2019-08-14 Thread Alastair D'Silva

From: Alastair D'Silva 

This patch adds helpers to retrieve icache sizes, and renames the existing
helpers to make it clear that they are for dcache.

Signed-off-by: Alastair D'Silva 
---
 arch/powerpc/include/asm/cache.h  | 29 +++
 arch/powerpc/include/asm/cacheflush.h | 12 +--
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index b3388d95f451..f852d5cd746c 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -55,25 +55,46 @@ struct ppc64_caches {
 
 extern struct ppc64_caches ppc64_caches;
 
-static inline u32 l1_cache_shift(void)
+static inline u32 l1_dcache_shift(void)
 {
return ppc64_caches.l1d.log_block_size;
 }
 
-static inline u32 l1_cache_bytes(void)
+static inline u32 l1_dcache_bytes(void)
 {
return ppc64_caches.l1d.block_size;
 }
+
+static inline u32 l1_icache_shift(void)
+{
+   return ppc64_caches.l1i.log_block_size;
+}
+
+static inline u32 l1_icache_bytes(void)
+{
+   return ppc64_caches.l1i.block_size;
+}
 #else
-static inline u32 l1_cache_shift(void)
+static inline u32 l1_dcache_shift(void)
 {
return L1_CACHE_SHIFT;
 }
 
-static inline u32 l1_cache_bytes(void)
+static inline u32 l1_dcache_bytes(void)
 {
return L1_CACHE_BYTES;
 }
+
+static inline u32 l1_icache_shift(void)
+{
+   return L1_CACHE_SHIFT;
+}
+
+static inline u32 l1_icache_bytes(void)
+{
+   return L1_CACHE_BYTES;
+}
+
 #endif
 #endif /* ! __ASSEMBLY__ */
 
diff --git a/arch/powerpc/include/asm/cacheflush.h 
b/arch/powerpc/include/asm/cacheflush.h
index eef388f2659f..ed57843ef452 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -63,8 +63,8 @@ static inline void __flush_dcache_icache_phys(unsigned long 
physaddr)
  */
 static inline void flush_dcache_range(unsigned long start, unsigned long stop)
 {
-   unsigned long shift = l1_cache_shift();
-   unsigned long bytes = l1_cache_bytes();
+   unsigned long shift = l1_dcache_shift();
+   unsigned long bytes = l1_dcache_bytes();
void *addr = (void *)(start & ~(bytes - 1));
unsigned long size = stop - (unsigned long)addr + (bytes - 1);
unsigned long i;
@@ -89,8 +89,8 @@ static inline void flush_dcache_range(unsigned long start, 
unsigned long stop)
  */
 static inline void clean_dcache_range(unsigned long start, unsigned long stop)
 {
-   unsigned long shift = l1_cache_shift();
-   unsigned long bytes = l1_cache_bytes();
+   unsigned long shift = l1_dcache_shift();
+   unsigned long bytes = l1_dcache_bytes();
void *addr = (void *)(start & ~(bytes - 1));
unsigned long size = stop - (unsigned long)addr + (bytes - 1);
unsigned long i;
@@ -108,8 +108,8 @@ static inline void clean_dcache_range(unsigned long start, 
unsigned long stop)
 static inline void invalidate_dcache_range(unsigned long start,
   unsigned long stop)
 {
-   unsigned long shift = l1_cache_shift();
-   unsigned long bytes = l1_cache_bytes();
+   unsigned long shift = l1_dcache_shift();
+   unsigned long bytes = l1_dcache_bytes();
void *addr = (void *)(start & ~(bytes - 1));
unsigned long size = stop - (unsigned long)addr + (bytes - 1);
unsigned long i;
-- 
2.21.0

[PATCH 1/6] powerpc: Allow flush_icache_range to work across ranges >4GB

2019-08-14 Thread Alastair D'Silva

From: Alastair D'Silva 

When calling flush_icache_range with a size >4GB, we were masking
off the upper 32 bits, so we would incorrectly flush a range smaller
than intended.

This patch replaces the 32 bit shifts with 64 bit ones, so that
the full size is accounted for.
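
The effect of the narrower shift can be modelled in C. This is only a sketch: line_count_32 and line_count_64 are hypothetical helpers that mimic a shift operating on the low 32 bits of the length versus the full 64 bits, and the 128-byte block size used in the usage note is an assumption.

```c
#include <stdint.h>

/* Line count when only the low 32 bits of the length are shifted. */
uint64_t line_count_32(uint64_t len, unsigned int shift)
{
    return (uint32_t)len >> shift;
}

/* Line count when the full 64-bit length is shifted. */
uint64_t line_count_64(uint64_t len, unsigned int shift)
{
    return len >> shift;
}
```

With a shift of 7, a length of exactly 4GB gives a 32-bit count of zero, which the old code would treat as nothing to do, and a 5GB length is handled as if it were 1GB. Below 4GB both variants agree.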

Signed-off-by: Alastair D'Silva 
Cc: sta...@vger.kernel.org
---
 arch/powerpc/kernel/misc_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index b55a7b4cb543..9bc0aa9aeb65 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -82,7 +82,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
subfr8,r6,r4/* compute length */
add r8,r8,r5/* ensure we get enough */
lwz r9,DCACHEL1LOGBLOCKSIZE(r10)/* Get log-2 of cache block 
size */
-   srw.r8,r8,r9/* compute line count */
+   srd.r8,r8,r9/* compute line count */
beqlr   /* nothing to do? */
mtctr   r8
 1: dcbst   0,r6
@@ -98,7 +98,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
subfr8,r6,r4/* compute length */
add r8,r8,r5
lwz r9,ICACHEL1LOGBLOCKSIZE(r10)/* Get log-2 of Icache block 
size */
-   srw.r8,r8,r9/* compute line count */
+   srd.r8,r8,r9/* compute line count */
beqlr   /* nothing to do? */
mtctr   r8
 2: icbi0,r6
-- 
2.21.0

[PATCH] clk: sunxi: Don't call clk_hw_get_name() on a hw that isn't registered

2019-08-14 Thread Stephen Boyd

The implementation of clk_hw_get_name() relies on the clk_core
associated with the clk_hw pointer existing. If of_clk_hw_register()
fails, there isn't a clk_core created yet, so calling clk_hw_get_name()
here fails. Extract the name first so we can print it later.
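
The pattern can be illustrated with a small stand-alone model. Everything here is hypothetical (toy_hw, toy_init, toy_register, toy_probe) and only mimics the situation described: the name lives in init data, registration may fail before any core object exists, and the init pointer is assumed to be off limits once registration has started.

```c
#include <stddef.h>
#include <string.h>

struct toy_init {
    const char *name;
};

struct toy_hw {
    const struct toy_init *init; /* only valid before registration */
    const char *core_name;       /* set only when registration succeeds */
};

/* Toy registration: consumes hw->init and may fail before core_name is set. */
int toy_register(struct toy_hw *hw, int simulate_failure)
{
    const struct toy_init *init = hw->init;

    hw->init = NULL;
    if (simulate_failure || !init)
        return -1;
    hw->core_name = init->name;
    return 0;
}

/*
 * Returns the clock name to report when registration fails, or NULL on
 * success. The name is read before toy_register() runs, because neither
 * hw->init nor hw->core_name can be relied on after a failure.
 */
const char *toy_probe(struct toy_hw *hw, int simulate_failure)
{
    const char *name = hw->init->name;

    if (toy_register(hw, simulate_failure))
        return name;
    return NULL;
}
```

Reading the name up front keeps the error message meaningful regardless of how far registration got.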

Fixes: 1d80c14248d6 ("clk: sunxi-ng: Add common infrastructure")
Cc: Maxime Ripard 
Cc: Chen-Yu Tsai 
Signed-off-by: Stephen Boyd 
---
 drivers/clk/sunxi-ng/ccu_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu_common.c 
b/drivers/clk/sunxi-ng/ccu_common.c
index 7fe3ac980e5f..2e20e650b6c0 100644
--- a/drivers/clk/sunxi-ng/ccu_common.c
+++ b/drivers/clk/sunxi-ng/ccu_common.c
@@ -97,14 +97,15 @@ int sunxi_ccu_probe(struct device_node *node, void __iomem 
*reg,
 
for (i = 0; i < desc->hw_clks->num ; i++) {
struct clk_hw *hw = desc->hw_clks->hws[i];
+   const char *name;
 
if (!hw)
continue;
 
+   name = hw->init->name;
ret = of_clk_hw_register(node, hw);
if (ret) {
-   pr_err("Couldn't register clock %d - %s\n",
-  i, clk_hw_get_name(hw));
+   pr_err("Couldn't register clock %d - %s\n", i, name);
goto err_clk_unreg;
}
}

base-commit: 5f9e832c137075045d15cd6899ab0505cfb2ca4b
-- 
Sent by a computer through tubes

Re: [PATCH] KVM: LAPIC: Periodically revaluate appropriate lapic_timer_advance_ns

2019-08-14 Thread Wanpeng Li

On Wed, 14 Aug 2019 at 20:50, Paolo Bonzini  wrote:
>
> On 12/08/19 11:06, Wanpeng Li wrote:
> > On Fri, 9 Aug 2019 at 18:24, Paolo Bonzini  wrote:
> >>
> >> On 09/08/19 07:45, Wanpeng Li wrote:
> >>> From: Wanpeng Li 
> >>>
> >>> Even if for realtime CPUs, cache line bounces, frequency scaling, presence
> >>> of higher-priority RT tasks, etc can cause different response. These
> >>> interferences should be considered and periodically revaluate whether
> >>> or not the lapic_timer_advance_ns value is the best, do nothing if it is,
> >>> otherwise recaluate again.
> >>
> >> How much fluctuation do you observe between different runs?
> >
> > Sometimes can ~1000 cycles after converting to guest tsc freq.
>
> Hmm, I wonder if we need some kind of continuous smoothing.  Something like

Actually, during testing (running a Linux guest instead of
kvm-unit-tests), this can fluctuate drastically rather than
drift smoothly.

>
> if (abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_DONE) {
> /* no update for random fluctuations */
> return;
> }
>
> if (unlikely(timer_advance_ns > 5000))
> timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
> apic->lapic_timer.timer_advance_ns = timer_advance_ns;
>
> and removing all the timer_advance_adjust_done stuff.  What do you think?

I just sent out v2, which periodically re-evaluates and takes the minimal
conservative value from these re-evaluation points. Please have a look. :)

Regards,
Wanpeng Li

[PATCH v2] KVM: LAPIC: Periodically revaluate to get conservative lapic_timer_advance_ns

2019-08-14 Thread Wanpeng Li

From: Wanpeng Li 

Even on realtime CPUs, cache line bounces, frequency scaling, the presence
of higher-priority RT tasks, etc. can still cause different responses. Take
these interferences into account by periodically re-evaluating whether the
lapic_timer_advance_ns value is still the best: do nothing if it is,
otherwise recalculate it. Set lapic_timer_advance_ns to the minimal
conservative value of all the estimated values.
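
The "minimal conservative value" rule can be sketched as a small pure function. This is an illustration under assumptions rather than the KVM code: conservative_advance is a hypothetical name, the estimates are passed in as an array instead of arriving over time, and the sample values in the usage note are made up.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the smallest of the converged estimates, or fallback_ns when no
 * estimate has been recorded yet. The running minimum starts at the
 * largest representable value so the first estimate always replaces it.
 */
uint32_t conservative_advance(const uint32_t *estimates, size_t count,
                              uint32_t fallback_ns)
{
    uint32_t min_ns = UINT32_MAX;

    if (count == 0)
        return fallback_ns;

    for (size_t i = 0; i < count; i++) {
        if (estimates[i] < min_ns)
            min_ns = estimates[i];
    }
    return min_ns;
}
```

For example, re-evaluations that converge on 1800, 1500 and 1650 would leave the value at 1500.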

Testing on Skylake server, cat vcpu*/lapic_timer_advance_ns, before patch:
1628
4161
4321
3236
...

Testing on Skylake server, cat vcpu*/lapic_timer_advance_ns, after patch:
1553
1499
1509
1489
...

Testing on Haswell desktop, cat vcpu*/lapic_timer_advance_ns, before patch:
4617
3641
4102
4577
...
Testing on Haswell desktop, cat vcpu*/lapic_timer_advance_ns, after patch:
2775
2892
2764
2775
...

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/lapic.c | 34 --
 arch/x86/kvm/lapic.h |  2 ++
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index df5cd07..8487d9c 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -69,6 +69,7 @@
 #define LAPIC_TIMER_ADVANCE_ADJUST_INIT 1000
 /* step-by-step approximation to mitigate fluctuation */
 #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
+#define LAPIC_TIMER_ADVANCE_RECALC_PERIOD (600 * HZ)
 
 static inline int apic_test_vector(int vec, void *bitmap)
 {
@@ -1480,10 +1481,21 @@ static inline void __wait_lapic_expire(struct kvm_vcpu 
*vcpu, u64 guest_cycles)
 static inline void adjust_lapic_timer_advance(struct kvm_vcpu *vcpu,
  s64 advance_expire_delta)
 {
-   struct kvm_lapic *apic = vcpu->arch.apic;
-   u32 timer_advance_ns = apic->lapic_timer.timer_advance_ns;
+   struct kvm_timer *ktimer = &vcpu->arch.apic->lapic_timer;
+   u32 timer_advance_ns = ktimer->timer_advance_ns;
u64 ns;
 
+   /* periodic revaluate */
+   if (unlikely(ktimer->timer_advance_adjust_done)) {
+   ktimer->recalc_timer_advance_ns = jiffies +
+   LAPIC_TIMER_ADVANCE_RECALC_PERIOD;
+   if (abs(advance_expire_delta) > 
LAPIC_TIMER_ADVANCE_ADJUST_DONE) {
+   timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
+   ktimer->timer_advance_adjust_done = false;
+   } else
+   return;
+   }
+
/* too early */
if (advance_expire_delta < 0) {
ns = -advance_expire_delta * 100ULL;
@@ -1499,12 +1511,18 @@ static inline void adjust_lapic_timer_advance(struct 
kvm_vcpu *vcpu,
}
 
if (abs(advance_expire_delta) < LAPIC_TIMER_ADVANCE_ADJUST_DONE)
-   apic->lapic_timer.timer_advance_adjust_done = true;
+   ktimer->timer_advance_adjust_done = true;
if (unlikely(timer_advance_ns > 5000)) {
timer_advance_ns = LAPIC_TIMER_ADVANCE_ADJUST_INIT;
-   apic->lapic_timer.timer_advance_adjust_done = false;
+   ktimer->timer_advance_adjust_done = false;
+   }
+   ktimer->timer_advance_ns = timer_advance_ns;
+
+   if (ktimer->timer_advance_adjust_done) {
+   if (ktimer->min_timer_advance_ns > timer_advance_ns)
+   ktimer->min_timer_advance_ns = timer_advance_ns;
+   ktimer->timer_advance_ns = ktimer->min_timer_advance_ns;
}
-   apic->lapic_timer.timer_advance_ns = timer_advance_ns;
 }
 
 static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
@@ -1523,7 +1541,8 @@ static void __kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
if (guest_tsc < tsc_deadline)
__wait_lapic_expire(vcpu, tsc_deadline - guest_tsc);
 
-   if (unlikely(!apic->lapic_timer.timer_advance_adjust_done))
+   if (unlikely(!apic->lapic_timer.timer_advance_adjust_done) ||
+   time_before(apic->lapic_timer.recalc_timer_advance_ns, jiffies))
adjust_lapic_timer_advance(vcpu, 
apic->lapic_timer.advance_expire_delta);
 }
 
@@ -2301,9 +2320,12 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int 
timer_advance_ns)
if (timer_advance_ns == -1) {
apic->lapic_timer.timer_advance_ns = 
LAPIC_TIMER_ADVANCE_ADJUST_INIT;
apic->lapic_timer.timer_advance_adjust_done = false;
+   apic->lapic_timer.recalc_timer_advance_ns = jiffies;
+   apic->lapic_timer.min_timer_advance_ns = UINT_MAX;
} else {
apic->lapic_timer.timer_advance_ns = timer_advance_ns;
apic->lapic_timer.timer_advance_adjust_done = true;
+   apic->lapic_timer.recalc_timer_advance_ns = MAX_JIFFY_OFFSET;
}
 
 
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 50053d2..56a05eb 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -31,6 +31,8 @@ struct kvm_timer {
u32 timer_mode_mask;

Re: clk/clk-next boot bisection: v5.3-rc1-79-g31f58d2f58cb on sun8i-h3-libretech-all-h3-cc

2019-08-14 Thread Stephen Boyd

Quoting kernelci.org bot (2019-08-14 20:35:25)
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has  *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.  *
> *   *
> * If you do send a fix, please include this trailer:*
> *   Reported-by: "kernelci.org bot"   *
> *   *
> * Hope this helps!  *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> clk/clk-next boot bisection: v5.3-rc1-79-g31f58d2f58cb on 
> sun8i-h3-libretech-all-h3-cc

If this is the only board that failed, great! Must be something in a
sun8i driver that uses the init structure after registration.

> 
> Summary:
>   Start:  31f58d2f58cb Merge branch 'clk-meson' into clk-next
>   Details:https://kernelci.org/boot/id/5d54b9d159b514324cf1226e
>   Plain log:  
> https://storage.kernelci.org//clk/clk-next/v5.3-rc1-79-g31f58d2f58cb/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-8/lab-baylibre/boot-sun8i-h3-libretech-all-h3-cc.txt
>   HTML log:   
> https://storage.kernelci.org//clk/clk-next/v5.3-rc1-79-g31f58d2f58cb/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-8/lab-baylibre/boot-sun8i-h3-libretech-all-h3-cc.html
>   Result: c82987e740d1 clk: Overwrite clk_hw::init with NULL during 
> clk_register()
> 
> Checks:
>   revert: PASS
>   verify: PASS
> 
> Parameters:
>   Tree:   clk
>   URL:https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git
>   Branch: clk-next
>   Target: sun8i-h3-libretech-all-h3-cc
>   CPU arch:   arm
>   Lab:lab-baylibre
>   Compiler:   gcc-8
>   Config: multi_v7_defconfig+CONFIG_SMP=n
>   Test suite: boot
> 
> Breaking commit found:
> 
> ---
> commit c82987e740d12be98b8ae8aa9221b8b9e2541271
> Author: Stephen Boyd 
> Date:   Wed Jul 31 12:35:17 2019 -0700
> 
> clk: Overwrite clk_hw::init with NULL during clk_register()
> 
> We don't want clk provider drivers to use the init structure after clk
> registration time, but we leave a dangling reference to it by means of
> clk_hw::init. Let's overwrite the member with NULL during clk_register()
> so that this can't be used anymore after registration time.
> 
> Cc: Bjorn Andersson 
> Cc: Doug Anderson 
> Signed-off-by: Stephen Boyd 
> Link: https://lkml.kernel.org/r/20190731193517.237136-10-sb...@kernel.org
> Reviewed-by: Sylwester Nawrocki 
> 
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index c0990703ce54..efac620264a2 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -3484,9 +3484,9 @@ static int clk_cpy_name(const char **dst_p, const char 
> *src, bool must_exist)
> return 0;
>  }
>  
> -static int clk_core_populate_parent_map(struct clk_core *core)
> +static int clk_core_populate_parent_map(struct clk_core *core,
> +   const struct clk_init_data *init)
>  {
> -   const struct clk_init_data *init = core->hw->init;
> u8 num_parents = init->num_parents;
> const char * const *parent_names = init->parent_names;
> const struct clk_hw **parent_hws = init->parent_hws;
> @@ -3566,6 +3566,14 @@ __clk_register(struct device *dev, struct device_node 
> *np, struct clk_hw *hw)
>  {
> int ret;
> struct clk_core *core;
> +   const struct clk_init_data *init = hw->init;
> +
> +   /*
> +* The init data is not supposed to be used outside of registration 
> path.
> +* Set it to NULL so that provider drivers can't use it either and so 
> that
> +* we catch use of hw->init early on in the core.
> +*/
> +   hw->init = NULL;
>  
> core = kzalloc(sizeof(*core), GFP_KERNEL);
> if (!core) {
> @@ -3573,17 +3581,17 @@ __clk_register(struct device *dev, struct device_node 
> *np, struct clk_hw *hw)
> goto fail_out;
> }
>  
> -   core->name = kstrdup_const(hw->init->name, GFP_KERNEL);
> +   core->name = kstrdup_const(init->name, GFP_KERNEL);
> if (!core->name) {
> ret = -ENOMEM;
> goto fail_name;
> }
>  
> -   if (WARN_ON(!hw->init->ops)) {
> +   if (WARN_ON(!init->ops)) {
> ret = -EINVAL;
> goto fail_ops;
> }
> -   core->ops = hw->init->ops;
> +   core->ops = init->ops;
>  
> if (dev && pm_runtime_enabled(dev))
> core->rpm_enabled = true;
> @@ -3592,13 +3600,13 @@ __clk_register(struct device *dev, struct device_node 
> *np, struct clk_hw *hw)
> if (dev && dev->driver)
>

[PATCH] perf vendor events intel: Add Tremontx event file v1.02

2019-08-14 Thread Haiyan Song

Add an Intel Tremontx event file for perf.

Signed-off-by: Haiyan Song 
---
 tools/perf/pmu-events/arch/x86/mapfile.csv |   1 +
 tools/perf/pmu-events/arch/x86/tremontx/cache.json | 111 ++
 .../pmu-events/arch/x86/tremontx/frontend.json |  26 ++
 .../perf/pmu-events/arch/x86/tremontx/memory.json  |  26 ++
 tools/perf/pmu-events/arch/x86/tremontx/other.json |  26 ++
 .../pmu-events/arch/x86/tremontx/pipeline.json | 111 ++
 .../arch/x86/tremontx/uncore-memory.json   |  73 
 .../pmu-events/arch/x86/tremontx/uncore-other.json | 431 +
 .../pmu-events/arch/x86/tremontx/uncore-power.json |  11 +
 .../arch/x86/tremontx/virtual-memory.json  |  86 
 10 files changed, 902 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/cache.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/frontend.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/pipeline.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-memory.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-other.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/uncore-power.json
 create mode 100644 tools/perf/pmu-events/arch/x86/tremontx/virtual-memory.json

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv 
b/tools/perf/pmu-events/arch/x86/mapfile.csv
index b90e5fec2f32..745ced083844 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -35,4 +35,5 @@ GenuineIntel-6-55-[01234],v1,skylakex,core
 GenuineIntel-6-55-[56789ABCDEF],v1,cascadelakex,core
 GenuineIntel-6-7D,v1,icelake,core
 GenuineIntel-6-7E,v1,icelake,core
+GenuineIntel-6-86,v1,tremontx,core
 AuthenticAMD-23-[[:xdigit:]]+,v1,amdfam17h,core
diff --git a/tools/perf/pmu-events/arch/x86/tremontx/cache.json 
b/tools/perf/pmu-events/arch/x86/tremontx/cache.json
new file mode 100644
index ..f88040171b4d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/x86/tremontx/cache.json
@@ -0,0 +1,111 @@
+[
+{
+"CollectPEBSRecord": "2",
+"PublicDescription": "Counts cacheable memory requests that miss in 
the the Last Level Cache.  Requests include Demand Loads, Reads for 
Ownership(RFO), Instruction fetches and L1 HW prefetches. If the platform has 
an L3 cache, last level cache is the L3, otherwise it is the L2.",
+"EventCode": "0x2e",
+"Counter": "0,1,2,3",
+"UMask": "0x41",
+"PEBScounters": "0,1,2,3",
+"EventName": "LONGEST_LAT_CACHE.MISS",
+"PDIR_COUNTER": "na",
+"SampleAfterValue": "23",
+"BriefDescription": "Counts memory requests originating from the core 
that miss in the last level cache. If the platform has an L3 cache, last level 
cache is the L3, otherwise it is the L2."
+},
+{
+"CollectPEBSRecord": "2",
+"PublicDescription": "Counts cacheable memory requests that access the 
Last Level Cache.  Requests include Demand Loads, Reads for Ownership(RFO), 
Instruction fetches and L1 HW prefetches. If the platform has an L3 cache, last 
level cache is the L3, otherwise it is the L2.",
+"EventCode": "0x2e",
+"Counter": "0,1,2,3",
+"UMask": "0x4f",
+"PEBScounters": "0,1,2,3",
+"EventName": "LONGEST_LAT_CACHE.REFERENCE",
+"PDIR_COUNTER": "na",
+"SampleAfterValue": "23",
+"BriefDescription": "Counts memory requests originating from the core 
that reference a cache line in the last level cache. If the platform has an L3 
cache, last level cache is the L3, otherwise it is the L2."
+},
+{
+"PEBS": "1",
+"CollectPEBSRecord": "2",
+"PublicDescription": "Counts the number of load uops retired. This 
event is Precise Event capable",
+"EventCode": "0xd0",
+"Counter": "0,1,2,3",
+"UMask": "0x81",
+"PEBScounters": "0,1,2,3",
+"EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
+"SampleAfterValue": "23",
+"BriefDescription": "Counts the number of load uops retired.",
+"Data_LA": "1"
+},
+{
+"PEBS": "1",
+"CollectPEBSRecord": "2",
+"PublicDescription": "Counts the number of store uops retired. This 
event is Precise Event capable",
+"EventCode": "0xd0",
+"Counter": "0,1,2,3",
+"UMask": "0x82",
+"PEBScounters": "0,1,2,3",
+"EventName": "MEM_UOPS_RETIRED.ALL_STORES",
+"SampleAfterValue": "23",
+"BriefDescription": "Counts the number of store uops retired.",
+"Data_LA": "1"
+},
+{
+"PEBS": "1",
+"CollectPEBSRecord": "2",
+"EventCode": "0xd1",
+"Counter": "0,1,2,3",
+"UMask": "0x1",
+"PEBScounters": "0,1,2,3",
+"EventName":

Re: [5.3.0-rc4-next][bisected 882632][qla2xxx] WARNING: CPU: 10 PID: 425 at drivers/scsi/qla2xxx/qla_isr.c:2784 qla2x00_status_entry.isra

2019-08-14 Thread Bart Van Assche


On 8/14/19 10:18 AM, Abdul Haleem wrote:

On Wed, 2019-08-14 at 10:05 -0700, Bart Van Assche wrote:

On 8/14/19 9:52 AM, Abdul Haleem wrote:

Greetings,

Today's linux-next kernel (5.3.0-rc4-next-20190813) booted with a warning on my
PowerPC POWER8 LPAR.

The WARN_ON_ONCE() was introduced by commit 88263208 (scsi: qla2xxx: Complain if 
sp->done() is not...)

boot logs:

WARNING: CPU: 10 PID: 425 at drivers/scsi/qla2xxx/qla_isr.c:2784


Hi Abdul,

Thank you for having reported this. Is that the only warning reported on your 
setup by the qla2xxx
driver? If that warning is commented out, does the qla2xxx driver work as 
expected?


The boot warning did not show up when the commit was reverted.

Should I comment out only the WARN_ON_ONCE() that is causing the issue,
and not the other one?


Yes please. Commit 88263208 introduced five kernel warnings but I think 
only one of these should be removed again, e.g. as follows:


diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index cd39ac18c5fd..d81b5ecce24b 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -2780,8 +2780,6 @@ qla2x00_status_entry(scsi_qla_host_t *vha, struct 
rsp_que *rsp, void *pkt)


if (rsp->status_srb == NULL)
sp->done(sp, res);
-   else
-   WARN_ON_ONCE(true);
 }

 /**

[PATCH v5] arm64: dts: ls1028a: Add esdhc node in dts

2019-08-14 Thread Yinbo Zhu

From: Ashish Kumar 

Add the esdhc nodes and enable SD UHS-I and eMMC HS200
for the ls1028ardb/ls1028aqds boards.

Signed-off-by: Ashish Kumar 
Signed-off-by: Yangbo Lu 
Signed-off-by: Yinbo Zhu 
---
Change in v5:
Fix indent.

 arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts |  8 +++
 arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts | 13 +++
 arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi| 27 +++
 3 files changed, 48 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts 
b/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
index de6ef39..5e14e5a 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-qds.dts
@@ -95,6 +95,14 @@
status = "okay";
 };
 
+ {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+};
+
  {
status = "okay";
 
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts 
b/arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts
index 9fb9113..1a69221 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts
@@ -83,6 +83,19 @@
};
 };
 
+ {
+   sd-uhs-sdr104;
+   sd-uhs-sdr50;
+   sd-uhs-sdr25;
+   sd-uhs-sdr12;
+   status = "okay";
+};
+
+ {
+   mmc-hs200-1_8v;
+   status = "okay";
+};
+
  {
status = "okay";
 
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi 
b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
index 7975519..f299075 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
@@ -245,6 +245,33 @@
status = "disabled";
};
 
+   esdhc: mmc@214 {
+   compatible = "fsl,ls1028a-esdhc", "fsl,esdhc";
+   reg = <0x0 0x214 0x0 0x1>;
+   interrupts = ;
+   clock-frequency = <0>; /* fixed up by bootloader */
+   clocks = < 2 1>;
+   voltage-ranges = <1800 1800 3300 3300>;
+   sdhci,auto-cmd12;
+   little-endian;
+   bus-width = <4>;
+   status = "disabled";
+   };
+
+   esdhc1: mmc@215 {
+   compatible = "fsl,ls1028a-esdhc", "fsl,esdhc";
+   reg = <0x0 0x215 0x0 0x1>;
+   interrupts = ;
+   clock-frequency = <0>; /* fixed up by bootloader */
+   clocks = < 2 1>;
+   voltage-ranges = <1800 1800 3300 3300>;
+   sdhci,auto-cmd12;
+   broken-cd;
+   little-endian;
+   bus-width = <4>;
+   status = "disabled";
+   };
+
duart0: serial@21c0500 {
compatible = "fsl,ns16550", "ns16550a";
reg = <0x00 0x21c0500 0x0 0x100>;
-- 
2.9.5

clk/clk-next boot bisection: v5.3-rc1-79-g31f58d2f58cb on sun8i-h3-libretech-all-h3-cc

2019-08-14 Thread kernelci.org bot

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* This automated bisection report was sent to you on the basis  *
* that you may be involved with the breaking commit it has      *
* found.  No manual investigation has been done to verify it,   *
* and the root cause of the problem may be somewhere else.      *
*                                                               *
* If you do send a fix, please include this trailer:            *
*   Reported-by: "kernelci.org bot"                             *
*                                                               *
* Hope this helps!                                              *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

clk/clk-next boot bisection: v5.3-rc1-79-g31f58d2f58cb on 
sun8i-h3-libretech-all-h3-cc

Summary:
  Start:      31f58d2f58cb Merge branch 'clk-meson' into clk-next
  Details:    https://kernelci.org/boot/id/5d54b9d159b514324cf1226e
  Plain log:  https://storage.kernelci.org//clk/clk-next/v5.3-rc1-79-g31f58d2f58cb/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-8/lab-baylibre/boot-sun8i-h3-libretech-all-h3-cc.txt
  HTML log:   https://storage.kernelci.org//clk/clk-next/v5.3-rc1-79-g31f58d2f58cb/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-8/lab-baylibre/boot-sun8i-h3-libretech-all-h3-cc.html
  Result:     c82987e740d1 clk: Overwrite clk_hw::init with NULL during clk_register()

Checks:
  revert: PASS
  verify: PASS

Parameters:
  Tree:       clk
  URL:        https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git
  Branch:     clk-next
  Target:     sun8i-h3-libretech-all-h3-cc
  CPU arch:   arm
  Lab:        lab-baylibre
  Compiler:   gcc-8
  Config:     multi_v7_defconfig+CONFIG_SMP=n
  Test suite: boot

Breaking commit found:

---
commit c82987e740d12be98b8ae8aa9221b8b9e2541271
Author: Stephen Boyd 
Date:   Wed Jul 31 12:35:17 2019 -0700

clk: Overwrite clk_hw::init with NULL during clk_register()

We don't want clk provider drivers to use the init structure after clk
registration time, but we leave a dangling reference to it by means of
clk_hw::init. Let's overwrite the member with NULL during clk_register()
so that this can't be used anymore after registration time.

Cc: Bjorn Andersson 
Cc: Doug Anderson 
Signed-off-by: Stephen Boyd 
Link: https://lkml.kernel.org/r/20190731193517.237136-10-sb...@kernel.org
Reviewed-by: Sylwester Nawrocki 
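
A minimal user-space sketch of the pattern this commit message describes, under
stated assumptions: the structs and `sketch_register()` below are hypothetical
stand-ins, not the clk framework's types or API. The registration path copies
the init pointer into a local, clears the member, and uses only the local from
then on.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for clk_init_data, clk_core and clk_hw. */
struct sketch_init_data {
	const char *name;
	unsigned int num_parents;
};

struct sketch_core {
	char *name;
	unsigned int num_parents;
};

struct sketch_hw {
	const struct sketch_init_data *init;
	struct sketch_core *core;
};

/* Returns 0 on success, -1 on failure. */
int sketch_register(struct sketch_hw *hw)
{
	/* Keep a local copy of the pointer, then clear the member. */
	const struct sketch_init_data *init = hw->init;
	struct sketch_core *core;

	hw->init = NULL;
	if (!init || !init->name)
		return -1;

	core = calloc(1, sizeof(*core));
	if (!core)
		return -1;

	/* Everything below reads the local 'init', never hw->init. */
	core->name = malloc(strlen(init->name) + 1);
	if (!core->name) {
		free(core);
		return -1;
	}
	strcpy(core->name, init->name);
	core->num_parents = init->num_parents;
	hw->core = core;
	return 0;
}
```

In this sketch, a caller that reads `hw->init` after `sketch_register()` returns
sees NULL, which is the kind of late use the commit message says it wants to
rule out.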

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index c0990703ce54..efac620264a2 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -3484,9 +3484,9 @@ static int clk_cpy_name(const char **dst_p, const char 
*src, bool must_exist)
return 0;
 }
 
-static int clk_core_populate_parent_map(struct clk_core *core)
+static int clk_core_populate_parent_map(struct clk_core *core,
+   const struct clk_init_data *init)
 {
-   const struct clk_init_data *init = core->hw->init;
u8 num_parents = init->num_parents;
const char * const *parent_names = init->parent_names;
const struct clk_hw **parent_hws = init->parent_hws;
@@ -3566,6 +3566,14 @@ __clk_register(struct device *dev, struct device_node 
*np, struct clk_hw *hw)
 {
int ret;
struct clk_core *core;
+   const struct clk_init_data *init = hw->init;
+
+   /*
+* The init data is not supposed to be used outside of registration 
path.
+* Set it to NULL so that provider drivers can't use it either and so 
that
+* we catch use of hw->init early on in the core.
+*/
+   hw->init = NULL;
 
core = kzalloc(sizeof(*core), GFP_KERNEL);
if (!core) {
@@ -3573,17 +3581,17 @@ __clk_register(struct device *dev, struct device_node 
*np, struct clk_hw *hw)
goto fail_out;
}
 
-   core->name = kstrdup_const(hw->init->name, GFP_KERNEL);
+   core->name = kstrdup_const(init->name, GFP_KERNEL);
if (!core->name) {
ret = -ENOMEM;
goto fail_name;
}
 
-   if (WARN_ON(!hw->init->ops)) {
+   if (WARN_ON(!init->ops)) {
ret = -EINVAL;
goto fail_ops;
}
-   core->ops = hw->init->ops;
+   core->ops = init->ops;
 
if (dev && pm_runtime_enabled(dev))
core->rpm_enabled = true;
@@ -3592,13 +3600,13 @@ __clk_register(struct device *dev, struct device_node 
*np, struct clk_hw *hw)
if (dev && dev->driver)
core->owner = dev->driver->owner;
core->hw = hw;
-   core->flags = hw->init->flags;
-   core->num_parents = hw->init->num_parents;
+   core->flags = init->flags;
+   core->num_parents = init->num_parents;
core->min_rate = 0;
core->max_rate = ULONG_MAX;
hw->core = core;
 
-   ret = clk_core_populate_parent_map(core);
+   ret =

Re: [PATCH v3 2/2] pwm: sprd: Add Spreadtrum PWM support

2019-08-14 Thread Baolin Wang

Hi Uwe,

On Wed, 14 Aug 2019 at 23:03, Uwe Kleine-König
 wrote:
>
> On Wed, Aug 14, 2019 at 08:46:11PM +0800, Baolin Wang wrote:

> > +
> > + /*
> > +  * The hardware provides a counter that is feed by the source clock.
> > +  * The period length is (PRESCALE + 1) * MOD counter steps.
> > +  * The duty cycle length is (PRESCALE + 1) * DUTY counter steps.
> > +  * Thus the period_ns and duty_ns calculation formula should be:
> > +  * period_ns = NSEC_PER_SEC * (prescale + 1) * mod / clk_rate
> > +  * duty_ns = NSEC_PER_SEC * (prescale + 1) * duty / clk_rate
> > +  */
> > + val = sprd_pwm_read(spc, pwm->hwpwm, SPRD_PWM_PRESCALE);
> > + prescale = val & SPRD_PWM_PRESCALE_MSK;
> > + tmp = (prescale + 1) * NSEC_PER_SEC * SPRD_PWM_MOD_MAX;
> > + state->period = DIV_ROUND_CLOSEST_ULL(tmp, chn->clk_rate);
> > +
> > + val = sprd_pwm_read(spc, pwm->hwpwm, SPRD_PWM_DUTY);
> > + duty = val & SPRD_PWM_DUTY_MSK;
> > + tmp = (prescale + 1) * NSEC_PER_SEC * duty;
> > + state->duty_cycle = DIV_ROUND_CLOSEST_ULL(tmp, chn->clk_rate);
> > +
> > + /* Disable PWM clocks if the PWM channel is not in enable state. */
> > + if (!state->enabled)
> > + clk_bulk_disable_unprepare(SPRD_PWM_CHN_CLKS_NUM, chn->clks);
> > +}
> > +
> > +static int sprd_pwm_config(struct sprd_pwm_chip *spc, struct pwm_device 
> > *pwm,
> > +int duty_ns, int period_ns)
> > +{
> > + struct sprd_pwm_chn *chn = >chn[pwm->hwpwm];
> > + u32 prescale, duty;
> > + u64 tmp;
> > +
> > + /*
> > +  * The hardware provides a counter that is feed by the source clock.
> > +  * The period length is (PRESCALE + 1) * MOD counter steps.
> > +  * The duty cycle length is (PRESCALE + 1) * DUTY counter steps.
> > +  *
> > +  * To keep the maths simple we're always using MOD = SPRD_PWM_MOD_MAX.
>
> Did you give some thought to how wrong your period can get because
> of that "laziness"?
>
> Let's assume a clk rate of 100/3 MHz. Then the available period lengths
> are:
>
> PRESCALE =  0  ->  period =   7.65 µs
> PRESCALE =  1  ->  period =  15.30 µs
> ...
> PRESCALE = 17  ->  period = 137.70 µs
> PRESCALE = 18  ->  period = 145.35 µs
>
> So the error can be up to (nearly) 7.65 µs (or in general

Yes, but for our use case (PWM backlight), the precision meets our
requirement. Moreover, we usually do not change the period; we just
adjust the duty cycle to change the backlight.

> 255 / clk_rate) because if 145.34 µs is requested you configure
> PRESCALE = 17 and so yield a period of 137.70 µs. If however you'd pick

I don't follow you here: if the requested period is 145.34 µs, we still
get PRESCALE = 18 from the formula below:

tmp = (u64)chn->clk_rate * period_ns;
do_div(tmp, NSEC_PER_SEC);
prescale = DIV_ROUND_CLOSEST_ULL(tmp, SPRD_PWM_MOD_MAX) - 1;

> PRESCALE = 18 and MOD = 254 you get a period of 144.78 µs and so the
> error is only 0.56 µs which is a factor of 13 better.
>
> Hmm.
>
> > +  * The value for PRESCALE is selected such that the resulting period
> > +  * gets the maximal length not bigger than the requested one with the
> > +  * given settings (MOD = SPRD_PWM_MOD_MAX and input clock).
> > +  */
> > + duty = duty_ns * SPRD_PWM_MOD_MAX / period_ns;
>
> I wonder if you lose some precision here as you use period_ns but might
> actually implement a shorter period.
>
> Quick example, again consider clk_rate = 100 / 3 MHz,
> period_ns = 145340, duty_ns = 72670. Then you end up with
>
> PRESCALE = 17
> MOD = 255
> DUTY = 127

Incorrect, we will get PRESCALE = 18,  MOD = 255, DUTY = 127.

> That corresponds to period_ns = 137700, duty_ns = 68580. With DUTY = 134
> you get 72360 ns which is still smaller than the requested 72670 ns.

Incorrect, with DUTY = 134 (PRESCALE = 18  ->  period = 145.35 µs),
duty_ns = 76380ns

> (But then again it is not obvious which of the two is the "better"
> approximation because Thierry doesn't seem to see the necessity to
> discuss or define a policy here.)

Like I said, this is the simple calculation formula which can meet our
requirement (we limit our DUTY value to the range 0 - 254).
duty = duty_ns * SPRD_PWM_MOD_MAX / period_ns;
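
The figures traded in this exchange can be checked with a small user-space
sketch. It is not the driver code: the function names are hypothetical, the
clock rate of 100/3 MHz is approximated as 33333333 Hz, and MOD is fixed at 255
as in the patch. The rounding helper mirrors what DIV_ROUND_CLOSEST_ULL() does
for unsigned operands.

```c
#include <stdint.h>

#define NSEC_PER_SEC	1000000000ULL
#define SKETCH_MOD_MAX	255ULL

/* Round-to-closest unsigned division. */
static uint64_t div_round_closest(uint64_t n, uint64_t d)
{
	return (n + d / 2) / d;
}

/* PRESCALE from the formula quoted in the reply (MOD fixed at 255). */
uint32_t sketch_prescale(uint64_t clk_rate, uint64_t period_ns)
{
	uint64_t tmp = clk_rate * period_ns / NSEC_PER_SEC;

	return (uint32_t)(div_round_closest(tmp, SKETCH_MOD_MAX) - 1);
}

/* DUTY from the formula quoted in the reply. */
uint32_t sketch_duty(uint64_t duty_ns, uint64_t period_ns)
{
	return (uint32_t)(duty_ns * SKETCH_MOD_MAX / period_ns);
}

/* Length in ns of (prescale + 1) * steps counter steps. */
uint64_t sketch_steps_to_ns(uint64_t clk_rate, uint32_t prescale,
			    uint32_t steps)
{
	uint64_t tmp = (uint64_t)(prescale + 1) * NSEC_PER_SEC * steps;

	return div_round_closest(tmp, clk_rate);
}
```

For a request of 72670 / 145340 ns this gives PRESCALE = 18 and DUTY = 127,
that is a period of 145350 ns (slightly longer than requested) and a duty cycle
of 72390 ns. PRESCALE = 17 would correspond to a 137700 ns period.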

>
> (And to pick up the thoughts about not using SPRD_PWM_MOD_MAX
> unconditionally, you could also use
>
> PRESCALE = 18
> MOD = 254
> DUTY = 127
>
> yielding period_ns = 144780 and duty_ns = 72390. Summary:
>
> Request: 72670 / 145340
> your result: 68580 / 137700
> also possible:   72390 / 144780
>
> Judge yourself.)
>
> > + tmp = (u64)chn->clk_rate * period_ns;
> > + do_div(tmp, NSEC_PER_SEC);
> > + prescale = DIV_ROUND_CLOSEST_ULL(tmp, SPRD_PWM_MOD_MAX) - 1;
>
> Now that you use DIV_ROUND_CLOSEST_ULL the comment is wrong because you
> might provide a period bigger than the requested one. Also you

Re: [PATCH V5 0/9] Fixes for vhost metadata acceleration

2019-08-14 Thread Jason Wang




On 2019/8/14 12:41 AM, Christoph Hellwig wrote:

On Tue, Aug 13, 2019 at 08:57:07AM -0300, Jason Gunthorpe wrote:

On Tue, Aug 13, 2019 at 04:31:07PM +0800, Jason Wang wrote:


What kind of issues do you see? Spinlock is to synchronize GUP with MMU
notifier in this series.

A GUP that can't sleep can't pagefault which makes it a really weird
pattern

get_user_pages/get_user_pages_fast must not be called under a spinlock.
We have the somewhat misnamed __get_user_pages_fast that just does a
lookup for existing pages and never faults for a few places that need
to do that lookup from contexts where we can't sleep.



Yes, I do use __get_user_pages_fast() in the code.

Thanks

Re: [PATCH V5 0/9] Fixes for vhost metadata acceleration

2019-08-14 Thread Jason Wang




On 2019/8/13 7:57 PM, Jason Gunthorpe wrote:

On Tue, Aug 13, 2019 at 04:31:07PM +0800, Jason Wang wrote:


What kind of issues do you see? Spinlock is to synchronize GUP with MMU
notifier in this series.

A GUP that can't sleep can't pagefault which makes it a really weird
pattern



My understanding is that __get_user_pages_fast() assumes the caller can fail
or has a fallback. And we have a graceful fallback to copy_{to|from}_user().






Btw, back to the original question. May I know why synchronize_rcu() is not
suitable? Consider:

We already went over this. You'd need to determine it doesn't somehow
deadlock the mm on reclaim paths. Maybe it is OK, the rcu_gp_wq is
marked WQ_MEM_RECLAIM at least..



Yes, will take a look at this.




I also think Michael was concerned about the latency spikes a long RCU
delay would cause.



I don't think it's a real problem, considering that the MMU notifier could be
preempted or blocked.


Thanks




Jason

Re: [PATCH] virtio-net: lower min ring num_free for efficiency

2019-08-14 Thread Jason Wang




On 2019/8/15 11:11 AM, 冉 jiang wrote:

On 2019/8/15 11:01, Jason Wang wrote:

On 2019/8/14 10:06 AM, 冉 jiang wrote:

This change lowers ring buffer reclaim threshold from 1/2*queue to budget
for better performance. According to our test with qemu + dpdk, packet
dropping happens when the guest is not able to provide free buffer in
avail ring timely with default 1/2*queue. The value in the patch has been
tested and does show better performance.


Please add your tests setup and result here.

Thanks



Signed-off-by: jiangkidd 
---
   drivers/net/virtio_net.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0d4115c9e20b..bc08be7925eb 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1331,7 +1331,7 @@ static int virtnet_receive(struct receive_queue
*rq, int budget,
   }
   }
   -    if (rq->vq->num_free > virtqueue_get_vring_size(rq->vq) / 2) {
+    if (rq->vq->num_free > min((unsigned int)budget,
virtqueue_get_vring_size(rq->vq)) / 2) {
   if (!try_fill_recv(vi, rq, GFP_ATOMIC))
   schedule_delayed_work(>refill, 0);
   }

Sure, here are the details:



Thanks for the details, but I meant it's better if you could summarize
your test results in the commit log in a compact way.


Btw, some comments, see below:





Test setup & result:



Below is a snippet from our test results. Test1 was done with the default
driver, using the value of 1/2 * queue, while test2 is with my patch. The
average number of dropped packets decreases a lot in test2.

test1Time  avgDropPackets  test2Time  avgDropPackets  pps
16:21.0    12.295          56:50.4    0               300k
17:19.1    15.244          56:50.4    0               300k
18:17.5    18.789          56:50.4    0               300k
19:15.1    14.208          56:50.4    0               300k
20:13.2    20.818          56:50.4    0.267           300k
21:11.2    12.397          56:50.4    0               300k
22:09.3    12.599          56:50.4    0               300k
23:07.3    15.531          57:48.4    0               300k
24:05.5    13.664          58:46.5    0               300k
25:03.7    13.158          59:44.5    4.73            300k
26:01.1    2.486           00:42.6    0               300k
26:59.1    11.241          01:40.6    0               300k
27:57.2    20.521          02:38.6    0               300k
28:55.2    30.094          03:36.7    0               300k
29:53.3    16.828          04:34.7    0.963           300k
30:51.3    46.916          05:32.8    0               400k
31:49.3    56.214          05:32.8    0               400k
32:47.3    58.69           05:32.8    0               400k
33:45.3    61.486          05:32.8    0               400k
34:43.3    72.175          05:32.8    0.598           400k
35:41.3    56.699          05:32.8    0               400k
36:39.3    61.071          05:32.8    0               400k
37:37.3    43.355          06:30.8    0               400k
38:35.4    44.644          06:30.8    0               400k
39:33.4    72.336          06:30.8    0               400k
40:31.4    70.676          06:30.8    0               400k
41:29.4    108.009         06:30.8    0               400k
42:27.4    65.216          06:30.8    0               400k



Why is there a difference in test time? Could you summarize them like:

Test setup: e.g testpmd or pktgen to generate packets to guest

avg packets drop before: XXX

avg packets drop after: YYY(-ZZZ%)

Thanks





Data to prove why the patch helps:



We have completed several rounds of tests with the value set to
budget (64 as the default value). It improves a lot when the rate is
below 400pps for a single stream. We are confident that the avail ring runs
out of free buffers when packet dropping happens, based on the systemtap
script below:

Just a snippet:

probe module("virtio_ring").function("virtqueue_get_buf")
{
   x = (@cast($_vq, "vring_virtqueue")->vring->used->idx)-
(@cast($_vq, "vring_virtqueue")->last_used_idx) ---> we use this one
to verify if the queue is full, which means guest is not able to take
buffer from the queue timely

   if (x<0 && (x+65535)<4096)
       x = x+65535

   if((x==1024) && @cast($_vq, "vring_virtqueue")->vq->callback ==
callback_addr)
       netrxcount[x] <<< gettimeofday_s()
}


probe module("virtio_ring").function("virtqueue_add_inbuf")
{
   y = (@cast($vq, "vring_virtqueue")->vring->avail->idx)-
(@cast($vq, "vring_virtqueue")->vring->used->idx) ---> we use this one
to verify if we run out of free buffer in avail ring
   if (y<0 && (y+65535)<4096)
       y = y+65535

   if(@2=="debugon")
   {
       if(y==0 && @cast($vq, "vring_virtqueue")->vq->callback ==
callback_addr)
       {
           netrxfreecount[y] <<< gettimeofday_s()

           printf("no avail ring left seen, printing most recent 5
num free, vq: %lx, current index: %d\n", $vq, recentfreecount)
           for(i=recentfreecount; i!=((recentfreecount+4) % 5);
i=((i+1) % 5))
           {
               printf("index: %d, num free: %d\n", i, recentfree[$vq,
i])
           }

           printf("index: %d, num free: %d\n", i, recentfree[$vq, i])
           //exit()
       }
   }
}
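
The two probes above subtract ring indices and then correct negative results by
hand. Assuming the indices are free-running 16-bit counters, as the vring `idx`
fields are, the same distance can be sketched in C with unsigned 16-bit
arithmetic, which wraps modulo 65536 (the script adds 65535, one less than that
modulus). `sketch_ring_delta()` is a hypothetical helper for illustration, not
a virtio API.

```c
#include <stdint.h>

/* Distance from 'older' to 'newer' for free-running 16-bit ring indices. */
uint16_t sketch_ring_delta(uint16_t newer, uint16_t older)
{
	/* The subtraction is done in int; the cast reduces it modulo 65536. */
	return (uint16_t)(newer - older);
}
```

With this form no explicit check for a negative difference is needed, because
the cast folds the wrapped case back into the range 0..65535.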


probe

[PATCH] arm64: dts: imx8mn: Add gpio-ranges property

2019-08-14 Thread Anson . Huang

From: Anson Huang 

Add the "gpio-ranges" property to establish the mapping between GPIOs
and pins for the i.MX8MN pinctrl driver.

Signed-off-by: Anson Huang 
---
 arch/arm64/boot/dts/freescale/imx8mn.dtsi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi 
b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
index f5eff35..1d8899b 100644
--- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
@@ -173,6 +173,7 @@
#gpio-cells = <2>;
interrupt-controller;
#interrupt-cells = <2>;
+   gpio-ranges = < 0 10 30>;
};
 
gpio2: gpio@3021 {
@@ -185,6 +186,7 @@
#gpio-cells = <2>;
interrupt-controller;
#interrupt-cells = <2>;
+   gpio-ranges = < 0 40 21>;
};
 
gpio3: gpio@3022 {
@@ -197,6 +199,7 @@
#gpio-cells = <2>;
interrupt-controller;
#interrupt-cells = <2>;
+   gpio-ranges = < 0 61 26>;
};
 
gpio4: gpio@3023 {
@@ -209,6 +212,7 @@
#gpio-cells = <2>;
interrupt-controller;
#interrupt-cells = <2>;
+   gpio-ranges = < 21 108 11>;
};
 
gpio5: gpio@3024 {
@@ -221,6 +225,7 @@
#gpio-cells = <2>;
interrupt-controller;
#interrupt-cells = <2>;
+   gpio-ranges = < 0 119 30>;
};
 
wdog1: watchdog@3028 {
-- 
2.7.4
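
As a rough illustration of what each added property expresses, here is a small
C sketch of the mapping. It assumes the usual gpio-ranges cell layout of a pin
controller reference followed by GPIO offset, pin offset and count; the
reference itself is missing from the text above and is not reconstructed here.
The struct and function names are hypothetical.

```c
#include <stdint.h>

/* One gpio-ranges entry without the phandle: <gpio_base pin_base count>. */
struct sketch_gpio_range {
	uint32_t gpio_base;
	uint32_t pin_base;
	uint32_t count;
};

/* Returns the pin number for a GPIO line offset, or -1 if not covered. */
int sketch_gpio_to_pin(const struct sketch_gpio_range *r, uint32_t gpio)
{
	if (gpio < r->gpio_base || gpio - r->gpio_base >= r->count)
		return -1;
	return (int)(r->pin_base + (gpio - r->gpio_base));
}
```

Under that reading, the gpio4 entry (21, 108, 11) covers GPIO lines 21 to 31
and maps them to pins 108 to 118, while lines 0 to 20 of that bank are not
covered by the range.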

Re: [PATCH] sh: Drop -Werror from kernel Makefile

2019-08-14 Thread Guenter Roeck


On 8/14/19 5:59 PM, Gustavo A. R. Silva wrote:

Guenter,

On 8/13/19 8:18 AM, Guenter Roeck wrote:


Please note that _mainline_ builds are currently broken.



This should be fixed now:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=41de59634046b19cd53a1983594a95135c656997



Yes, it is.

Thanks!

Guenter

Re: [PATCH 1/2] riscv: Add memmove string operation.

2019-08-14 Thread Nick Hu

Hi Paul,

On Wed, Aug 14, 2019 at 10:03:39AM -0700, Paul Walmsley wrote:
> Hi Nick,
> 
> On Wed, 14 Aug 2019, Nick Hu wrote:
> 
> > On Wed, Aug 14, 2019 at 10:22:15AM +0800, Paul Walmsley wrote:
> > > On Tue, 13 Aug 2019, Palmer Dabbelt wrote:
> > > 
> > > > On Mon, 12 Aug 2019 08:04:46 PDT (-0700), Christoph Hellwig wrote:
> > > > > On Wed, Aug 07, 2019 at 03:19:14PM +0800, Nick Hu wrote:
> > > > > > There are some features which need this string operation for 
> > > > > > compilation,
> > > > > > like KASAN. So the purpose of this porting is for the features like 
> > > > > > KASAN
> > > > > > which cannot be compiled without it.
> > > > > > 
> > > > > > KASAN's string operations would replace the original string 
> > > > > > operations and
> > > > > > call for the architecture defined string operations. Since we don't 
> > > > > > have
> > > > > > this in current kernel, this patch provides the implementation.
> > > > > > 
> > > > > > This porting refers to the 'arch/nds32/lib/memmove.S'.
> > > > > 
> > > > > This looks sensible to me, although my stringop asm is rather rusty,
> > > > > so just an ack and not a real review-by:
> > > > 
> > > > FWIW, we just write this in C everywhere else and rely on the compiler 
> > > > to
> > > > unroll the loops.  I always prefer C to assembly when possible, so I'd 
> > > > prefer
> > > > if we just adopt the string code from newlib.  We have a RISC-V-specific
> > > > memcpy in there, but just use the generic memmove.
> > > > 
> > > > Maybe the best bet here would be to adopt the newlib memcpy/memmove as 
> > > > generic
> > > > Linux functions?  They're both in C so they should be fine, and they 
> > > > both look
> > > > faster than what's in lib/string.c.  Then everyone would benefit and we 
> > > > don't
> > > > need this tricky RISC-V assembly.  Also, from the look of it the newlib 
> > > > code
> > > > is faster because the inner loop is unrolled.
> > > 
> > > There's a generic memmove implementation in the kernel already:
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/string.h#n362
> > > 
> > > Nick, could you tell us more about why the generic memmove() isn't 
> > > suitable?
> > 
> > KASAN has its own string operations (memcpy/memmove/memset) because it
> > needs to hook some code to check the memory region. It undefines the
> > original string operations and calls the string operations with the
> > prefix '__'. But the generic string operations are not declared with
> > that prefix. Other archs with KASAN support, like arm64 and xtensa, all
> > have their own string operations defined with the prefix.
> 
> Thanks for the explanation.  What do you think about Palmer's idea to 
> define a generic C set of KASAN string operations, derived from the newlib 
> code?
> 
> 
> - Paul

That sounds good to me. But it should be another topic. We need to investigate
further before replacing something generic and fundamental in lib/string.c
with newlib C functions.  Some blind spots may exist.  So I suggest, let's
consider KASAN for now.

Nick
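
The arrangement described in this thread, where an instrumented entry point
performs a check and then calls an uninstrumented variant, can be sketched in
user-space C as follows. This is only an illustration under stated assumptions:
`sketch_raw_memmove()` stands in for the '__'-prefixed operation, the byte
counter is a placeholder for a real memory-region check, and none of these
names come from KASAN or the kernel.

```c
#include <stddef.h>
#include <stdint.h>

static unsigned long sketch_checked_bytes;

/* Stand-in for the uninstrumented, '__'-prefixed variant. */
void *sketch_raw_memmove(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	if ((uintptr_t)d < (uintptr_t)s) {
		/* Destination below source: copy forwards. */
		for (i = 0; i < len; i++)
			d[i] = s[i];
	} else if ((uintptr_t)d > (uintptr_t)s) {
		/* Destination above source: copy backwards. */
		for (i = len; i > 0; i--)
			d[i - 1] = s[i - 1];
	}
	return dst;
}

/* Stand-in for the instrumented wrapper: check first, then call the raw op. */
void *sketch_memmove(void *dst, const void *src, size_t len)
{
	sketch_checked_bytes += len;	/* placeholder for the region check */
	return sketch_raw_memmove(dst, src, len);
}

/* Number of bytes that went through the placeholder check so far. */
unsigned long sketch_checked(void)
{
	return sketch_checked_bytes;
}
```

The split matters because the wrapper needs an implementation it can call
without triggering its own check again, which is why a separately named raw
variant has to exist.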

Re: [PATCH] nbd: add a missed nbd_config_put() in nbd_xmit_timeout()

2019-08-14 Thread Mike Christie

Josef had acked my patch for the same thing here:

https://www.spinics.net/lists/linux-block/msg43800.html

so maybe Jens will pick that up with the rest of the set Josef had acked:

https://www.spinics.net/lists/linux-block/msg43809.html

to make it easier.

On 08/14/2019 08:27 PM, sunke (E) wrote:
> ping
> 
> On 2019/8/12 20:31, Sun Ke wrote:
>> If taking the lock fails, call nbd_config_put() before returning to
>> decrease nbd->config_refs.
>>
>> If nbd->config_refs is incremented but not decremented, nbd_clear_sock()
>> is never executed in nbd_config_put(), so nbd->task_setup is never
>> cleared. Eventually nbd_add_sock() prints "Device being setup by another
>> task" and the nbd device cannot be reused.
>>
>> Fixes: 8f3ea35929a0 ("nbd: handle unexpected replies better")
>> Signed-off-by: Sun Ke 
>> ---
>>   drivers/block/nbd.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index e21d2de..a69a90a 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -357,8 +357,10 @@ static enum blk_eh_timer_return
>> nbd_xmit_timeout(struct request *req,
>>   }
>>   config = nbd->config;
>>   -if (!mutex_trylock(>lock))
>> +if (!mutex_trylock(>lock)) {
>> +nbd_config_put(nbd);
>>   return BLK_EH_RESET_TIMER;
>> +}
>> if (config->num_connections > 1) {
>>   dev_err_ratelimited(nbd_to_dev(nbd),
>>
>
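
The fix under discussion is an instance of keeping get/put balanced on every
return path. A minimal single-threaded sketch follows, with hypothetical
stand-ins for the reference count and the lock (the real driver uses its own
refcount helpers and a mutex, and its lock does not live in this struct):

```c
#include <stdbool.h>

/* Hypothetical device with a reference count and a toy lock flag. */
struct sketch_dev {
	int config_refs;
	bool locked;
};

enum sketch_ret { SKETCH_RESET_TIMER, SKETCH_DONE };

/* Take a reference only if the config is still alive. */
static bool sketch_get(struct sketch_dev *d)
{
	if (d->config_refs == 0)
		return false;
	d->config_refs++;
	return true;
}

static void sketch_put(struct sketch_dev *d)
{
	d->config_refs--;
}

static bool sketch_trylock(struct sketch_dev *d)
{
	if (d->locked)
		return false;
	d->locked = true;
	return true;
}

enum sketch_ret sketch_timeout(struct sketch_dev *d)
{
	if (!sketch_get(d))
		return SKETCH_DONE;

	if (!sketch_trylock(d)) {
		/* The early return must drop the reference as well. */
		sketch_put(d);
		return SKETCH_RESET_TIMER;
	}

	/* ... handle the timeout while holding the lock ... */
	d->locked = false;
	sketch_put(d);
	return SKETCH_DONE;
}
```

Without the put in the trylock-failure branch, each contended timeout would
leave the count one higher, and the final put that triggers cleanup would never
bring it to zero.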

Re: [PATCH] cxgb4: fix a memory leak bug

2019-08-14 Thread David Miller

From: Wenwen Wang 
Date: Tue, 13 Aug 2019 04:18:52 -0500

> In blocked_fl_write(), 't' is not deallocated if bitmap_parse_user() fails,
> leading to a memory leak bug. To fix this issue, free t before returning
> the error.
> 
> Signed-off-by: Wenwen Wang 

Applied.

Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

2019-08-14 Thread John Hubbard


On 8/14/19 5:02 PM, John Hubbard wrote:

On 8/14/19 4:50 PM, Ira Weiny wrote:

On Tue, Aug 13, 2019 at 05:56:31PM -0700, John Hubbard wrote:

On 8/13/19 5:51 PM, John Hubbard wrote:

On 8/13/19 2:08 PM, Ira Weiny wrote:

On Mon, Aug 12, 2019 at 05:07:32PM -0700, John Hubbard wrote:

On 8/12/19 4:49 PM, Ira Weiny wrote:

On Sun, Aug 11, 2019 at 06:50:44PM -0700, john.hubb...@gmail.com wrote:

From: John Hubbard 

...

Finally, I struggle with converting everyone to a new call.  It is more
overhead to use vaddr_pin in the call above because now the GUP code is going
to associate a file pin object with that file when in ODP we don't need that
because the pages can move around.


What if the pages in ODP are file-backed?



oops, strike that, you're right: in that case, even the file system case is 
covered.
Don't mind me. :)


Ok so are we agreed we will drop the patch to the ODP code?  I'm going to keep
the FOLL_PIN flag and addition in the vaddr_pin_pages.



Yes. I hope I'm not overlooking anything, but it all seems to make sense to
let ODP just rely on the MMU notifiers.



Hold on, I *was* forgetting something: this was a two part thing, and you're
conflating the two points, but they need to remain separate and distinct. There
were:

1. FOLL_PIN is necessary because the caller is clearly in the use case that
requires it--however briefly they might be there. As Jan described it,

"Anything that gets page reference and then touches page data (e.g. direct IO)
needs the new kind of tracking so that filesystem knows someone is messing with
the page data." [1]

2. Releasing the pin: for ODP, we can use MMU notifiers instead of requiring a
lease.

This second point does not invalidate the first point. Therefore, I still see 
the
need for the call within ODP, to something that sets FOLL_PIN. And that means
either vaddr_pin_[user?]_pages_remote, or some other wrapper of your choice. :)

I guess this shows that the API might need to be refined. We're trying to solve
two closely related issues, but they're not identical.

thanks,
--
John Hubbard
NVIDIA

Re: [PATCH] virtio-net: lower min ring num_free for efficiency

2019-08-14 Thread Jason Wang




On 2019/8/14 10:06 AM, ? jiang wrote:

This change lowers ring buffer reclaim threshold from 1/2*queue to budget
for better performance. According to our test with qemu + dpdk, packet
dropping happens when the guest is not able to provide free buffer in
avail ring timely with default 1/2*queue. The value in the patch has been
tested and does show better performance.



Please add your test setup and results here.

Thanks




Signed-off-by: jiangkidd 
---
  drivers/net/virtio_net.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0d4115c9e20b..bc08be7925eb 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1331,7 +1331,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
 		}
 	}
 
-	if (rq->vq->num_free > virtqueue_get_vring_size(rq->vq) / 2) {
+	if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) {
 		if (!try_fill_recv(vi, rq, GFP_ATOMIC))
 			schedule_delayed_work(&vi->refill, 0);
 	}
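
To make the threshold change concrete, here is a small userspace sketch of the
two conditions in the hunk. It is not kernel code: should_refill_old(),
should_refill_new() and min_uint() are hypothetical names, and only the
arithmetic is taken from the patch. Note that in the patched expression the
division by 2 applies to the result of min(), so with an example budget of 64
and a ring of 256 entries the refill would start once more than 32 buffers are
free, rather than more than 128.

```c
#include <assert.h>
#include <stdbool.h>

/* Smaller of two unsigned values (userspace stand-in for the kernel's min()). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Old condition: refill once more than half of the ring is free. */
bool should_refill_old(unsigned int num_free, unsigned int ring_size)
{
	return num_free > ring_size / 2;
}

/* Patched condition: refill once more than min(budget, ring_size) / 2 is free. */
bool should_refill_new(unsigned int num_free, int budget, unsigned int ring_size)
{
	assert(budget >= 0); /* the patch casts budget to unsigned int */
	return num_free > min_uint((unsigned int)budget, ring_size) / 2;
}
```

With 100 free buffers in a 256-entry ring, the old condition would still wait
while the patched one would already try to refill.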

[PATCH] arm: dts: rockchip: fix vcc_host_5v regulator for usb3 host

2019-08-14 Thread Kever Yang

According to rock64 schematic V2 and V3, the VCC_HOST_5V output is
controlled by USB_20_HOST_DRV, which is the same as VCC_HOST1_5V.

Signed-off-by: Kever Yang 
---

 arch/arm64/boot/dts/rockchip/rk3328-rock64.dts | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
index 7cfd5ca6cc85..bd4ad1635e0b 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
@@ -35,9 +35,9 @@
vcc_host_5v: vcc-host-5v-regulator {
compatible = "regulator-fixed";
enable-active-high;
-   gpio = <&gpio0 RK_PA0 GPIO_ACTIVE_HIGH>;
+   gpio = <&gpio0 RK_PA2 GPIO_ACTIVE_LOW>;
pinctrl-names = "default";
-   pinctrl-0 = <&usb30_host_drv>;
+   pinctrl-0 = <&usb20_host_drv>;
regulator-name = "vcc_host_5v";
regulator-always-on;
regulator-boot-on;
@@ -320,12 +320,6 @@
rockchip,pins = <0 RK_PA2 RK_FUNC_GPIO &pcfg_pull_none>;
};
};
-
-   usb3 {
-   usb30_host_drv: usb30-host-drv {
-   rockchip,pins = <0 RK_PA0 RK_FUNC_GPIO &pcfg_pull_none>;
-   };
-   };
 };
 
  {
-- 
2.17.1

Re: [PATCH v2 2/2] riscv: Make __fstate_clean() work correctly.

2019-08-14 Thread Vincent Chen

On Thu, Aug 15, 2019 at 6:17 AM Andreas Schwab  wrote:
>
> On Aug 14 2019, Palmer Dabbelt  wrote:
>
> > On Wed, 14 Aug 2019 13:32:50 PDT (-0700), Paul Walmsley wrote:
> >> On Wed, 14 Aug 2019, Vincent Chen wrote:
> >>
> >>> Make the __fstate_clean() function correctly set the
> >>> state of sstatus.FS in pt_regs to SR_FS_CLEAN.
> >>>
> >>> Fixes: 7db91e5 ("RISC-V: Task implementation")
> >>> Cc: linux-stable 
> >>> Signed-off-by: Vincent Chen 
> >>> Reviewed-by: Anup Patel 
> >>> Reviewed-by: Christoph Hellwig 
> >>
> >> Thanks, I extended the "Fixes" commit ID to 12 digits, as is the usual
> >> practice here, and have queued the following for v5.3-rc.
> >
Thanks, Paul, for correcting my mistake.

> > For reference, something like "git config core.abbrev=12" (or whatever you
> > write to get this in your .gitconfig)
> >
> >https://github.com/palmer-dabbelt/home/blob/master/.gitconfig.in#L23
> >
> > causes git to do the right thing.
>
> Actually, the right setting is core.abbrev=auto (or leaving it unset).
> It lets git choose the appropriate length depending on the repository
> contents.  For the linux repository it will choose 13 right now.
>
> Andreas.
>
Thanks to Palmer and Andreas for sharing this useful information.

> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."

Re: [PATCH] Makefile: Convert -Wimplicit-fallthrough=3 to just -Wimplicit-fallthrough for clang

2019-08-14 Thread Joe Perches

On Tue, 2019-08-13 at 14:44 +0200, Miguel Ojeda wrote:
> Hm... I would go for either __fallthrough as the rest of attributes,
> or simply fallthrough -- FALLTHROUGH seems wrong. If you want it that
> way for visibility, then I would choose __fallthrough, since the
> underscores are quite prominent and anyway IDEs typically highlight
> macros in a different color than keywords (return etc.).

Just fyi:

I added this line to my .emacs and "fallthrough" is now
syntax highlighted like every other keyword.

  (font-lock-add-keywords 'c-mode
'(("\\<\\(fallthrough\\)\\>" . font-lock-keyword-face)))

So now my linux-c-mode block is:

(defun linux-c-mode ()
  "C mode with adjusted defaults for use with the Linux kernel."
  (interactive)
  (font-lock-add-keywords 'c-mode
'(("\\<\\(fallthrough\\)\\>" . font-lock-keyword-face)))
  (c-mode)
  (c-set-style "K&R")
  (setq c-basic-offset 8)
  (setq c-indent-level 8)
  (setq c-brace-imaginary-offset 0)
  (setq c-brace-offset -8)
  (setq c-argdecl-indent 8)
  (setq c-label-offset -8)
  (setq c-continued-statement-offset 8)
  (setq indent-tabs-mode t)
  (setq tab-width 8)
  (setq show-trailing-whitespace t)
  )

I don't know how to do that for vim or any other IDE,
but I trust someone will know and show how it's done.
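
For readers following the naming discussion, here is a minimal sketch of what
a lower-case "fallthrough" pseudo-keyword could look like in plain C. The
macro definition is an assumption made for illustration, not the definition
proposed in this thread; it relies on the GCC/Clang __has_attribute check and
the fallthrough statement attribute, and degrades to an empty statement on
other compilers. stages_run_from() is a hypothetical helper that only exists
to exercise the macro.

```c
#include <assert.h>

/* Hypothetical definition, for illustration only. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Counts how many stages run when execution starts at 'stage' (0..2). */
int stages_run_from(int stage)
{
	int n = 0;

	assert(stage >= 0);
	switch (stage) {
	case 0:
		n++;
		fallthrough;
	case 1:
		n++;
		fallthrough;
	case 2:
		n++;
		break;
	default:
		break;
	}
	return n;
}
```

Written this way the marker reads like "break" or "return", which is the
property the editor highlighting above is meant to reinforce.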

Re: [PATCH] x86/fixmap: update stale comments

2019-08-14 Thread Cao jin

Hi,
  I would like to know whether this patch makes sense.

On 8/9/19 7:46 PM, Cao jin wrote:
> Signed-off-by: Cao jin 
> ---
>  arch/x86/include/asm/fixmap.h | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
> index 9da8cccdf3fb..0c47aa82e2e2 100644
> --- a/arch/x86/include/asm/fixmap.h
> +++ b/arch/x86/include/asm/fixmap.h
> @@ -42,8 +42,7 @@
>   * Because of this, FIXADDR_TOP x86 integration was left as later work.
>   */
>  #ifdef CONFIG_X86_32
> -/* used by vmalloc.c, vsyscall.lds.S.
> - *
> +/*

I don't see the variable __FIXADDR_TOP or the macro FIXADDR_TOP (under
CONFIG_X86_32) referenced in vmalloc.c, and vsyscall.lds.S no longer exists.

>   * Leave one empty page between vmalloc'ed areas and
>   * the start of the fixmap.
>   */
> @@ -120,7 +119,7 @@ enum fixed_addresses {
>* before ioremap() is functional.
>*
>* If necessary we round it up to the next 512 pages boundary so
> -  * that we can have a single pgd entry and a single pte table:
> +  * that we can have a single pmd entry and a single pte table:

This comment seems to have been missed when the code changed in an old commit, 551889a6e2a24.
>*/
>  #define NR_FIX_BTMAPS 64
>  #define FIX_BTMAPS_SLOTS 8
> 

-- 
Sincerely,
Cao jin
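
To spell out the arithmetic behind the "512 pages boundary" comment: the sketch
below assumes 4 KiB pages and 512 entries per pte table (8-byte entries, as on
x86-64 or 32-bit PAE), so one pte table, referenced by one pmd entry, covers
exactly 512 pages. round_up_pow2() and same_pte_table() are hypothetical helper
names, and the page indices are abstract counters rather than real fixmap
indices.

```c
#include <assert.h>

#define PTRS_PER_PTE_ASSUMED 512UL /* assumption: 8-byte PTEs in a 4 KiB table */
#define NR_FIX_BTMAPS 64UL
#define FIX_BTMAPS_SLOTS 8UL
#define TOTAL_FIX_BTMAPS (NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS)

/* Round idx up to the next multiple of 'align' (align must be a power of two). */
unsigned long round_up_pow2(unsigned long idx, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (idx + align - 1) & ~(align - 1);
}

/* True if pages [first, first + count) are all mapped by the same pte table. */
int same_pte_table(unsigned long first, unsigned long count)
{
	assert(count > 0);
	return first / PTRS_PER_PTE_ASSUMED ==
	       (first + count - 1) / PTRS_PER_PTE_ASSUMED;
}
```

Under these assumptions the 64 * 8 = 512 boot-time mapping pages fit in one pte
table only when the range starts on a 512-page boundary, which is why the
rounding is done.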

Re: [PATCH V3 4/4] dt-bindings: iio: adc: ad7192: Add binding documentation for AD7192

2019-08-14 Thread Rob Herring

On Wed, Aug 14, 2019 at 1:32 AM Mircea Caprioru
 wrote:
>
> This patch adds device tree binding documentation for AD7192 ADC in YAML
> format.
>
> Signed-off-by: Mircea Caprioru 
> ---
> Changelog V2:
> - remove description from spi and interrupt properties
> - changed the name of the device from ad7192 to adc in the example
>
> Changelog V3:
> - added semicolon at the end of the dt example
>
>  .../bindings/iio/adc/adi,ad7192.yaml  | 121 ++
>  1 file changed, 121 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/iio/adc/adi,ad7192.yaml

Reviewed-by: Rob Herring

[PATCH] afs: Move comments after /* fallthrough */

2019-08-14 Thread Joe Perches

Make the code a bit easier for a script to appropriately convert
case statement blocks with /* fallthrough */ comments to a macro by
moving comments describing the next case block to the case statement.

Signed-off-by: Joe Perches 
---
 fs/afs/cmservice.c | 10 +++---
 fs/afs/fsclient.c  | 51 +--
 fs/afs/vlclient.c  | 50 +-
 fs/afs/yfsclient.c | 51 +--
 4 files changed, 62 insertions(+), 100 deletions(-)

diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index b86195e4dc6c..2270fe9325da 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -282,10 +282,8 @@ static int afs_deliver_cb_callback(struct afs_call *call)
case 0:
afs_extract_to_tmp(call);
call->unmarshall++;
-
-   /* extract the FID array and its count in two steps */
/* fall through */
-   case 1:
+   case 1: /* extract the FID array and its count in two steps */
_debug("extract FID count");
ret = afs_extract_data(call, true);
if (ret < 0)
@@ -329,9 +327,8 @@ static int afs_deliver_cb_callback(struct afs_call *call)
afs_extract_to_tmp(call);
call->unmarshall++;
 
-   /* extract the callback array and its count in two steps */
/* fall through */
-   case 3:
+   case 3: /* extract the callback array & count in two steps */
_debug("extract CB count");
ret = afs_extract_data(call, true);
if (ret < 0)
@@ -651,9 +648,8 @@ static int afs_deliver_yfs_cb_callback(struct afs_call *call)
afs_extract_to_tmp(call);
call->unmarshall++;
 
-   /* extract the FID array and its count in two steps */
/* Fall through */
-   case 1:
+   case 1: /* extract the FID array and its count in two steps */
_debug("extract FID count");
ret = afs_extract_data(call, true);
if (ret < 0)
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 114f281f3687..d9dc1bdfa695 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -341,8 +341,7 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
}
/* Fall through */
 
-   /* extract the returned data length */
-   case 1:
+   case 1: /* extract the returned data length */
_debug("extract data length");
ret = afs_extract_data(call, true);
if (ret < 0)
@@ -369,8 +368,7 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
ASSERTCMP(size, <=, PAGE_SIZE);
/* Fall through */
 
-   /* extract the returned data */
-   case 2:
+   case 2: /* extract the returned data */
_debug("extract data %zu/%llu",
   iov_iter_count(>iter), req->remain);
 
@@ -411,8 +409,7 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
afs_extract_to_buf(call, (21 + 3 + 6) * 4);
/* Fall through */
 
-   /* extract the metadata */
-   case 4:
+   case 4: /* extract the metadata */
ret = afs_extract_data(call, false);
if (ret < 0)
return ret;
@@ -1476,8 +1473,7 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
afs_extract_to_buf(call, 12 * 4);
/* Fall through */
 
-   /* extract the returned status record */
-   case 1:
+   case 1: /* extract the returned status record */
_debug("extract status");
ret = afs_extract_data(call, true);
if (ret < 0)
@@ -1489,8 +1485,7 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
afs_extract_to_tmp(call);
/* Fall through */
 
-   /* extract the volume name length */
-   case 2:
+   case 2: /* extract the volume name length */
ret = afs_extract_data(call, true);
if (ret < 0)
return ret;
@@ -1505,8 +1500,7 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
call->unmarshall++;
/* Fall through */
 
-   /* extract the volume name */
-   case 3:
+   case 3: /* extract the volume name */
_debug("extract volname");
ret = afs_extract_data(call, true);
if (ret < 0)
@@ -1519,8 +1513,7 @@ static int afs_deliver_fs_get_volume_status(struct afs_call *call)
call->unmarshall++;
/* Fall through */
 
-   /* extract the offline message length */
-

[GIT PULL] Devicetree fixes for 5.3-rc, take 3

2019-08-14 Thread Rob Herring

Linus,

Please pull DT fixes for 5.3.

Rob

The following changes since commit 609488bc979f99f805f34e9a32c1e3b71179d10b:

  Linux 5.3-rc2 (2019-07-28 12:47:02 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git
tags/devicetree-fixes-for-5.3-3

for you to fetch changes up to 83f82d7a42583e93d0f0dde3d61ed10f75c0f4d8:

  of: irq: fix a trivial typo in a doc comment (2019-08-14 20:12:16 -0600)


Devicetree fixes for 5.3:

- Fix building DT binding examples for in tree builds

- Correct some refcounting in adjust_local_phandle_references()

- Update FSL FEC binding with deprecated properties

- Schema fix in stm32 pinctrl

- Fix typo in of_irq_parse_one docbook comment


Lubomir Rintel (1):
  of: irq: fix a trivial typo in a doc comment

Nishka Dasgupta (1):
  of: resolver: Add of_node_put() before return and break

Rob Herring (2):
  dt-bindings: Fix generated example files getting added to schemas
  dt-bindings: pinctrl: stm32: Fix 'st,syscfg' schema

Sven Van Asbroeck (1):
  dt-bindings: fec: explicitly mark deprecated properties

 Documentation/devicetree/bindings/Makefile |  4 ++-
 Documentation/devicetree/bindings/net/fsl-fec.txt  | 30 --
 .../bindings/pinctrl/st,stm32-pinctrl.yaml |  3 ++-
 drivers/of/irq.c   |  2 +-
 drivers/of/resolver.c  | 12 ++---
 5 files changed, 32 insertions(+), 19 deletions(-)

Re: [PATCH] scsi: fnic: remove redundant assignment of variable rc

2019-08-14 Thread Martin K. Petersen



Colin,

> Variable rc is initialized to a value that is never read; it is
> re-assigned later and immediately returned. Clean up the code by
> removing rc and just returning 0.

Applied to 5.4/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH] of: irq: fix a trivial typo in a doc comment

2019-08-14 Thread Rob Herring

On Wed,  7 Aug 2019 15:22:31 +0200, Lubomir Rintel wrote:
> Diverged from what the code does with commit 530210c7814e ("of/irq: Replace
> of_irq with of_phandle_args").
> 
> Signed-off-by: Lubomir Rintel 
> ---
>  drivers/of/irq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Applied, thanks.

Rob

Re: [PATCH] scsi: ufs: Fix NULL pointer dereference in ufshcd_config_vreg_hpm()

2019-08-14 Thread Martin K. Petersen



Adrian,

> Fix the following BUG:
>
>   [ 187.065689] BUG: kernel NULL pointer dereference, address: 001c
>   [ 187.065790] RIP: 0010:ufshcd_vreg_set_hpm+0x3c/0x110 [ufshcd_core]
>   [ 187.065938] Call Trace:
>   [ 187.065959] ufshcd_resume+0x72/0x290 [ufshcd_core]
>   [ 187.065980] ufshcd_system_resume+0x54/0x140 [ufshcd_core]
>   [ 187.065993] ? pci_pm_restore+0xb0/0xb0
>   [ 187.066005] ufshcd_pci_resume+0x15/0x20 [ufshcd_pci]
>   [ 187.066017] pci_pm_thaw+0x4c/0x90
>   [ 187.066030] dpm_run_callback+0x5b/0x150
>   [ 187.066043] device_resume+0x11b/0x220
>
> Voltage regulators are optional, so functions must check they exist
> before dereferencing.
>
> Note this issue is hidden if CONFIG_REGULATORS is not set, because the
> offending code is optimised away.

Applied to 5.3/scsi-fixes, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering
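
For readers unfamiliar with the failure mode in the quoted commit message, the
pattern "optional resource, so check before dereferencing" can be sketched as
follows. The structures and the function demo_config_vreg_hpm() are
hypothetical stand-ins and do not match the real ufshcd data structures; the
sketch only shows the guard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real driver structures differ. */
struct demo_vreg {
	int max_uA;
};

struct demo_vreg_info {
	struct demo_vreg *vcc; /* may be NULL: the regulator is optional */
};

/* Returns 0 when there is nothing to configure, otherwise the requested load. */
int demo_config_vreg_hpm(const struct demo_vreg_info *info)
{
	assert(info != NULL);
	if (!info->vcc) /* optional regulator absent: skip instead of dereferencing */
		return 0;
	return info->vcc->max_uA;
}
```

Without the early return, the absent-regulator case would read through a NULL
pointer at a small offset, which is consistent with the low fault address in
the trace above.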

Re: [PATCH v2] psi: get poll_work to run when calling poll syscall next time

2019-08-14 Thread Jason Xing


Hello,

This patch has been waiting for a couple of days without any response. Any
comments and suggestions on v2 would be appreciated.


Thanks,
Jason

On 2019/7/30 1:16 PM, Jason Xing wrote:

Only when calling the poll syscall the first time can user
receive POLLPRI correctly. After that, user always fails to
acquire the event signal.

Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event triggered.
3. Kill and restart the process.

If the user doesn't kill the monitor process, it seems the
poll_work works fine. After killing and restarting the monitor,
the poll_work in kernel will never run again due to the wrong
value of poll_scheduled. Therefore, we should reset the value
as group_init() does after the last trigger is destroyed.

[PATCH V2]
In v2 of the patch, I moved atomic_set(&group->poll_scheduled, 0)
to the right place.
Here I quote Johannes's explanation, which puts it best:
"The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag."

Signed-off-by: Jason Xing 
Reviewed-by: Caspar Zhang 
Reviewed-by: Joseph Qi 
Reviewed-by: Suren Baghdasaryan 
Acked-by: Johannes Weiner 
---
  kernel/sched/psi.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7acc632..acdada0 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1131,7 +1131,14 @@ static void psi_trigger_destroy(struct kref *ref)
 * deadlock while waiting for psi_poll_work to acquire trigger_lock
 */
if (kworker_to_destroy) {
+   /*
+* After the RCU grace period has expired, the worker
+* can no longer be found through group->poll_kworker.
+* But it might have been already scheduled before
+* that - deschedule it cleanly before destroying it.
+*/
kthread_cancel_delayed_work_sync(&group->poll_work);
+   atomic_set(&group->poll_scheduled, 0);
kthread_destroy_worker(kworker_to_destroy);
}
kfree(t);
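
The pairing Johannes describes can be modelled in a few lines of userspace C.
This is only a single-threaded sketch under stated assumptions: struct
demo_group and the demo_*() functions are hypothetical, C11 atomics stand in
for the kernel's atomic_t helpers, and a boolean stands in for the queued
delayed work.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; names do not match kernel/sched/psi.c. */
struct demo_group {
	atomic_int poll_scheduled; /* 1 while work is queued and has not run yet */
	bool work_pending;         /* stands in for the queued delayed work */
};

/* Scheduling side: queue the work only if nobody has queued it already. */
bool demo_schedule_poll_work(struct demo_group *g)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1))
		return false; /* flag already set: assume the work is queued */
	g->work_pending = true;
	return true;
}

/* Worker side: running the work clears the flag so it can be queued again. */
void demo_poll_work(struct demo_group *g)
{
	assert(g->work_pending);
	g->work_pending = false;
	atomic_store(&g->poll_scheduled, 0);
}

/* Teardown: cancel the queued work; optionally reset the flag as the patch does. */
void demo_destroy_trigger(struct demo_group *g, bool reset_flag)
{
	g->work_pending = false; /* models the cancel of the delayed work */
	if (reset_flag)
		atomic_store(&g->poll_scheduled, 0);
}
```

In this model, cancelling without the reset leaves the flag at 1 with no work
queued, so every later demo_schedule_poll_work() call returns false; resetting
the flag next to the cancel lets the next trigger schedule the work again.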

Re: [PATCH v2 2/2] riscv: Make __fstate_clean() work correctly.

2019-08-14 Thread Palmer Dabbelt


On Wed, 14 Aug 2019 15:17:18 PDT (-0700), sch...@linux-m68k.org wrote:

On Aug 14 2019, Palmer Dabbelt  wrote:


On Wed, 14 Aug 2019 13:32:50 PDT (-0700), Paul Walmsley wrote:

On Wed, 14 Aug 2019, Vincent Chen wrote:


Make the __fstate_clean() function correctly set the
state of sstatus.FS in pt_regs to SR_FS_CLEAN.

Fixes: 7db91e5 ("RISC-V: Task implementation")
Cc: linux-stable 
Signed-off-by: Vincent Chen 
Reviewed-by: Anup Patel 
Reviewed-by: Christoph Hellwig 


Thanks, I extended the "Fixes" commit ID to 12 digits, as is the usual
practice here, and have queued the following for v5.3-rc.


For reference, something like "git config core.abbrev=12" (or whatever you
write to get this in your .gitconfig)

   https://github.com/palmer-dabbelt/home/blob/master/.gitconfig.in#L23

causes git to do the right thing.


Actually, the right setting is core.abbrev=auto (or leaving it unset).
It lets git choose the appropriate length depending on the repository
contents.  For the linux repository it will choose 13 right now.


Awesome, thanks!  I've updated my config :)

Re: [PATCH 5.2 000/144] 5.2.9-stable review

2019-08-14 Thread Naresh Kamboju

On Wed, 14 Aug 2019 at 22:33, Greg Kroah-Hartman
 wrote:
>
> This is the start of the stable review cycle for the 5.2.9 release.
> There are 144 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri 16 Aug 2019 04:55:34 PM UTC.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.2.9-rc1.gz
> or in the git tree and branch at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-5.2.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary


kernel: 5.2.9-rc1
git repo: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-5.2.y
git commit: 2440e485aeda5f36eaf2050eb1bb61be46275b39
git describe: v5.2.8-145-g2440e485aeda
Test details: 
https://qa-reports.linaro.org/lkft/linux-stable-rc-5.2-oe/build/v5.2.8-145-g2440e485aeda


No regressions (compared to build v5.2.8)


No fixes (compared to build v5.2.8)

Ran 22959 total tests in the following environments and test suites.

Environments
--
- dragonboard-410c
- hi6220-hikey
- i386
- juno-r2
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15
- x86

Test Suites
---
* build
* install-android-platform-tools-r2600
* kselftest
* libgpiod
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-securebits-tests
* ltp-timers-tests
* spectre-meltdown-checker-test
* ltp-sched-tests
* ltp-syscalls-tests
* perf
* v4l2-compliance
* ltp-fs-tests
* ltp-open-posix-tests
* network-basic-tests
* kvm-unit-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

-- 
Linaro LKFT
https://lkft.linaro.org

[PATCH 3/3] libnvdimm/security: Consolidate 'security' operations

2019-08-14 Thread Dan Williams

The security operations are exported from libnvdimm/security.c to
libnvdimm/dimm_devs.c, and libnvdimm/security.c is optionally compiled
based on the CONFIG_NVDIMM_KEYS config symbol.

Rather than export the operations across compile objects, just move the
__security_store() entry point to live with the helpers.

Cc: Dave Jiang 
Signed-off-by: Dan Williams 
---
 drivers/nvdimm/dimm_devs.c |   84 -
 drivers/nvdimm/nd-core.h   |   30 +--
 drivers/nvdimm/security.c  |   90 ++--
 3 files changed, 90 insertions(+), 114 deletions(-)

diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index d837cb9be83d..196aa44c4936 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -393,88 +393,6 @@ static ssize_t frozen_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(frozen);
 
-#define OPS\
-   C( OP_FREEZE,   "freeze",   1), \
-   C( OP_DISABLE,  "disable",  2), \
-   C( OP_UPDATE,   "update",   3), \
-   C( OP_ERASE,"erase",2), \
-   C( OP_OVERWRITE,"overwrite",2), \
-   C( OP_MASTER_UPDATE,"master_update",3), \
-   C( OP_MASTER_ERASE, "master_erase", 2)
-#undef C
-#define C(a, b, c) a
-enum nvdimmsec_op_ids { OPS };
-#undef C
-#define C(a, b, c) { b, c }
-static struct {
-   const char *name;
-   int args;
-} ops[] = { OPS };
-#undef C
-
-#define SEC_CMD_SIZE 32
-#define KEY_ID_SIZE 10
-
-static ssize_t __security_store(struct device *dev, const char *buf, size_t len)
-{
-   struct nvdimm *nvdimm = to_nvdimm(dev);
-   ssize_t rc;
-   char cmd[SEC_CMD_SIZE+1], keystr[KEY_ID_SIZE+1],
-   nkeystr[KEY_ID_SIZE+1];
-   unsigned int key, newkey;
-   int i;
-
-   rc = sscanf(buf, "%"__stringify(SEC_CMD_SIZE)"s"
-   " %"__stringify(KEY_ID_SIZE)"s"
-   " %"__stringify(KEY_ID_SIZE)"s",
-   cmd, keystr, nkeystr);
-   if (rc < 1)
-   return -EINVAL;
-   for (i = 0; i < ARRAY_SIZE(ops); i++)
-   if (sysfs_streq(cmd, ops[i].name))
-   break;
-   if (i >= ARRAY_SIZE(ops))
-   return -EINVAL;
-   if (ops[i].args > 1)
-   rc = kstrtouint(keystr, 0, &key);
-   if (rc >= 0 && ops[i].args > 2)
-   rc = kstrtouint(nkeystr, 0, &newkey);
-   if (rc < 0)
-   return rc;
-
-   if (i == OP_FREEZE) {
-   dev_dbg(dev, "freeze\n");
-   rc = nvdimm_security_freeze(nvdimm);
-   } else if (i == OP_DISABLE) {
-   dev_dbg(dev, "disable %u\n", key);
-   rc = nvdimm_security_disable(nvdimm, key);
-   } else if (i == OP_UPDATE || i == OP_MASTER_UPDATE) {
-   dev_dbg(dev, "%s %u %u\n", ops[i].name, key, newkey);
-   rc = nvdimm_security_update(nvdimm, key, newkey, i == OP_UPDATE
-   ? NVDIMM_USER : NVDIMM_MASTER);
-   } else if (i == OP_ERASE || i == OP_MASTER_ERASE) {
-   dev_dbg(dev, "%s %u\n", ops[i].name, key);
-   if (atomic_read(&nvdimm->busy)) {
-   dev_dbg(dev, "Unable to secure erase while DIMM active.\n");
-   return -EBUSY;
-   }
-   rc = nvdimm_security_erase(nvdimm, key, i == OP_ERASE
-   ? NVDIMM_USER : NVDIMM_MASTER);
-   } else if (i == OP_OVERWRITE) {
-   dev_dbg(dev, "overwrite %u\n", key);
-   if (atomic_read(&nvdimm->busy)) {
-   dev_dbg(dev, "Unable to overwrite while DIMM active.\n");
-   return -EBUSY;
-   }
-   rc = nvdimm_security_overwrite(nvdimm, key);
-   } else
-   return -EINVAL;
-
-   if (rc == 0)
-   rc = len;
-   return rc;
-}
-
 static ssize_t security_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t len)
 
@@ -489,7 +407,7 @@ static ssize_t security_store(struct device *dev,
nd_device_lock(dev);
nvdimm_bus_lock(dev);
wait_nvdimm_bus_probe_idle(dev);
-   rc = __security_store(dev, buf, len);
+   rc = nvdimm_security_store(dev, buf, len);
nvdimm_bus_unlock(dev);
nd_device_unlock(dev);
 
diff --git a/drivers/nvdimm/nd-core.h b/drivers/nvdimm/nd-core.h
index da2bbfd56d9f..454454ba1738 100644
--- a/drivers/nvdimm/nd-core.h
+++ b/drivers/nvdimm/nd-core.h
@@ -68,35 +68,11 @@ static inline unsigned long nvdimm_security_flags(
 }
 int nvdimm_security_freeze(struct nvdimm *nvdimm);
 #if IS_ENABLED(CONFIG_NVDIMM_KEYS)
-int nvdimm_security_disable(struct nvdimm *nvdimm, unsigned int keyid);
-int nvdimm_security_update(struct nvdimm

[PATCH 1/3] libnvdimm/security: Introduce a 'frozen' attribute

2019-08-14 Thread Dan Williams

In the process of debugging a system with an NVDIMM that was failing to
unlock it was found that the kernel is reporting 'locked' while the DIMM
security interface is 'frozen'. Unfortunately the security state is
tracked internally as an enum which prevents it from communicating the
difference between 'locked' and 'locked + frozen'. It follows that the
enum also prevents the kernel from communicating 'unlocked + frozen'
which would be useful for debugging why security operations like 'change
passphrase' are disabled.

Ditch the security state enum for a set of flags and introduce a new
sysfs attribute explicitly for the 'frozen' state. The regression risk
is low because the 'frozen' state was already blocked behind the
'locked' state, but will need to revisit if there were cases where
applications need 'frozen' to show up in the primary 'security'
attribute. The expectation is that communicating 'frozen' is mostly a
helper for debug and status monitoring.

Cc: Dave Jiang 
Reported-by: Jeff Moyer 
Signed-off-by: Dan Williams 
---
 drivers/acpi/nfit/intel.c|   65 ++---
 drivers/nvdimm/bus.c |2 -
 drivers/nvdimm/dimm_devs.c   |   59 +--
 drivers/nvdimm/nd-core.h |   21 ++--
 drivers/nvdimm/security.c|   99 ++
 include/linux/libnvdimm.h|9 ++-
 tools/testing/nvdimm/dimm_devs.c |   19 ++-
 7 files changed, 146 insertions(+), 128 deletions(-)

diff --git a/drivers/acpi/nfit/intel.c b/drivers/acpi/nfit/intel.c
index cddd0fcf622c..2c51ca4155dc 100644
--- a/drivers/acpi/nfit/intel.c
+++ b/drivers/acpi/nfit/intel.c
@@ -7,10 +7,11 @@
 #include "intel.h"
 #include "nfit.h"
 
-static enum nvdimm_security_state intel_security_state(struct nvdimm *nvdimm,
+static unsigned long intel_security_flags(struct nvdimm *nvdimm,
enum nvdimm_passphrase_type ptype)
 {
struct nfit_mem *nfit_mem = nvdimm_provider_data(nvdimm);
+   unsigned long security_flags = 0;
struct {
struct nd_cmd_pkg pkg;
struct nd_intel_get_security_state cmd;
@@ -27,46 +28,54 @@ static enum nvdimm_security_state intel_security_state(struct nvdimm *nvdimm,
int rc;
 
if (!test_bit(NVDIMM_INTEL_GET_SECURITY_STATE, &nfit_mem->dsm_mask))
-   return -ENXIO;
+   return 0;
 
/*
 * Short circuit the state retrieval while we are doing overwrite.
 * The DSM spec states that the security state is indeterminate
 * until the overwrite DSM completes.
 */
-   if (nvdimm_in_overwrite(nvdimm) && ptype == NVDIMM_USER)
-   return NVDIMM_SECURITY_OVERWRITE;
+   if (nvdimm_in_overwrite(nvdimm) && ptype == NVDIMM_USER) {
+   set_bit(NVDIMM_SECURITY_OVERWRITE, &security_flags);
+   return security_flags;
+   }
 
rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, &nd_cmd, sizeof(nd_cmd), NULL);
-   if (rc < 0)
-   return rc;
-   if (nd_cmd.cmd.status)
-   return -EIO;
+   if (rc < 0 || nd_cmd.cmd.status) {
+   pr_err("%s: security state retrieval failed (%d:%#x)\n",
+   nvdimm_name(nvdimm), rc, nd_cmd.cmd.status);
+   return 0;
+   }
 
/* check and see if security is enabled and locked */
if (ptype == NVDIMM_MASTER) {
if (nd_cmd.cmd.extended_state & ND_INTEL_SEC_ESTATE_ENABLED)
-   return NVDIMM_SECURITY_UNLOCKED;
-   else if (nd_cmd.cmd.extended_state &
-   ND_INTEL_SEC_ESTATE_PLIMIT)
-   return NVDIMM_SECURITY_FROZEN;
-   } else {
-   if (nd_cmd.cmd.state & ND_INTEL_SEC_STATE_UNSUPPORTED)
-   return -ENXIO;
-   else if (nd_cmd.cmd.state & ND_INTEL_SEC_STATE_ENABLED) {
-   if (nd_cmd.cmd.state & ND_INTEL_SEC_STATE_LOCKED)
-   return NVDIMM_SECURITY_LOCKED;
-   else if (nd_cmd.cmd.state & ND_INTEL_SEC_STATE_FROZEN
-   || nd_cmd.cmd.state &
-   ND_INTEL_SEC_STATE_PLIMIT)
-   return NVDIMM_SECURITY_FROZEN;
-   else
-   return NVDIMM_SECURITY_UNLOCKED;
-   }
+   set_bit(NVDIMM_SECURITY_UNLOCKED, &security_flags);
+   else
+   set_bit(NVDIMM_SECURITY_DISABLED, &security_flags);
+   if (nd_cmd.cmd.extended_state & ND_INTEL_SEC_ESTATE_PLIMIT)
+   set_bit(NVDIMM_SECURITY_FROZEN, &security_flags);
+   return security_flags;
}
 
-   /* this should cover master security disabled as well */
-   return NVDIMM_SECURITY_DISABLED;
+
+   if (nd_cmd.cmd.state & ND_INTEL_SEC_STATE_UNSUPPORTED)
+   return 0;
+
+   if
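
The motivation for replacing the enum with flags can be shown in a few lines:
an enum value can name only one state at a time, whereas a flags word can carry
"locked" and "frozen" together. The sketch below is a userspace illustration
under assumptions: the DEMO_SEC_* bit numbers are hypothetical and need not
match the NVDIMM_SECURITY_* values, and demo_set_bit()/demo_test_bit() are
simplified, non-atomic stand-ins for the kernel's set_bit()/test_bit().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit numbers; the patch's NVDIMM_SECURITY_* values may differ. */
enum demo_sec_bit {
	DEMO_SEC_DISABLED,
	DEMO_SEC_UNLOCKED,
	DEMO_SEC_LOCKED,
	DEMO_SEC_FROZEN,
	DEMO_SEC_BIT_COUNT
};

/* Set bit 'nr' in *flags (non-atomic stand-in for set_bit()). */
void demo_set_bit(int nr, unsigned long *flags)
{
	assert(nr >= 0 && nr < DEMO_SEC_BIT_COUNT);
	*flags |= 1UL << nr;
}

/* Test bit 'nr' in flags (stand-in for test_bit()). */
bool demo_test_bit(int nr, unsigned long flags)
{
	assert(nr >= 0 && nr < DEMO_SEC_BIT_COUNT);
	return (flags >> nr) & 1UL;
}
```

A DIMM that is both locked and frozen is then simply a word with both bits set,
which is the combination the old enum could not report.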

[PATCH 2/3] libnvdimm/security: Tighten scope of nvdimm->busy vs security operations

2019-08-14 Thread Dan Williams

The blanket blocking of all security operations while the DIMM is in
active use in a region is too restrictive. The only security operations
that need to be aware of the ->busy state are those that mutate the
state of data, i.e. erase and overwrite.

Refactor the ->busy checks to be applied at the common entry point
in __security_store() rather than each of the helper routines.

Cc: Dave Jiang 
Signed-off-by: Dan Williams 
---
 drivers/nvdimm/dimm_devs.c |   33 -
 drivers/nvdimm/security.c  |   10 --
 2 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 53330625fe07..d837cb9be83d 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -424,9 +424,6 @@ static ssize_t __security_store(struct device *dev, const char *buf, size_t len)
unsigned int key, newkey;
int i;
 
-   if (atomic_read(&nvdimm->busy))
-   return -EBUSY;
-
rc = sscanf(buf, "%"__stringify(SEC_CMD_SIZE)"s"
" %"__stringify(KEY_ID_SIZE)"s"
" %"__stringify(KEY_ID_SIZE)"s",
@@ -451,23 +448,25 @@ static ssize_t __security_store(struct device *dev, const char *buf, size_t len)
} else if (i == OP_DISABLE) {
dev_dbg(dev, "disable %u\n", key);
rc = nvdimm_security_disable(nvdimm, key);
-   } else if (i == OP_UPDATE) {
-   dev_dbg(dev, "update %u %u\n", key, newkey);
-   rc = nvdimm_security_update(nvdimm, key, newkey, NVDIMM_USER);
-   } else if (i == OP_ERASE) {
-   dev_dbg(dev, "erase %u\n", key);
-   rc = nvdimm_security_erase(nvdimm, key, NVDIMM_USER);
+   } else if (i == OP_UPDATE || i == OP_MASTER_UPDATE) {
+   dev_dbg(dev, "%s %u %u\n", ops[i].name, key, newkey);
+   rc = nvdimm_security_update(nvdimm, key, newkey, i == OP_UPDATE
+   ? NVDIMM_USER : NVDIMM_MASTER);
+   } else if (i == OP_ERASE || i == OP_MASTER_ERASE) {
+   dev_dbg(dev, "%s %u\n", ops[i].name, key);
+   if (atomic_read(&nvdimm->busy)) {
+   dev_dbg(dev, "Unable to secure erase while DIMM active.\n");
+   return -EBUSY;
+   }
+   rc = nvdimm_security_erase(nvdimm, key, i == OP_ERASE
+   ? NVDIMM_USER : NVDIMM_MASTER);
} else if (i == OP_OVERWRITE) {
dev_dbg(dev, "overwrite %u\n", key);
+   if (atomic_read(&nvdimm->busy)) {
+   dev_dbg(dev, "Unable to overwrite while DIMM active.\n");
+   return -EBUSY;
+   }
rc = nvdimm_security_overwrite(nvdimm, key);
-   } else if (i == OP_MASTER_UPDATE) {
-   dev_dbg(dev, "master_update %u %u\n", key, newkey);
-   rc = nvdimm_security_update(nvdimm, key, newkey,
-   NVDIMM_MASTER);
-   } else if (i == OP_MASTER_ERASE) {
-   dev_dbg(dev, "master_erase %u\n", key);
-   rc = nvdimm_security_erase(nvdimm, key,
-   NVDIMM_MASTER);
} else
return -EINVAL;
 
diff --git a/drivers/nvdimm/security.c b/drivers/nvdimm/security.c
index 5862d0eee9db..2166e627383a 100644
--- a/drivers/nvdimm/security.c
+++ b/drivers/nvdimm/security.c
@@ -334,11 +334,6 @@ int nvdimm_security_erase(struct nvdimm *nvdimm, unsigned int keyid,
|| !nvdimm->sec.flags)
return -EOPNOTSUPP;
 
-   if (atomic_read(&nvdimm->busy)) {
-   dev_dbg(dev, "Unable to secure erase while DIMM active.\n");
-   return -EBUSY;
-   }
-
rc = check_security_state(nvdimm);
if (rc)
return rc;
@@ -380,11 +375,6 @@ int nvdimm_security_overwrite(struct nvdimm *nvdimm, unsigned int keyid)
|| !nvdimm->sec.flags)
return -EOPNOTSUPP;
 
-   if (atomic_read(&nvdimm->busy)) {
-   dev_dbg(dev, "Unable to overwrite while DIMM active.\n");
-   return -EBUSY;
-   }
-
if (dev->driver == NULL) {
dev_dbg(dev, "Unable to overwrite while DIMM active.\n");
return -EINVAL;
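
The narrowed policy from this patch, where only data-destroying operations care
about ->busy, can be summarised in a small userspace sketch. The enum and the
functions demo_op_mutates_data() and demo_check_busy() are hypothetical; the
real code keeps the checks inline in the store routine and reads an atomic
counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum demo_op { DEMO_FREEZE, DEMO_DISABLE, DEMO_UPDATE, DEMO_ERASE, DEMO_OVERWRITE };

/* Only operations that destroy or rewrite data need the DIMM to be idle. */
static bool demo_op_mutates_data(enum demo_op op)
{
	return op == DEMO_ERASE || op == DEMO_OVERWRITE;
}

/* Returns 0 if the op may proceed, -EBUSY if it must wait for an idle DIMM. */
int demo_check_busy(enum demo_op op, int busy_count)
{
	assert(busy_count >= 0);
	if (demo_op_mutates_data(op) && busy_count > 0)
		return -EBUSY;
	return 0;
}
```

Under this policy a passphrase update or freeze is allowed while the DIMM is in
active use in a region, and only erase and overwrite are refused.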

[PATCH 0/3] libnvdimm/security: Enumerate the frozen state and other cleanups

2019-08-14 Thread Dan Williams

Jeff reported a scenario where ndctl was failing to unlock DIMMs [1].
Through the course of debug it was discovered that the security
interface on the DIMMs was in the 'frozen' state disallowing unlock, or
any security operation.  Unfortunately the kernel only showed that the
DIMMs were 'locked', not 'locked' and 'frozen'.

Introduce a new sysfs 'frozen' attribute so that ndctl can reflect the
"security-operations-allowed" state independently of the lock status.
Then, follow up with cleanups related to replacing a security-state-enum
with a set of flags.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2019-August/022856.html
---

Dan Williams (3):
  libnvdimm/security: Introduce a 'frozen' attribute
  libnvdimm/security: Tighten scope of nvdimm->busy vs security operations
  libnvdimm/security: Consolidate 'security' operations


 drivers/acpi/nfit/intel.c|   65 +++-
 drivers/nvdimm/bus.c |2 
 drivers/nvdimm/dimm_devs.c   |  134 ++
 drivers/nvdimm/nd-core.h |   51 --
 drivers/nvdimm/security.c|  199 +-
 include/linux/libnvdimm.h|9 +-
 tools/testing/nvdimm/dimm_devs.c |   19 +---
 7 files changed, 231 insertions(+), 248 deletions(-)

Re: [PATCH 4.19 00/91] 4.19.67-stable review

2019-08-14 Thread Naresh Kamboju

On Wed, 14 Aug 2019 at 22:38, Greg Kroah-Hartman
 wrote:
>
> This is the start of the stable review cycle for the 4.19.67 release.
> There are 91 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri 16 Aug 2019 04:55:34 PM UTC.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.67-rc1.gz
> or in the git tree and branch at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary


kernel: 4.19.67-rc1
git repo: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: f777613d3df0e7226d30d0e0ba97e9419e3064f2
git describe: v4.19.66-92-gf777613d3df0
Test details: 
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.66-92-gf777613d3df0


No regressions (compared to build v4.19.66)


No fixes (compared to build v4.19.66)

Ran 25307 total tests in the following environments and test suites.

Environments
--
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
---
* build
* install-android-platform-tools-r2600
* kselftest
* libgpiod
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests
* ssuite
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

-- 
Linaro LKFT
https://lkft.linaro.org

Re: WARNING in cgroup_rstat_updated

2019-08-14 Thread syzbot


syzbot has bisected this bug to:

commit e9db4ef6bf4ca9894bb324c76e01b8f1a16b2650
Author: John Fastabend 
Date:   Sat Jun 30 13:17:47 2018 +

bpf: sockhash fix omitted bucket lock in sock_close

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=143286e260
start commit:   31cc088a Merge tag 'drm-next-2019-07-19' of git://anongit...
git tree:   net-next
final crash:https://syzkaller.appspot.com/x/report.txt?x=163286e260
console output: https://syzkaller.appspot.com/x/log.txt?x=123286e260
kernel config:  https://syzkaller.appspot.com/x/.config?x=4dba67bf8b8c9ad7
dashboard link: https://syzkaller.appspot.com/bug?extid=370e4739fa489334a4ef
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=16dd57dc60

Reported-by: syzbot+370e4739fa489334a...@syzkaller.appspotmail.com
Fixes: e9db4ef6bf4c ("bpf: sockhash fix omitted bucket lock in sock_close")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Re: [PATCH 4.14 00/69] 4.14.139-stable review

2019-08-14 Thread Naresh Kamboju

On Wed, 14 Aug 2019 at 22:42, Greg Kroah-Hartman
 wrote:
>
> This is the start of the stable review cycle for the 4.14.139 release.
> There are 69 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri 16 Aug 2019 04:55:34 PM UTC.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.139-rc1.gz
> or in the git tree and branch at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.14.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary


kernel: 4.14.139-rc1
git repo: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.14.y
git commit: 736c2f07319a323c55007bcf8fca70481e9c7175
git describe: v4.14.138-70-g736c2f07319a
Test details: 
https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.138-70-g736c2f07319a

No regressions (compared to build v4.14.138)

No fixes (compared to build v4.14.138)


Ran 23727 total tests in the following environments and test suites.

Environments
--
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
---
* build
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-syscalls-tests
* ltp-timers-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* ltp-fs-tests
* ltp-securebits-tests
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none
* ssuite

-- 
Linaro LKFT
https://lkft.linaro.org

Re: [PATCH] sh: Drop -Werror from kernel Makefile

2019-08-14 Thread Gustavo A. R. Silva

Guenter,

On 8/13/19 8:18 AM, Guenter Roeck wrote:
> 
> Please note that _mainline_ builds are currently broken.
> 

This should be fixed now:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=41de59634046b19cd53a1983594a95135c656997

Thanks
--
Gustavo

Re: [PATCH v8 05/14] media: rkisp1: add Rockchip ISP1 subdev driver

2019-08-14 Thread Helen Koike

Hi Sakari,

Thanks for your review. I just have some comments/questions below.

On 8/8/19 6:14 AM, Sakari Ailus wrote:
> Hi Helen,
> 
> On Tue, Jul 30, 2019 at 03:42:47PM -0300, Helen Koike wrote:
>> From: Jacob Chen 
>>
>> Add the subdev driver for rockchip isp1.
>>
>> Signed-off-by: Jacob Chen 
>> Signed-off-by: Shunqian Zheng 
>> Signed-off-by: Yichong Zhong 
>> Signed-off-by: Jacob Chen 
>> Signed-off-by: Eddie Cai 
>> Signed-off-by: Jeffy Chen 
>> Signed-off-by: Allon Huang 
>> Signed-off-by: Tomasz Figa 
>> [fixed unknown entity type / switched to PIXEL_RATE]
>> Signed-off-by: Ezequiel Garcia 
>> [update for upstream]
>> Signed-off-by: Helen Koike 
>>
>> ---
>>
>> Changes in v8: None
>> Changes in v7:
>> - fixed warning because of unknown entity type
>> - fixed v4l2-compliance errors regarding rkisp1 formats, try formats
>> and default values
>> - fix typo riksp1/rkisp1
>> - redesign: remove mipi/csi subdevice, sensors connect directly to the
>> isp subdevice in the media topology now. As a consequence, remove the
>> hack in mipidphy_g_mbus_config() where information from the sensor was
>> being propagated through the topology.
>> - From the old dphy:
>> * cache get_remote_sensor() in s_stream
>> * use V4L2_CID_PIXEL_RATE instead of V4L2_CID_LINK_FREQ
>> - Replace stream state with a boolean
>> - code styling and checkpatch fixes
>> - fix stop_stream (return after calling stop, do not reenable the stream)
>> - fix rkisp1_isp_sd_get_selection when V4L2_SUBDEV_FORMAT_TRY is set
>> - fix get format in output (isp_sd->out_fmt.mbus_code was being ignored)
>> - s/intput/input
>> - remove #define sd_to_isp_sd(_sd), add a static inline as it will be
>> reused by the capture
>>
>>  drivers/media/platform/rockchip/isp1/rkisp1.c | 1286 +
>>  drivers/media/platform/rockchip/isp1/rkisp1.h |  111 ++
>>  2 files changed, 1397 insertions(+)
>>  create mode 100644 drivers/media/platform/rockchip/isp1/rkisp1.c
>>  create mode 100644 drivers/media/platform/rockchip/isp1/rkisp1.h
>>
>> diff --git a/drivers/media/platform/rockchip/isp1/rkisp1.c 
>> b/drivers/media/platform/rockchip/isp1/rkisp1.c
>> new file mode 100644
>> index ..6d0c0ffb5e03
>> --- /dev/null
>> +++ b/drivers/media/platform/rockchip/isp1/rkisp1.c
>> @@ -0,0 +1,1286 @@
>> +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
>> +/*
>> + * Rockchip isp1 driver
>> + *
>> + * Copyright (C) 2017 Rockchip Electronics Co., Ltd.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "common.h"
>> +#include "regs.h"
>> +
>> +#define CIF_ISP_INPUT_W_MAX 4032
>> +#define CIF_ISP_INPUT_H_MAX 3024
>> +#define CIF_ISP_INPUT_W_MIN 32
>> +#define CIF_ISP_INPUT_H_MIN 32
>> +#define CIF_ISP_OUTPUT_W_MAXCIF_ISP_INPUT_W_MAX
>> +#define CIF_ISP_OUTPUT_H_MAXCIF_ISP_INPUT_H_MAX
>> +#define CIF_ISP_OUTPUT_W_MINCIF_ISP_INPUT_W_MIN
>> +#define CIF_ISP_OUTPUT_H_MINCIF_ISP_INPUT_H_MIN
>> +
>> +/*
>> + * NOTE: MIPI controller and input MUX are also configured in this file,
>> + * because ISP Subdev is not only describe ISP submodule(input size,format,
>> + * output size, format), but also a virtual route device.
>> + */
>> +
>> +/*
>> + * There are many variables named with format/frame in below code,
>> + * please see here for their meaning.
>> + *
>> + * Cropping regions of ISP
>> + *
>> + * +-+
>> + * | Sensor image|
>> + * | +---+   |
>> + * | | ISP_ACQ (for black level) |   |
>> + * | | in_frm|   |
>> + * | | ++|   |
>> + * | | |ISP_OUT ||   |
>> + * | | |in_crop ||   |
>> + * | | |+-+ ||   |
>> + * | | ||   ISP_IS| ||   |
>> + * | | ||   rkisp1_isp_subdev: out_crop   | ||   |
>> + * | | |+-+ ||   |
>> + * | | ++|   |
>> + * | +---+   |
>> + * +-+
>> + */
>> +
>> +static inline struct rkisp1_device *sd_to_isp_dev(struct v4l2_subdev *sd)
>> +{
>> +return container_of(sd->v4l2_dev, struct rkisp1_device, v4l2_dev);
>> +}
>> +
>> +/* Get sensor by enabled media link */
>> +static struct v4l2_subdev *get_remote_sensor(struct v4l2_subdev *sd)
>> +{
>> +struct media_pad *local, *remote;
>> +struct media_entity *sensor_me;
>> +
>> +local = >entity.pads[RKISP1_ISP_PAD_SINK];
>> +remote = media_entity_remote_pad(local);
>> +if (!remote) {

RE: [PATCH] scsi: fnic: remove redundant assignment of variable rc

2019-08-14 Thread Karan Tilak Kumar (kartilak)

Acked-by:   Karan Tilak Kumar   

-Original Message-
From: Colin King  
Sent: Tuesday, August 13, 2019 6:24 AM
To: Satish Kharat (satishkh) ; Sesidhar Baddela (sebaddel) 
; Karan Tilak Kumar (kartilak) ; James 
E . J . Bottomley ; Martin K . Petersen 
; linux-s...@vger.kernel.org
Cc: kernel-janit...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: [PATCH] scsi: fnic: remove redundant assignment of variable rc

From: Colin Ian King 

Variable rc is initialized to a value that is never read; it is later
re-assigned and immediately returned. Clean up the code by removing rc
and just returning 0.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King 
---
 drivers/scsi/fnic/fnic_debugfs.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/scsi/fnic/fnic_debugfs.c b/drivers/scsi/fnic/fnic_debugfs.c
index 21991c99db7c..13f7d88d6e57 100644
--- a/drivers/scsi/fnic/fnic_debugfs.c
+++ b/drivers/scsi/fnic/fnic_debugfs.c
@@ -52,7 +52,6 @@ static struct fc_trace_flag_type *fc_trc_flag;
  */
 int fnic_debugfs_init(void)
 {
-   int rc = -1;
fnic_trace_debugfs_root = debugfs_create_dir("fnic", NULL);
 
fnic_stats_debugfs_root = debugfs_create_dir("statistics",
@@ -70,8 +69,7 @@ int fnic_debugfs_init(void)
fc_trc_flag->fc_clear = 4;
}
 
-   rc = 0;
-   return rc;
+   return 0;
 }
 
 /*
-- 
2.20.1

[PATCH v10 5/7] powerpc/memcpy: Add memcpy_mcsafe for pmem

2019-08-14 Thread Santosh Sivaraj

From: Balbir Singh 

The pmem infrastructure uses memcpy_mcsafe in the pmem layer to convert a
machine check exception encountered during the memcpy into a return
value instead of a failure. The return value is the number of bytes
remaining to be copied.

This patch largely borrows from the copyuser_power7 logic and does not add
the VMX optimizations, largely to keep the patch simple. If needed those
optimizations can be folded in.

Signed-off-by: Balbir Singh 
[ar...@linux.ibm.com: Added symbol export]
Co-developed-by: Santosh Sivaraj 
Signed-off-by: Santosh Sivaraj 
---
 arch/powerpc/include/asm/string.h   |   2 +
 arch/powerpc/lib/Makefile   |   2 +-
 arch/powerpc/lib/memcpy_mcsafe_64.S | 242 
 3 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/lib/memcpy_mcsafe_64.S

diff --git a/arch/powerpc/include/asm/string.h 
b/arch/powerpc/include/asm/string.h
index 9bf6dffb4090..b72692702f35 100644
--- a/arch/powerpc/include/asm/string.h
+++ b/arch/powerpc/include/asm/string.h
@@ -53,7 +53,9 @@ void *__memmove(void *to, const void *from, __kernel_size_t 
n);
 #ifndef CONFIG_KASAN
 #define __HAVE_ARCH_MEMSET32
 #define __HAVE_ARCH_MEMSET64
+#define __HAVE_ARCH_MEMCPY_MCSAFE
 
+extern int memcpy_mcsafe(void *dst, const void *src, __kernel_size_t sz);
 extern void *__memset16(uint16_t *, uint16_t v, __kernel_size_t);
 extern void *__memset32(uint32_t *, uint32_t v, __kernel_size_t);
 extern void *__memset64(uint64_t *, uint64_t v, __kernel_size_t);
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index eebc782d89a5..fa6b1b657b43 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -39,7 +39,7 @@ obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o 
copypage_power7.o \
   memcpy_power7.o
 
 obj64-y+= copypage_64.o copyuser_64.o mem_64.o hweight_64.o \
-  memcpy_64.o pmem.o
+  memcpy_64.o pmem.o memcpy_mcsafe_64.o
 
 obj64-$(CONFIG_SMP)+= locks.o
 obj64-$(CONFIG_ALTIVEC)+= vmx-helper.o
diff --git a/arch/powerpc/lib/memcpy_mcsafe_64.S 
b/arch/powerpc/lib/memcpy_mcsafe_64.S
new file mode 100644
index ..949976dc115d
--- /dev/null
+++ b/arch/powerpc/lib/memcpy_mcsafe_64.S
@@ -0,0 +1,242 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) IBM Corporation, 2011
+ * Derived from copyuser_power7.s by Anton Blanchard 
+ * Author - Balbir Singh 
+ */
+#include 
+#include 
+#include 
+
+   .macro err1
+100:
+   EX_TABLE(100b,.Ldo_err1)
+   .endm
+
+   .macro err2
+200:
+   EX_TABLE(200b,.Ldo_err2)
+   .endm
+
+   .macro err3
+300:   EX_TABLE(300b,.Ldone)
+   .endm
+
+.Ldo_err2:
+   ld  r22,STK_REG(R22)(r1)
+   ld  r21,STK_REG(R21)(r1)
+   ld  r20,STK_REG(R20)(r1)
+   ld  r19,STK_REG(R19)(r1)
+   ld  r18,STK_REG(R18)(r1)
+   ld  r17,STK_REG(R17)(r1)
+   ld  r16,STK_REG(R16)(r1)
+   ld  r15,STK_REG(R15)(r1)
+   ld  r14,STK_REG(R14)(r1)
+   addir1,r1,STACKFRAMESIZE
+.Ldo_err1:
+   /* Do a byte by byte copy to get the exact remaining size */
+   mtctr   r7
+46:
+err3;  lbz r0,0(r4)
+   addir4,r4,1
+err3;  stb r0,0(r3)
+   addir3,r3,1
+   bdnz46b
+   li  r3,0
+   blr
+
+.Ldone:
+   mfctr   r3
+   blr
+
+
+_GLOBAL(memcpy_mcsafe)
+   mr  r7,r5
+   cmpldi  r5,16
+   blt .Lshort_copy
+
+.Lcopy:
+   /* Get the source 8B aligned */
+   neg r6,r4
+   mtocrf  0x01,r6
+   clrldi  r6,r6,(64-3)
+
+   bf  cr7*4+3,1f
+err1;  lbz r0,0(r4)
+   addir4,r4,1
+err1;  stb r0,0(r3)
+   addir3,r3,1
+   subir7,r7,1
+
+1: bf  cr7*4+2,2f
+err1;  lhz r0,0(r4)
+   addir4,r4,2
+err1;  sth r0,0(r3)
+   addir3,r3,2
+   subir7,r7,2
+
+2: bf  cr7*4+1,3f
+err1;  lwz r0,0(r4)
+   addir4,r4,4
+err1;  stw r0,0(r3)
+   addir3,r3,4
+   subir7,r7,4
+
+3: sub r5,r5,r6
+   cmpldi  r5,128
+   blt 5f
+
+   mflrr0
+   stdur1,-STACKFRAMESIZE(r1)
+   std r14,STK_REG(R14)(r1)
+   std r15,STK_REG(R15)(r1)
+   std r16,STK_REG(R16)(r1)
+   std r17,STK_REG(R17)(r1)
+   std r18,STK_REG(R18)(r1)
+   std r19,STK_REG(R19)(r1)
+   std r20,STK_REG(R20)(r1)
+   std r21,STK_REG(R21)(r1)
+   std r22,STK_REG(R22)(r1)
+   std r0,STACKFRAMESIZE+16(r1)
+
+   srdir6,r5,7
+   mtctr   r6
+
+   /* Now do cacheline (128B) sized loads and stores. */
+   .align  5
+4:
+err2;  ld  r0,0(r4)
+err2;  ld  r6,8(r4)
+err2;  ld  r8,16(r4)
+err2;  ld  r9,24(r4)
+err2;  ld  r10,32(r4)
+err2;  ld  r11,40(r4)
+err2;  ld  r12,48(r4)
+err2;  ld  r14,56(r4)
+err2;  ld  r15,64(r4)
+err2;  ld
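
The return contract described in the changelog above (the number of bytes
remaining to be copied) can be illustrated with a minimal userspace sketch.
This is not the powerpc implementation: copy_report_remaining() and its
`poison` parameter are hypothetical names, and the "fault" is simulated by an
offset check rather than by a real machine check and exception table fixup.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of the memcpy_mcsafe() return contract:
 * copy up to n bytes and return how many bytes were NOT copied.
 * `poison` is an illustrative stand-in for the source offset at which a
 * machine check would fire; pass n (or more) to model a clean copy.
 */
static size_t copy_report_remaining(unsigned char *dst,
				    const unsigned char *src,
				    size_t n, size_t poison)
{
	size_t i;

	assert(dst != NULL && src != NULL);

	for (i = 0; i < n; i++) {
		if (i == poison)
			return n - i;	/* simulated fault: bytes left to copy */
		dst[i] = src[i];
	}

	return 0;			/* whole range copied */
}
```

Under this model a caller only needs to compare the result with zero; a
nonzero value tells it how much of the tail of the buffer was not copied.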

[PATCH v10 6/7] powerpc/mce: Handle UE event for memcpy_mcsafe

2019-08-14 Thread Santosh Sivaraj

From: Balbir Singh 

If we take a UE on one of the instructions with a fixup entry, set nip
to continue execution at the fixup entry. Do not process the event
further or print it.

Co-developed-by: Reza Arbab 
Signed-off-by: Reza Arbab 
Signed-off-by: Balbir Singh 
Signed-off-by: Santosh Sivaraj 
Reviewed-by: Mahesh Salgaonkar 
---
 arch/powerpc/include/asm/mce.h  |  4 +++-
 arch/powerpc/kernel/mce.c   | 16 
 arch/powerpc/kernel/mce_power.c | 15 +--
 3 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index f3a6036b6bc0..e1931c8c2743 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -122,7 +122,8 @@ struct machine_check_event {
enum MCE_UeErrorType ue_error_type:8;
u8  effective_address_provided;
u8  physical_address_provided;
-   u8  reserved_1[5];
+   u8  ignore_event;
+   u8  reserved_1[4];
u64 effective_address;
u64 physical_address;
u8  reserved_2[8];
@@ -193,6 +194,7 @@ struct mce_error_info {
enum MCE_Initiator  initiator:8;
enum MCE_ErrorClass error_class:8;
boolsync_error;
+   boolignore_event;
 };
 
 #define MAX_MC_EVT 100
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index a3b122a685a5..ec4b3e1087be 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -149,6 +149,7 @@ void save_mce_event(struct pt_regs *regs, long handled,
if (phys_addr != ULONG_MAX) {
mce->u.ue_error.physical_address_provided = true;
mce->u.ue_error.physical_address = phys_addr;
+   mce->u.ue_error.ignore_event = mce_err->ignore_event;
machine_check_ue_event(mce);
}
}
@@ -266,8 +267,17 @@ static void machine_process_ue_event(struct work_struct 
*work)
/*
 * This should probably queued elsewhere, but
 * oh! well
+*
+* Don't report this machine check because the caller has a
+* asked us to ignore the event, it has a fixup handler which
+* will do the appropriate error handling and reporting.
 */
if (evt->error_type == MCE_ERROR_TYPE_UE) {
+   if (evt->u.ue_error.ignore_event) {
+   __this_cpu_dec(mce_ue_count);
+   continue;
+   }
+
if (evt->u.ue_error.physical_address_provided) {
unsigned long pfn;
 
@@ -301,6 +311,12 @@ static void machine_check_process_queued_event(struct 
irq_work *work)
while (__this_cpu_read(mce_queue_count) > 0) {
index = __this_cpu_read(mce_queue_count) - 1;
evt = this_cpu_ptr(_event_queue[index]);
+
+   if (evt->error_type == MCE_ERROR_TYPE_UE &&
+   evt->u.ue_error.ignore_event) {
+   __this_cpu_dec(mce_queue_count);
+   continue;
+   }
machine_check_print_event_info(evt, false, false);
__this_cpu_dec(mce_queue_count);
}
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index e74816f045f8..1dd87f6f5186 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -18,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Convert an address related to an mm to a physical address.
@@ -559,9 +561,18 @@ static int mce_handle_derror(struct pt_regs *regs,
return 0;
 }
 
-static long mce_handle_ue_error(struct pt_regs *regs)
+static long mce_handle_ue_error(struct pt_regs *regs,
+   struct mce_error_info *mce_err)
 {
long handled = 0;
+   const struct exception_table_entry *entry;
+
+   entry = search_kernel_exception_table(regs->nip);
+   if (entry) {
+   mce_err->ignore_event = true;
+   regs->nip = extable_fixup(entry);
+   return 1;
+   }
 
/*
 * On specific SCOM read via MMIO we may get a machine check
@@ -594,7 +605,7 @@ static long mce_handle_error(struct pt_regs *regs,
_addr);
 
if (!handled && mce_err.error_type == MCE_ERROR_TYPE_UE)
-   handled = mce_handle_ue_error(regs);
+   handled = mce_handle_ue_error(regs, _err);
 
save_mce_event(regs,

[PATCH v10 7/7] powerpc: add machine check safe copy_to_user

2019-08-14 Thread Santosh Sivaraj

Use the memcpy_mcsafe() implementation to define copy_to_user_mcsafe().

Signed-off-by: Santosh Sivaraj 
---
 arch/powerpc/Kconfig   |  1 +
 arch/powerpc/include/asm/uaccess.h | 14 ++
 2 files changed, 15 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 77f6ebf97113..4316e36095a2 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -137,6 +137,7 @@ config PPC
select ARCH_HAS_STRICT_KERNEL_RWX   if ((PPC_BOOK3S_64 || PPC32) && 
!RELOCATABLE && !HIBERNATION)
select ARCH_HAS_TICK_BROADCAST  if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE  if PPC64
+   select ARCH_HAS_UACCESS_MCSAFE  if PPC64
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_KEEP_MEMBLOCK
diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 8b03eb44e876..15002b51ff18 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -387,6 +387,20 @@ static inline unsigned long raw_copy_to_user(void __user 
*to,
return ret;
 }
 
+static __always_inline unsigned long __must_check
+copy_to_user_mcsafe(void __user *to, const void *from, unsigned long n)
+{
+   if (likely(check_copy_size(from, n, true))) {
+   if (access_ok(to, n)) {
+   allow_write_to_user(to, n);
+   n = memcpy_mcsafe((void *)to, from, n);
+   prevent_write_to_user(to, n);
+   }
+   }
+
+   return n;
+}
+
 extern unsigned long __clear_user(void __user *addr, unsigned long size);
 
 static inline unsigned long clear_user(void __user *addr, unsigned long size)
-- 
2.21.0

[PATCH v10 4/7] extable: Add function to search only kernel exception table

2019-08-14 Thread Santosh Sivaraj

In certain architecture-specific operating modes (e.g., the powerpc machine
check handler, which is unable to access vmalloc memory),
search_exception_tables() cannot be called because it also searches the
module exception tables if the entry is not found in the kernel exception
table. Add search_kernel_exception_table(), which searches only the kernel
exception table.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Nicholas Piggin 
Signed-off-by: Santosh Sivaraj 
Reviewed-by: Nicholas Piggin 
---
 include/linux/extable.h |  2 ++
 kernel/extable.c| 11 +--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/extable.h b/include/linux/extable.h
index 41c5b3a25f67..81ecfaa83ad3 100644
--- a/include/linux/extable.h
+++ b/include/linux/extable.h
@@ -19,6 +19,8 @@ void trim_init_extable(struct module *m);
 
 /* Given an address, look for it in the exception tables */
 const struct exception_table_entry *search_exception_tables(unsigned long add);
+const struct exception_table_entry *
+search_kernel_exception_table(unsigned long addr);
 
 #ifdef CONFIG_MODULES
 /* For extable.c to search modules' exception tables. */
diff --git a/kernel/extable.c b/kernel/extable.c
index e23cce6e6092..f6c9406eec7d 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -40,13 +40,20 @@ void __init sort_main_extable(void)
}
 }
 
+/* Given an address, look for it in the kernel exception table */
+const
+struct exception_table_entry *search_kernel_exception_table(unsigned long addr)
+{
+   return search_extable(__start___ex_table,
+ __stop___ex_table - __start___ex_table, addr);
+}
+
 /* Given an address, look for it in the exception tables. */
 const struct exception_table_entry *search_exception_tables(unsigned long addr)
 {
const struct exception_table_entry *e;
 
-   e = search_extable(__start___ex_table,
-  __stop___ex_table - __start___ex_table, addr);
+   e = search_kernel_exception_table(addr);
if (!e)
e = search_module_extables(addr);
return e;
-- 
2.21.0
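
The split in this patch (a core-only lookup, with the module tables as a
separate fallback) can be sketched in plain C. This is an illustrative model
only: struct ex_entry, find_entry() and find_any() are hypothetical names,
and the entries hold absolute addresses, whereas the real kernel tables may
store relative offsets and are searched through search_extable().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified table entry: faulting instruction address -> fixup address. */
struct ex_entry {
	unsigned long insn;
	unsigned long fixup;
};

/* Binary search of one table that is sorted by ascending insn address. */
static const struct ex_entry *find_entry(const struct ex_entry *base,
					 size_t num, unsigned long addr)
{
	size_t lo = 0, hi = num;

	assert(base != NULL || num == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (base[mid].insn == addr)
			return &base[mid];
		if (base[mid].insn < addr)
			lo = mid + 1;
		else
			hi = mid;
	}

	return NULL;
}

/* Full lookup: core table first, then fall back to a second (module) table. */
static const struct ex_entry *find_any(const struct ex_entry *core, size_t ncore,
				       const struct ex_entry *mod, size_t nmod,
				       unsigned long addr)
{
	const struct ex_entry *e = find_entry(core, ncore, addr);

	if (!e)
		e = find_entry(mod, nmod, addr);

	return e;
}
```

A context that must not touch the second table, such as the machine check
handler mentioned in the changelog, would call only the core lookup.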

[PATCH v10 2/7] powerpc/mce: Fix MCE handling for huge pages

2019-08-14 Thread Santosh Sivaraj

From: Balbir Singh 

The current code would fail on huge page addresses, since the shift would
be incorrect. Use the correct page shift value returned by
__find_linux_pte() to get the correct physical address. The code is more
generic and can handle both regular and compound pages.

Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")
Signed-off-by: Balbir Singh 
[ar...@linux.ibm.com: Fixup pseries_do_memory_failure()]
Signed-off-by: Reza Arbab 
Co-developed-by: Santosh Sivaraj 
Signed-off-by: Santosh Sivaraj 
Tested-by: Mahesh Salgaonkar 
Cc: sta...@vger.kernel.org # v4.15+
---
 arch/powerpc/include/asm/mce.h   |  2 +-
 arch/powerpc/kernel/mce_power.c  | 55 ++--
 arch/powerpc/platforms/pseries/ras.c |  9 ++---
 3 files changed, 32 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index a4c6a74ad2fb..f3a6036b6bc0 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -209,7 +209,7 @@ extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
 extern void machine_check_print_event_info(struct machine_check_event *evt,
   bool user_mode, bool in_guest);
-unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
+unsigned long addr_to_phys(struct pt_regs *regs, unsigned long addr);
 #ifdef CONFIG_PPC_BOOK3S_64
 void flush_and_reload_slb(void);
 #endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index a814d2dfb5b0..e74816f045f8 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -20,13 +20,14 @@
 #include 
 
 /*
- * Convert an address related to an mm to a PFN. NOTE: we are in real
- * mode, we could potentially race with page table updates.
+ * Convert an address related to an mm to a physical address.
+ * NOTE: we are in real mode, we could potentially race with page table 
updates.
  */
-unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
+unsigned long addr_to_phys(struct pt_regs *regs, unsigned long addr)
 {
-   pte_t *ptep;
-   unsigned long flags;
+   pte_t *ptep, pte;
+   unsigned int shift;
+   unsigned long flags, phys_addr;
struct mm_struct *mm;
 
if (user_mode(regs))
@@ -35,14 +36,21 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned 
long addr)
mm = _mm;
 
local_irq_save(flags);
-   if (mm == current->mm)
-   ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
-   else
-   ptep = find_init_mm_pte(addr, NULL);
+   ptep = __find_linux_pte(mm->pgd, addr, NULL, );
local_irq_restore(flags);
+
if (!ptep || pte_special(*ptep))
return ULONG_MAX;
-   return pte_pfn(*ptep);
+
+   pte = *ptep;
+   if (shift > PAGE_SHIFT) {
+   unsigned long rpnmask = (1ul << shift) - PAGE_SIZE;
+
+   pte = __pte(pte_val(pte) | (addr & rpnmask));
+   }
+   phys_addr = pte_pfn(pte) << PAGE_SHIFT;
+
+   return phys_addr;
 }
 
 /* flush SLBs and reload */
@@ -344,7 +352,7 @@ static const struct mce_derror_table mce_p9_derror_table[] 
= {
   MCE_INITIATOR_CPU,   MCE_SEV_SEVERE, true },
 { 0, false, 0, 0, 0, 0, 0 } };
 
-static int mce_find_instr_ea_and_pfn(struct pt_regs *regs, uint64_t *addr,
+static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
uint64_t *phys_addr)
 {
/*
@@ -354,18 +362,16 @@ static int mce_find_instr_ea_and_pfn(struct pt_regs 
*regs, uint64_t *addr,
 * faults
 */
int instr;
-   unsigned long pfn, instr_addr;
+   unsigned long instr_addr;
struct instruction_op op;
struct pt_regs tmp = *regs;
 
-   pfn = addr_to_pfn(regs, regs->nip);
-   if (pfn != ULONG_MAX) {
-   instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
+   instr_addr = addr_to_phys(regs, regs->nip) + (regs->nip & ~PAGE_MASK);
+   if (instr_addr != ULONG_MAX) {
instr = *(unsigned int *)(instr_addr);
if (!analyse_instr(, , instr)) {
-   pfn = addr_to_pfn(regs, op.ea);
*addr = op.ea;
-   *phys_addr = (pfn << PAGE_SHIFT);
+   *phys_addr = addr_to_phys(regs, op.ea);
return 0;
}
/*
@@ -440,15 +446,9 @@ static int mce_handle_ierror(struct pt_regs *regs,
*addr = regs->nip;
if (mce_err->sync_error &&
table[i].error_type == MCE_ERROR_TYPE_UE) {
-   unsigned long pfn;
-
-   if (get_paca()->in_mce < MAX_MCE_DEPTH) {
-   pfn = addr_to_pfn(regs, regs->nip);
-

[PATCH v10 1/7] powerpc/mce: Schedule work from irq_work

2019-08-14 Thread Santosh Sivaraj

schedule_work() cannot be called from MCE exception context as MCE can
interrupt even in interrupt disabled context.

Fixes: 733e4a4c ("powerpc/mce: hookup memory_failure for UE errors")
Suggested-by: Mahesh Salgaonkar 
Signed-off-by: Santosh Sivaraj 
Reviewed-by: Mahesh Salgaonkar 
Acked-by: Balbir Singh 
Cc: sta...@vger.kernel.org # v4.15+
---
 arch/powerpc/kernel/mce.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index b18df633eae9..cff31d4a501f 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -33,6 +33,7 @@ static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT],
mce_ue_event_queue);
 
 static void machine_check_process_queued_event(struct irq_work *work);
+static void machine_check_ue_irq_work(struct irq_work *work);
 void machine_check_ue_event(struct machine_check_event *evt);
 static void machine_process_ue_event(struct work_struct *work);
 
@@ -40,6 +41,10 @@ static struct irq_work mce_event_process_work = {
 .func = machine_check_process_queued_event,
 };
 
+static struct irq_work mce_ue_event_irq_work = {
+   .func = machine_check_ue_irq_work,
+};
+
 DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
 
 static void mce_set_error_info(struct machine_check_event *mce,
@@ -199,6 +204,10 @@ void release_mce_event(void)
get_mce_event(NULL, true);
 }
 
+static void machine_check_ue_irq_work(struct irq_work *work)
+{
+   schedule_work(_ue_event_work);
+}
 
 /*
  * Queue up the MCE event which then can be handled later.
@@ -216,7 +225,7 @@ void machine_check_ue_event(struct machine_check_event *evt)
memcpy(this_cpu_ptr(_ue_event_queue[index]), evt, sizeof(*evt));
 
/* Queue work to process this event later. */
-   schedule_work(&mce_ue_event_work);
+   irq_work_queue(&mce_ue_event_irq_work);
 }
 
 /*
-- 
2.21.0
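
The two-stage deferral in this patch can be modelled in userspace roughly as
follows. This is only a sketch: raise_from_mce(), run_irq_work() and
run_workqueue() are hypothetical stand-ins for the MCE handler, the irq_work
callback and the workqueue worker, and plain flags replace the kernel's
irq_work and workqueue machinery.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static bool irq_work_pending;   /* set from "MCE context" */
static bool work_scheduled;     /* set from "irq_work context" */
static int  events_queued;      /* events waiting to be handled */
static int  events_processed;   /* bumped from "process context" */

/* Stage 0: MCE context. Only record the event and raise irq_work. */
static void raise_from_mce(void)
{
    events_queued++;
    irq_work_pending = true;    /* models irq_work_queue() */
}

/* Stage 1: irq_work context. Scheduling regular work is safe here. */
static void run_irq_work(void)
{
    if (!irq_work_pending)
        return;
    irq_work_pending = false;
    work_scheduled = true;      /* models schedule_work() */
}

/* Stage 2: process context. Drain everything that was queued. */
static void run_workqueue(void)
{
    if (!work_scheduled)
        return;
    work_scheduled = false;
    assert(events_queued >= 0);
    events_processed += events_queued;
    events_queued = 0;
}
```

In this model, calling run_workqueue() straight after raise_from_mce() does
nothing; the event is only processed once run_irq_work() has run in between.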

[PATCH v10 3/7] powerpc/mce: Make machine_check_ue_event() static

2019-08-14 Thread Santosh Sivaraj

From: Reza Arbab 

The function doesn't get used outside this file, so make it static.

Signed-off-by: Reza Arbab 
Signed-off-by: Santosh Sivaraj 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/kernel/mce.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index cff31d4a501f..a3b122a685a5 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -34,7 +34,7 @@ static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT],
 
 static void machine_check_process_queued_event(struct irq_work *work);
 static void machine_check_ue_irq_work(struct irq_work *work);
-void machine_check_ue_event(struct machine_check_event *evt);
+static void machine_check_ue_event(struct machine_check_event *evt);
 static void machine_process_ue_event(struct work_struct *work);
 
 static struct irq_work mce_event_process_work = {
@@ -212,7 +212,7 @@ static void machine_check_ue_irq_work(struct irq_work *work)
 /*
  * Queue up the MCE event which then can be handled later.
  */
-void machine_check_ue_event(struct machine_check_event *evt)
+static void machine_check_ue_event(struct machine_check_event *evt)
 {
int index;
 
-- 
2.21.0

[PATCH v10 0/7] powerpc: implement machine check safe memcpy

2019-08-14 Thread Santosh Sivaraj

During a memcpy from a pmem device, if a machine check exception is
generated we end up in a panic. In case of fsdax read, this should
only result in a -EIO. Avoid MCE by implementing memcpy_mcsafe.

Before this patch series:

```
bash-4.4# mount -o dax /dev/pmem0 /mnt/pmem/
[ 7621.714094] Disabling lock debugging due to kernel taint
[ 7621.714099] MCE: CPU0: machine check (Severe) Host UE Load/Store [Not recovered]
[ 7621.714104] MCE: CPU0: NIP: [c0088978] memcpy_power7+0x418/0x7e0
[ 7621.714107] MCE: CPU0: Hardware error
[ 7621.714112] opal: Hardware platform error: Unrecoverable Machine Check exception
[ 7621.714118] CPU: 0 PID: 1368 Comm: mount Tainted: G   M  5.2.0-rc5-00239-g241e39004581 #50
[ 7621.714123] NIP:  c0088978 LR: c08e16f8 CTR: 01de
[ 7621.714129] REGS: c000fffbfd70 TRAP: 0200   Tainted: G   M  (5.2.0-rc5-00239-g241e39004581)
[ 7621.714131] MSR:  92209033   CR: 24428840  XER: 0004
[ 7621.714160] CFAR: c00889a8 DAR: deadbeefdeadbeef DSISR: 8000 IRQMASK: 0
[ 7621.714171] GPR00: 0e00 c000f0b8b1e0 c12cf100 c000ed8e1100
[ 7621.714186] GPR04: c2001100 0001 0200 03fff1272000
[ 7621.714201] GPR08: 8000 0010 0020 0030
[ 7621.714216] GPR12: 0040 7fffb8c6d390 0050 0060
[ 7621.714232] GPR16: 0070  0001 c000f0b8b960
[ 7621.714247] GPR20: 0001 c000f0b8b940 0001 0001
[ 7621.714262] GPR24: c1382560 c00c003b6380 c00c003b6380 0001
[ 7621.714277] GPR28:  0001 c200 0001
[ 7621.714294] NIP [c0088978] memcpy_power7+0x418/0x7e0
[ 7621.714298] LR [c08e16f8] pmem_do_bvec+0xf8/0x430
...  ...
```

After this patch series:

```
bash-4.4# mount -o dax /dev/pmem0 /mnt/pmem/
[25302.883978] Buffer I/O error on dev pmem0, logical block 0, async page read
[25303.020816] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25303.021236] EXT4-fs (pmem0): Can't read superblock on 2nd try
[25303.152515] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25303.284031] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[25304.084100] UDF-fs: bad mount option "dax" or missing value
mount: /mnt/pmem: wrong fs type, bad option, bad superblock on /dev/pmem0, missing codepage or helper program, or other error.
```

MCE is injected on a pmem address using mambo. The last patch which adds a
nop is only for testing on mambo, where r13 is not restored upon hitting
vector 200.
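
The contract this series relies on, a copy that reports how many bytes were
left uncopied instead of taking the machine down, can be sketched in
userspace. The poisoned range is simulated through the hypothetical
poison_start/poison_len parameters, and copy_mcsafe_sketch() is not the
series' assembly implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Copy n bytes from src to dst, stopping at the first "poisoned" source
 * byte. Returns the number of bytes NOT copied (0 on full success).
 * The range [poison_start, poison_start + poison_len) stands in for
 * memory that would raise a machine check on load.
 */
static size_t copy_mcsafe_sketch(unsigned char *dst, const unsigned char *src,
                                 size_t n, size_t poison_start,
                                 size_t poison_len)
{
    size_t i;

    assert(dst && src);
    for (i = 0; i < n; i++) {
        if (i >= poison_start && i - poison_start < poison_len)
            return n - i;       /* bytes remaining, as a fixup path would report */
        dst[i] = src[i];
    }
    return 0;
}

/* A caller in the style of a pmem read: turn a short copy into -EIO. */
static int read_sketch(unsigned char *dst, const unsigned char *src, size_t n,
                       size_t poison_start, size_t poison_len)
{
    return copy_mcsafe_sketch(dst, src, n, poison_start, poison_len) ?
        -EIO : 0;
}
```

The caller never sees a crash in this model; it only has to check for a
non-zero return and propagate an error.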

The memcpy code can be optimised further with VMX, and GAS macros can be
used to enable code reusability; I will send both as a separate series.
--
v10: Fix authorship; add reviewed-bys and acks.

v9:
* Add a new IRQ work for UE events [mahesh]
* Reorder patches, and copy stable

v8:
* While ignoring UE events, return was used instead of continue.
* Checkpatch fixups for commit log

v7:
* Move schedule_work to be called from irq_work.

v6:
* Don't return pfn, all callers are expecting physical address anyway [nick]
* Patch re-ordering: move exception table patch before memcpy_mcsafe patch [nick]
* Reword commit log for search_exception_tables patch [nick]

v5:
* Don't use search_exception_tables since it also searches module exception
  tables [Nicholas]
* Fix commit message for patch 2 [Nicholas]

v4:
* Squash return remaining bytes patch to memcpy_mcsafe implementation patch
  [christophe]
* Access ok should be checked for copy_to_user_mcsafe() [christophe]

v3:
* Drop patch which enables DR/IR for external modules
* Drop notifier call chain, we don't want to do that in real mode
* Return remaining bytes from memcpy_mcsafe correctly
* We no longer restore r13 for simulator tests, rather use a nop at 
  vector 0x200 [workaround for simulator; not to be merged]

v2:
* Don't set RI bit explicitly [mahesh]
* Re-ordered series to get r13 workaround as the last patch

--
Balbir Singh (3):
  powerpc/mce: Fix MCE handling for huge pages
  powerpc/memcpy: Add memcpy_mcsafe for pmem
  powerpc/mce: Handle UE event for memcpy_mcsafe

Reza Arbab (1):
  powerpc/mce: Make machine_check_ue_event() static

Santosh Sivaraj (3):
  powerpc/mce: Schedule work from irq_work
  extable: Add function to search only kernel exception table
  powerpc: add machine check safe copy_to_user

 arch/powerpc/Kconfig |   1 +
 arch/powerpc/include/asm/mce.h   |   6 +-
 arch/powerpc/include/asm/string.h|   2 +
 arch/powerpc/include/asm/uaccess.h   |  14 ++
 arch/powerpc/kernel/mce.c|  31 +++-
 arch/powerpc/kernel/mce_power.c  |  70 
 arch/powerpc/lib/Makefile|   2 +-
 arch/powerpc/lib/memcpy_mcsafe_64.S  | 242

Re: [PATCH v6 5/8] clk: mediatek: Add MT6765 clock support

2019-08-14 Thread Stephen Boyd

Quoting Macpaul Lin (2019-07-12 02:43:41)
> diff --git a/drivers/clk/mediatek/clk-mt6765-audio.c 
> b/drivers/clk/mediatek/clk-mt6765-audio.c
> new file mode 100644
> index ..41f19343dfb9
> --- /dev/null
> +++ b/drivers/clk/mediatek/clk-mt6765-audio.c
> @@ -0,0 +1,109 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2018 MediaTek Inc.
> + * Author: Owen Chen 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.

Please use SPDX tags.

> + */
> +
> +#include 
> +#include 
> +
> +#include "clk-mtk.h"
> +#include "clk-gate.h"
> +
> diff --git a/drivers/clk/mediatek/clk-mt6765-vcodec.c 
> b/drivers/clk/mediatek/clk-mt6765-vcodec.c
> new file mode 100644
> index ..eb9ae1c2c99c
> --- /dev/null
> +++ b/drivers/clk/mediatek/clk-mt6765-vcodec.c
> @@ -0,0 +1,79 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2018 MediaTek Inc.
> + * Author: Owen Chen 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */

SPDX tags.

> diff --git a/drivers/clk/mediatek/clk-mt6765.c 
> b/drivers/clk/mediatek/clk-mt6765.c
> new file mode 100644
> index ..f716a48a926d
> --- /dev/null
> +++ b/drivers/clk/mediatek/clk-mt6765.c
> @@ -0,0 +1,961 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2018 MediaTek Inc.
> + * Author: Owen Chen 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.

SPDX tags.

> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

Is this used? Maybe I deleted it.

> +#include 
> +#include 
[...]
> +
> +static const char * const axi_parents[] = {
> +   "clk26m",
> +   "syspll_d7",
> +   "syspll1_d4",
> +   "syspll3_d2"
> +};
> +
> +static const char * const mem_parents[] = {
> +   "clk26m",
> +   "dmpll_ck",
> +   "apll1_ck"
> +};
> +
> +static const char * const mm_parents[] = {
> +   "clk26m",
> +   "mmpll_ck",
> +   "syspll1_d2",
> +   "syspll_d5",
> +   "syspll1_d4",
> +   "univpll_d5",
> +   "univpll1_d2",
> +   "mmpll_d2"
> +};
> +
> +static const char * const scp_parents[] = {
> +   "clk26m",
> +   "syspll4_d2",
> +   "univpll2_d2",
> +   "syspll1_d2",
> +   "univpll1_d2",
> +   "syspll_d3",
> +   "univpll_d3"
> +};
> +
> +static const char * const mfg_parents[] = {
> +   "clk26m",
> +   "mfgpll_ck",
> +   "syspll_d3",
> +   "univpll_d3"
> +};
> +
> +static const char * const atb_parents[] = {
> +   "clk26m",
> +   "syspll1_d4",
> +   "syspll1_d2"
> +};
> +
> +static const char * const camtg_parents[] = {
> +   "clk26m",
> +   "usb20_192m_d8",
> +   "univpll2_d8",
> +   "usb20_192m_d4",
> +   "univpll2_d32",
> +   "usb20_192m_d16",
> +   "usb20_192m_d32"
> +};
> +
> +static const char * const uart_parents[] = {
> +   "clk26m",
> +   "univpll2_d8"
> +};
> +
> +static const char * const spi_parents[] = {
> +   "clk26m",
> +   "syspll3_d2",
> +   "syspll4_d2",
> +   "syspll2_d4"
> +};
> +
> +static const char * const msdc5hclk_parents[] = {
> +   "clk26m",
> +   "syspll1_d2",
> +   "univpll1_d4",
> +   "syspll2_d2"
> +};
> +
> +static const char * const msdc50_0_parents[] = {
> +   "clk26m",
> +   "msdcpll_ck",
> +   "syspll2_d2",
> +   "syspll4_d2",
> +   "univpll1_d2",
> +   "syspll1_d2",
> +   "univpll_d5",
> +   "univpll1_d4"
> +};
> +
> +static const char * const msdc30_1_parents[] = {
> +   "clk26m",
> +   "msdcpll_d2",
> +   "univpll2_d2",
> +   "syspll2_d2",
> +   "syspll1_d4",
> +   "univpll1_d4",
> +   "usb20_192m_d4",
> +   "syspll2_d4"
> +};
> +
> +static const char * const audio_parents[] = {
> +   "clk26m",
> +

Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()

2019-08-14 Thread Sean Christopherson

On Wed, Aug 14, 2019 at 11:34:52AM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson  writes:
>
> > x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
> > fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
> > the exit_reason.
> 
> Oh, nice catch, will take a look!

Don't worry about addressing this.  Paolo has already queued the series,
and I've got a patch set waiting that purges emulation_result entirely
that I'll post once your series hits kvm/queue.

[PATCH v4 3/3] x86/kasan: support KASAN_VMALLOC

2019-08-14 Thread Daniel Axtens

In the case where KASAN directly allocates memory to back vmalloc
space, don't map the early shadow page over it.

We prepopulate pgds/p4ds for the range that would otherwise be empty.
This is required to get it synced to hardware on boot, allowing the
lower levels of the page tables to be filled dynamically.

Acked-by: Dmitry Vyukov 
Signed-off-by: Daniel Axtens 

---

v2: move from faulting in shadow pgds to prepopulating
---
 arch/x86/Kconfig|  1 +
 arch/x86/mm/kasan_init_64.c | 61 +
 2 files changed, 62 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 222855cc0158..40562cc3771f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -134,6 +134,7 @@ config X86
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_JUMP_LABEL_RELATIVE
select HAVE_ARCH_KASAN  if X86_64
+   select HAVE_ARCH_KASAN_VMALLOC  if X86_64
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS  if MMU
select HAVE_ARCH_MMAP_RND_COMPAT_BITS   if MMU && COMPAT
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 296da58f3013..2f57c4ddff61 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -245,6 +245,52 @@ static void __init kasan_map_early_shadow(pgd_t *pgd)
} while (pgd++, addr = next, addr != end);
 }
 
+static void __init kasan_shallow_populate_p4ds(pgd_t *pgd,
+   unsigned long addr,
+   unsigned long end,
+   int nid)
+{
+   p4d_t *p4d;
+   unsigned long next;
+   void *p;
+
+   p4d = p4d_offset(pgd, addr);
+   do {
+   next = p4d_addr_end(addr, end);
+
+   if (p4d_none(*p4d)) {
+   p = early_alloc(PAGE_SIZE, nid, true);
+   p4d_populate(&init_mm, p4d, p);
+   }
+   } while (p4d++, addr = next, addr != end);
+}
+
+static void __init kasan_shallow_populate_pgds(void *start, void *end)
+{
+   unsigned long addr, next;
+   pgd_t *pgd;
+   void *p;
+   int nid = early_pfn_to_nid((unsigned long)start);
+
+   addr = (unsigned long)start;
+   pgd = pgd_offset_k(addr);
+   do {
+   next = pgd_addr_end(addr, (unsigned long)end);
+
+   if (pgd_none(*pgd)) {
+   p = early_alloc(PAGE_SIZE, nid, true);
+   pgd_populate(&init_mm, pgd, p);
+   }
+
+   /*
+* we need to populate p4ds to be synced when running in
+* four level mode - see sync_global_pgds_l4()
+*/
+   kasan_shallow_populate_p4ds(pgd, addr, next, nid);
+   } while (pgd++, addr = next, addr != (unsigned long)end);
+}
+
+
 #ifdef CONFIG_KASAN_INLINE
 static int kasan_die_handler(struct notifier_block *self,
 unsigned long val,
@@ -352,9 +398,24 @@ void __init kasan_init(void)
shadow_cpu_entry_end = (void *)round_up(
(unsigned long)shadow_cpu_entry_end, PAGE_SIZE);
 
+   /*
+* If we're in full vmalloc mode, don't back vmalloc space with early
+* shadow pages. Instead, prepopulate pgds/p4ds so they are synced to
+* the global table and we can populate the lower levels on demand.
+*/
+#ifdef CONFIG_KASAN_VMALLOC
+   kasan_shallow_populate_pgds(
+   kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
+   kasan_mem_to_shadow((void *)VMALLOC_END));
+
+   kasan_populate_early_shadow(
+   kasan_mem_to_shadow((void *)VMALLOC_END + 1),
+   shadow_cpu_entry_begin);
+#else
kasan_populate_early_shadow(
kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
shadow_cpu_entry_begin);
+#endif
 
kasan_populate_shadow((unsigned long)shadow_cpu_entry_begin,
  (unsigned long)shadow_cpu_entry_end, 0);
-- 
2.20.1
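
The "do { ... } while (pgd++, addr = next, addr != end)" walk used by the new
helpers can be tried out in userspace. The sketch below assumes a
hypothetical 1 MiB span per top-level entry (SLOT_SIZE) purely for
illustration; slot_addr_end() mirrors the shape of a pgd_addr_end()-style
helper but is not the kernel macro.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_SIZE (1ULL << 20)          /* assumed span of one top-level entry */
#define SLOT_MASK (~(SLOT_SIZE - 1))

/* End of the slot containing addr, clamped to end. */
static uint64_t slot_addr_end(uint64_t addr, uint64_t end)
{
    uint64_t boundary = (addr + SLOT_SIZE) & SLOT_MASK;

    /* Comparing "- 1" values keeps this correct if boundary wraps to 0. */
    return (boundary - 1 < end - 1) ? boundary : end;
}

/* Count how many slots a walk over [start, end) visits. */
static unsigned int count_slots(uint64_t start, uint64_t end)
{
    uint64_t addr = start, next;
    unsigned int visited = 0;

    assert(start < end);
    do {
        next = slot_addr_end(addr, end);
        visited++;              /* a real walker would populate the entry here */
    } while (addr = next, addr != end);

    return visited;
}
```

Each iteration handles at most one entry, and the final iteration is clamped
to the end of the range rather than to a slot boundary.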

[PATCH v4 1/3] kasan: support backing vmalloc space with real shadow memory

2019-08-14 Thread Daniel Axtens

Hook into vmalloc and vmap, and dynamically allocate real shadow
memory to back the mappings.

Most mappings in vmalloc space are small, requiring less than a full
page of shadow space. Allocating a full shadow page per mapping would
therefore be wasteful. Furthermore, to ensure that different mappings
use different shadow pages, mappings would have to be aligned to
KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE.

Instead, share backing space across multiple mappings. Allocate
a backing page the first time a mapping in vmalloc space uses a
particular page of the shadow region. Keep this page around
regardless of whether the mapping is later freed - in the meantime
the page could have become shared by another vmalloc mapping.
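
The sharing argument can be checked with a little arithmetic. The sketch
below assumes a shadow scale shift of 3 (one shadow byte per 8 bytes of
memory) and 4 KiB pages, and leaves the shadow offset out; the function
names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3            /* assumed: 1 shadow byte per 8 bytes */
#define SKETCH_PAGE_SIZE   4096ULL      /* assumed page size */

/* Index of the shadow byte covering addr (shadow offset left out). */
static uint64_t shadow_byte_index(uint64_t addr)
{
    return addr >> SHADOW_SCALE_SHIFT;
}

/* Index of the shadow page holding that shadow byte. */
static uint64_t shadow_page_index(uint64_t addr)
{
    return shadow_byte_index(addr) / SKETCH_PAGE_SIZE;
}

/* Number of shadow bytes needed for a mapping of the given size. */
static uint64_t shadow_bytes_for(uint64_t size)
{
    assert(size > 0);
    return (size + (1ULL << SHADOW_SCALE_SHIFT) - 1) >> SHADOW_SCALE_SHIFT;
}
```

Under these assumptions a one-page mapping needs only 512 shadow bytes, and
one shadow page covers 32 KiB of address space, so neighbouring small
mappings land on the same shadow page.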

This can in theory lead to unbounded memory growth, but the vmalloc
allocator is pretty good at reusing addresses, so the practical memory
usage grows at first but then stays fairly stable.

This requires architecture support to actually use: arches must stop
mapping the read-only zero page over the portion of the shadow region
that covers the vmalloc space and instead leave it unmapped.

This allows KASAN with VMAP_STACK, and will be needed for architectures
that do not have a separate module space (e.g. powerpc64, which I am
currently working on). It also allows relaxing the module alignment
back to PAGE_SIZE.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202009
Acked-by: Vasily Gorbik 
Signed-off-by: Daniel Axtens 
[Mark: rework shadow allocation]
Signed-off-by: Mark Rutland 

--

v2: let kasan_unpoison_shadow deal with ranges that do not use a
full shadow byte.

v3: relax module alignment
rename to kasan_populate_vmalloc which is a much better name
deal with concurrency correctly

v4: Integrate Mark's rework
Poison pages on vfree
Handle allocation failures. I've tested this by inserting artificial
 failures and using test_vmalloc to stress it. I haven't handled the
 per-cpu case: it looked like it would require a messy hacking-up of
 the function to deal with an OOM failure case in a debug feature.

---
 Documentation/dev-tools/kasan.rst | 60 +++
 include/linux/kasan.h | 24 +++
 include/linux/moduleloader.h  |  2 +-
 include/linux/vmalloc.h   | 12 ++
 lib/Kconfig.kasan | 16 
 lib/test_kasan.c  | 26 
 mm/kasan/common.c | 67 +++
 mm/kasan/generic_report.c |  3 ++
 mm/kasan/kasan.h  |  1 +
 mm/vmalloc.c  | 28 -
 10 files changed, 237 insertions(+), 2 deletions(-)

diff --git a/Documentation/dev-tools/kasan.rst 
b/Documentation/dev-tools/kasan.rst
index b72d07d70239..35fda484a672 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -215,3 +215,63 @@ brk handler is used to print bug reports.
 A potential expansion of this mode is a hardware tag-based mode, which would
 use hardware memory tagging support instead of compiler instrumentation and
 manual shadow memory manipulation.
+
+What memory accesses are sanitised by KASAN?
+--------------------------------------------
+
+The kernel maps memory in a number of different parts of the address
+space. This poses something of a problem for KASAN, which requires
+that all addresses accessed by instrumented code have a valid shadow
+region.
+
+The range of kernel virtual addresses is large: there is not enough
+real memory to support a real shadow region for every address that
+could be accessed by the kernel.
+
+By default
+~~~~~~~~~~
+
+By default, architectures only map real memory over the shadow region
+for the linear mapping (and potentially other small areas). For all
+other areas - such as vmalloc and vmemmap space - a single read-only
+page is mapped over the shadow area. This read-only shadow page
+declares all memory accesses as permitted.
+
+This presents a problem for modules: they do not live in the linear
+mapping, but in a dedicated module space. By hooking in to the module
+allocator, KASAN can temporarily map real shadow memory to cover
+them. This allows detection of invalid accesses to module globals, for
+example.
+
+This also creates an incompatibility with ``VMAP_STACK``: if the stack
+lives in vmalloc space, it will be shadowed by the read-only page, and
+the kernel will fault when trying to set up the shadow data for stack
+variables.
+
+CONFIG_KASAN_VMALLOC
+
+
+With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the
+cost of greater memory usage. Currently this is only supported on x86.
+
+This works by hooking into vmalloc and vmap, and dynamically
+allocating real shadow memory to back the mappings.
+
+Most mappings in vmalloc space are small, requiring less than a full
+page of shadow space. Allocating a full shadow page per mapping would
+therefore be wasteful. Furthermore, to ensure that different mappings

[PATCH v4 0/3] kasan: support backing vmalloc space with real shadow memory

2019-08-14 Thread Daniel Axtens

Currently, vmalloc space is backed by the early shadow page. This
means that kasan is incompatible with VMAP_STACK, and it also provides
a hurdle for architectures that do not have a dedicated module space
(like powerpc64).

This series provides a mechanism to back vmalloc space with real,
dynamically allocated memory. I have only wired up x86, because that's
the only currently supported arch I can work with easily, but it's
very easy to wire up other architectures.

This has been discussed before in the context of VMAP_STACK:
 - https://bugzilla.kernel.org/show_bug.cgi?id=202009
 - https://lkml.org/lkml/2018/7/22/198
 - https://lkml.org/lkml/2019/7/19/822

In terms of implementation details:

Most mappings in vmalloc space are small, requiring less than a full
page of shadow space. Allocating a full shadow page per mapping would
therefore be wasteful. Furthermore, to ensure that different mappings
use different shadow pages, mappings would have to be aligned to
KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE.

Instead, share backing space across multiple mappings. Allocate
a backing page the first time a mapping in vmalloc space uses a
particular page of the shadow region. Keep this page around
regardless of whether the mapping is later freed - in the meantime
the page could have become shared by another vmalloc mapping.

This can in theory lead to unbounded memory growth, but the vmalloc
allocator is pretty good at reusing addresses, so the practical memory
usage appears to grow at first but then stay fairly stable.
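
The allocate-on-first-use, never-free policy can be illustrated with a toy
model. Everything here is hypothetical (a 64-page shadow region, a scale
shift of 3, 4 KiB pages, and the populate_sketch()/free_sketch() names); it
only shows why the number of backed shadow pages levels off when addresses
are reused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCALE_SHIFT     3               /* assumed shadow scale */
#define PAGE_BYTES      4096ULL         /* assumed page size */
#define NR_SHADOW_PAGES 64              /* size of the toy shadow region */

static bool shadow_backed[NR_SHADOW_PAGES];
static unsigned int shadow_pages_allocated;

/* Back every shadow page touched by the mapping [addr, addr + size). */
static void populate_sketch(uint64_t addr, uint64_t size)
{
    uint64_t first = (addr >> SCALE_SHIFT) / PAGE_BYTES;
    uint64_t last = ((addr + size - 1) >> SCALE_SHIFT) / PAGE_BYTES;
    uint64_t i;

    assert(size > 0 && last < NR_SHADOW_PAGES);
    for (i = first; i <= last; i++) {
        if (!shadow_backed[i]) {        /* the first user allocates */
            shadow_backed[i] = true;
            shadow_pages_allocated++;
        }
    }
}

/*
 * Freeing a mapping deliberately leaves the shadow page in place:
 * another mapping may be sharing it by now.
 */
static void free_sketch(uint64_t addr, uint64_t size)
{
    (void)addr;
    (void)size;
}
```

Re-populating an address range that was used before costs nothing in this
model, which is the behaviour the paragraph above depends on.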

If we run into practical memory exhaustion issues, I'm happy to
consider hooking into the book-keeping that vmap does, but I am not
convinced that it will be an issue.

v1: https://lore.kernel.org/linux-mm/20190725055503.19507-1-...@axtens.net/
v2: https://lore.kernel.org/linux-mm/20190729142108.23343-1-...@axtens.net/
 Address review comments:
 - Patch 1: use kasan_unpoison_shadow's built-in handling of
ranges that do not align to a full shadow byte
 - Patch 3: prepopulate pgds rather than faulting things in
v3: https://lore.kernel.org/linux-mm/20190731071550.31814-1-...@axtens.net/
 Address comments from Mark Rutland:
 - kasan_populate_vmalloc is a better name
 - handle concurrency correctly
 - various nits and cleanups
 - relax module alignment in KASAN_VMALLOC case
v4: Changes to patch 1 only:
 - Integrate Mark's rework, thanks Mark!
 - handle the case where kasan_populate_shadow might fail
 - poison shadow on free, allowing the alloc path to just
 unpoison memory that it uses

Daniel Axtens (3):
  kasan: support backing vmalloc space with real shadow memory
  fork: support VMAP_STACK with KASAN_VMALLOC
  x86/kasan: support KASAN_VMALLOC

 Documentation/dev-tools/kasan.rst | 60 +++
 arch/Kconfig  |  9 +++--
 arch/x86/Kconfig  |  1 +
 arch/x86/mm/kasan_init_64.c   | 61 
 include/linux/kasan.h | 24 +++
 include/linux/moduleloader.h  |  2 +-
 include/linux/vmalloc.h   | 12 ++
 kernel/fork.c |  4 ++
 lib/Kconfig.kasan | 16 
 lib/test_kasan.c  | 26 
 mm/kasan/common.c | 67 +++
 mm/kasan/generic_report.c |  3 ++
 mm/kasan/kasan.h  |  1 +
 mm/vmalloc.c  | 28 -
 14 files changed, 308 insertions(+), 6 deletions(-)

-- 
2.20.1

[PATCH v4 2/3] fork: support VMAP_STACK with KASAN_VMALLOC

2019-08-14 Thread Daniel Axtens

Supporting VMAP_STACK with KASAN_VMALLOC is straightforward:

 - clear the shadow region of vmapped stacks when swapping them in
 - tweak Kconfig to allow VMAP_STACK to be turned on with KASAN

Reviewed-by: Dmitry Vyukov 
Signed-off-by: Daniel Axtens 
---
 arch/Kconfig  | 9 +
 kernel/fork.c | 4 
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index a7b57dd42c26..e791196005e1 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -825,16 +825,17 @@ config HAVE_ARCH_VMAP_STACK
 config VMAP_STACK
default y
bool "Use a virtually-mapped stack"
-   depends on HAVE_ARCH_VMAP_STACK && !KASAN
+   depends on HAVE_ARCH_VMAP_STACK
+   depends on !KASAN || KASAN_VMALLOC
---help---
  Enable this if you want the use virtually-mapped kernel stacks
  with guard pages.  This causes kernel stack overflows to be
  caught immediately rather than causing difficult-to-diagnose
  corruption.
 
- This is presently incompatible with KASAN because KASAN expects
- the stack to map directly to the KASAN shadow map using a formula
- that is incorrect if the stack is in vmalloc space.
+ To use this with KASAN, the architecture must support backing
+ virtual mappings with real shadow memory, and KASAN_VMALLOC must
+ be enabled.
 
 config ARCH_OPTIONAL_KERNEL_RWX
def_bool n
diff --git a/kernel/fork.c b/kernel/fork.c
index d8ae0f1b4148..ce3150fe8ff2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -94,6 +94,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -215,6 +216,9 @@ static unsigned long *alloc_thread_stack_node(struct 
task_struct *tsk, int node)
if (!s)
continue;
 
+   /* Clear the KASAN shadow of the stack. */
+   kasan_unpoison_shadow(s->addr, THREAD_SIZE);
+
/* Clear stale pointers from reused stack. */
memset(s->addr, 0, THREAD_SIZE);
 
-- 
2.20.1
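
The new dependency can be read as a small truth function. This is a sketch
only: vmap_stack_allowed() is a made-up name, and the precondition that
KASAN_VMALLOC is never set without KASAN is an assumption about how the
option is declared.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Models:
 *   depends on HAVE_ARCH_VMAP_STACK
 *   depends on !KASAN || KASAN_VMALLOC
 */
static bool vmap_stack_allowed(bool have_arch_vmap_stack, bool kasan,
                               bool kasan_vmalloc)
{
    /* Assumed: KASAN_VMALLOC cannot be enabled without KASAN. */
    assert(kasan || !kasan_vmalloc);
    return have_arch_vmap_stack && (!kasan || kasan_vmalloc);
}
```

The only combination that changes relative to the old "!KASAN" rule is KASAN
together with KASAN_VMALLOC, which is now permitted.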

Re: [PATCH v9 6/7] powerpc/mce: Handle UE event for memcpy_mcsafe

2019-08-14 Thread Santosh Sivaraj

Hi Balbir,

Balbir Singh  writes:

> On 12/8/19 7:22 pm, Santosh Sivaraj wrote:
>> If we take a UE on one of the instructions with a fixup entry, set nip
>> to continue execution at the fixup entry. Stop processing the event
>> further or print it.
>> 
>> Co-developed-by: Reza Arbab 
>> Signed-off-by: Reza Arbab 
>> Cc: Mahesh Salgaonkar 
>> Signed-off-by: Santosh Sivaraj 
>> ---
>
> Isn't this based on https://patchwork.ozlabs.org/patch/895294/? If so it
> should still have my author tag and signed-off-by

Originally, when I received the series for posting, it had Reza's authorship and
Signed-off-by. Since the patch changed significantly, I added Reza as
Co-developed-by. I will update this in the next spin.

https://lore.kernel.org/linuxppc-dev/20190702051932.511-1-sant...@fossix.org/

Santosh
>
> Balbir Singh
>
>>  arch/powerpc/include/asm/mce.h  |  4 +++-
>>  arch/powerpc/kernel/mce.c   | 16 
>>  arch/powerpc/kernel/mce_power.c | 15 +--
>>  3 files changed, 32 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
>> index f3a6036b6bc0..e1931c8c2743 100644
>> --- a/arch/powerpc/include/asm/mce.h
>> +++ b/arch/powerpc/include/asm/mce.h
>> @@ -122,7 +122,8 @@ struct machine_check_event {
>>  enum MCE_UeErrorType ue_error_type:8;
>>  u8  effective_address_provided;
>>  u8  physical_address_provided;
>> -u8  reserved_1[5];
>> +u8  ignore_event;
>> +u8  reserved_1[4];
>>  u64 effective_address;
>>  u64 physical_address;
>>  u8  reserved_2[8];
>> @@ -193,6 +194,7 @@ struct mce_error_info {
>>  enum MCE_Initiator  initiator:8;
>>  enum MCE_ErrorClass error_class:8;
>>  boolsync_error;
>> +boolignore_event;
>>  };
>>  
>>  #define MAX_MC_EVT  100
>> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
>> index a3b122a685a5..ec4b3e1087be 100644
>> --- a/arch/powerpc/kernel/mce.c
>> +++ b/arch/powerpc/kernel/mce.c
>> @@ -149,6 +149,7 @@ void save_mce_event(struct pt_regs *regs, long handled,
>>  if (phys_addr != ULONG_MAX) {
>>  mce->u.ue_error.physical_address_provided = true;
>>  mce->u.ue_error.physical_address = phys_addr;
>> +mce->u.ue_error.ignore_event = mce_err->ignore_event;
>>  machine_check_ue_event(mce);
>>  }
>>  }
>> @@ -266,8 +267,17 @@ static void machine_process_ue_event(struct work_struct 
>> *work)
>>  /*
>>   * This should probably queued elsewhere, but
>>   * oh! well
>> + *
>> + * Don't report this machine check because the caller has
>> + * asked us to ignore the event, it has a fixup handler which
>> + * will do the appropriate error handling and reporting.
>>   */
>>  if (evt->error_type == MCE_ERROR_TYPE_UE) {
>> +if (evt->u.ue_error.ignore_event) {
>> +__this_cpu_dec(mce_ue_count);
>> +continue;
>> +}
>> +
>>  if (evt->u.ue_error.physical_address_provided) {
>>  unsigned long pfn;
>>  
>> @@ -301,6 +311,12 @@ static void machine_check_process_queued_event(struct 
>> irq_work *work)
>>  while (__this_cpu_read(mce_queue_count) > 0) {
>>  index = __this_cpu_read(mce_queue_count) - 1;
>>  evt = this_cpu_ptr(&mce_event_queue[index]);
>> +
>> +if (evt->error_type == MCE_ERROR_TYPE_UE &&
>> +evt->u.ue_error.ignore_event) {
>> +__this_cpu_dec(mce_queue_count);
>> +continue;
>> +}
>>  machine_check_print_event_info(evt, false, false);
>>  __this_cpu_dec(mce_queue_count);
>>  }
>> diff --git a/arch/powerpc/kernel/mce_power.c 
>> b/arch/powerpc/kernel/mce_power.c
>> index e74816f045f8..1dd87f6f5186 100644
>> --- a/arch/powerpc/kernel/mce_power.c
>> +++ b/arch/powerpc/kernel/mce_power.c
>> @@ -11,6 +11,7 @@
>>  
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -18,6 +19,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  /*
>>   * Convert an address related to an mm to a physical address.
>> @@ -559,9 +561,18 @@ static int mce_handle_derror(struct pt_regs *regs,
>>  return 0;
>>  }
>>  
>> -static long mce_handle_ue_error(struct pt_regs *regs)
>> +static long mce_handle_ue_error(struct pt_regs *regs,
>> +struct mce_error_info *mce_err)
>>  {
>>  long handled = 0;
>> +const struct

[PATCH] ipvlan: set hw_enc_features like macvlan

2019-08-14 Thread Bill Sommerfeld

Allow encapsulated packets sent to tunnels layered over ipvlan to use
offloads rather than forcing SW fallbacks.

Since commit f21e5077010acda73a60 ("macvlan: add offload features for
encapsulation"), macvlan has set dev->hw_enc_features to include
everything in dev->features; do likewise in ipvlan.

Signed-off-by: Bill Sommerfeld 
---
 drivers/net/ipvlan/ipvlan_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 1c96bed5a7c4..887bbba4631e 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -126,6 +126,7 @@ static int ipvlan_init(struct net_device *dev)
 (phy_dev->state & IPVLAN_STATE_MASK);
dev->features = phy_dev->features & IPVLAN_FEATURES;
dev->features |= NETIF_F_LLTX | NETIF_F_VLAN_CHALLENGED;
+   dev->hw_enc_features |= dev->features;
dev->gso_max_size = phy_dev->gso_max_size;
dev->gso_max_segs = phy_dev->gso_max_segs;
dev->hard_header_len = phy_dev->hard_header_len;
-- 
2.23.0.rc1.153.gdeed80330f-goog
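
The effect of the one-line change can be seen in a small bitmask model. The
flag values, CHILD_FEATURE_MASK and struct dev_sketch below are hypothetical
and do not match the real NETIF_F_* constants or struct net_device; only the
order of operations follows the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values; the real NETIF_F_* constants differ. */
#define F_SG      (1u << 0)
#define F_CSUM    (1u << 1)
#define F_TSO     (1u << 2)
#define F_RXHASH  (1u << 3)     /* stands in for a feature the child masks out */
#define F_LLTX    (1u << 4)

#define CHILD_FEATURE_MASK (F_SG | F_CSUM | F_TSO)

struct dev_sketch {
    uint32_t features;
    uint32_t hw_enc_features;
};

/* Same order as the init path: inherit, add local flags, copy to enc. */
static void child_init_sketch(struct dev_sketch *child,
                              const struct dev_sketch *parent)
{
    assert(child && parent);
    child->features = parent->features & CHILD_FEATURE_MASK;
    child->features |= F_LLTX;
    child->hw_enc_features |= child->features;
}
```

Without the last assignment, hw_enc_features stays empty in this model, which
corresponds to encapsulated traffic falling back to software.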

Re: [PATCH v9 7/7] powerpc: add machine check safe copy_to_user

2019-08-14 Thread Santosh Sivaraj

Hi Balbir,

Balbir Singh  writes:

> On 12/8/19 7:22 pm, Santosh Sivaraj wrote:
>> Use  memcpy_mcsafe() implementation to define copy_to_user_mcsafe()
>> 
>> Signed-off-by: Santosh Sivaraj 
>> ---
>>  arch/powerpc/Kconfig   |  1 +
>>  arch/powerpc/include/asm/uaccess.h | 14 ++
>>  2 files changed, 15 insertions(+)
>> 
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 77f6ebf97113..4316e36095a2 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -137,6 +137,7 @@ config PPC
>>  select ARCH_HAS_STRICT_KERNEL_RWX   if ((PPC_BOOK3S_64 || PPC32) && 
>> !RELOCATABLE && !HIBERNATION)
>>  select ARCH_HAS_TICK_BROADCAST  if GENERIC_CLOCKEVENTS_BROADCAST
>>  select ARCH_HAS_UACCESS_FLUSHCACHE  if PPC64
>> +select ARCH_HAS_UACCESS_MCSAFE  if PPC64
>>  select ARCH_HAS_UBSAN_SANITIZE_ALL
>>  select ARCH_HAVE_NMI_SAFE_CMPXCHG
>>  select ARCH_KEEP_MEMBLOCK
>> diff --git a/arch/powerpc/include/asm/uaccess.h 
>> b/arch/powerpc/include/asm/uaccess.h
>> index 8b03eb44e876..15002b51ff18 100644
>> --- a/arch/powerpc/include/asm/uaccess.h
>> +++ b/arch/powerpc/include/asm/uaccess.h
>> @@ -387,6 +387,20 @@ static inline unsigned long raw_copy_to_user(void 
>> __user *to,
>>  return ret;
>>  }
>>  
>> +static __always_inline unsigned long __must_check
>> +copy_to_user_mcsafe(void __user *to, const void *from, unsigned long n)
>> +{
>> +if (likely(check_copy_size(from, n, true))) {
>> +if (access_ok(to, n)) {
>> +allow_write_to_user(to, n);
>> +n = memcpy_mcsafe((void *)to, from, n);
>> +prevent_write_to_user(to, n);
>> +}
>> +}
>> +
>> +return n;
>
> Do we always return n independent of the check_copy_size return value and
> access_ok return values?

Yes, we always return the number of bytes not copied, even if
check_copy_size or access_ok fails.

Santosh

>
> Balbir Singh.
>
>> +}
>> +
>>  extern unsigned long __clear_user(void __user *addr, unsigned long size);
>>  
>>  static inline unsigned long clear_user(void __user *addr, unsigned long 
>> size)
>>
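For readers following the return-value question above, here is a minimal user-space sketch of the same control flow. It is only a model: the model_* helpers are hypothetical stand-ins for check_copy_size(), access_ok() and memcpy_mcsafe(), and the limits they enforce are invented for illustration. It assumes, as stated in the reply, that the copy routine returns the number of bytes not copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for check_copy_size(): rejects NULL or oversized copies. */
static bool model_check_copy_size(const void *from, size_t n)
{
	return from != NULL && n <= 4096;
}

/* Hypothetical stand-in for access_ok(): only checks for a non-NULL target. */
static bool model_access_ok(const void *to, size_t n)
{
	(void)n;
	return to != NULL;
}

/* Stand-in for memcpy_mcsafe(): returns the number of bytes NOT copied. */
static size_t model_memcpy_mcsafe(void *to, const void *from, size_t n)
{
	memcpy(to, from, n);
	return 0;
}

/*
 * Same shape as copy_to_user_mcsafe() in the quoted patch: n is only
 * overwritten when both checks pass, so a failed check returns the full
 * requested length, meaning "nothing was copied".
 */
static size_t model_copy_to_user_mcsafe(void *to, const void *from, size_t n)
{
	if (model_check_copy_size(from, n)) {
		if (model_access_ok(to, n))
			n = model_memcpy_mcsafe(to, from, n);
	}

	return n;
}
```

A caller therefore treats any non-zero return as a short copy, whether it came from a failed check or from the copy itself.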

[PATCH bpf-next] tools: libbpf: update extended attributes version of bpf_object__open()

2019-08-14 Thread Anton Protopopov

Update the bpf_object_open_attr structure and corresponding code so that the
bpf_object__open_xattr function can be used to open objects from buffers as
well as from files.  The reason for this change is that the existing
bpf_object__open_buffer function provides no way to specify either the
needs_kver or the flags parameter of the internal call to __bpf_object__open,
which makes it inconvenient for loading BPF objects which don't require a
kernel version.

Two new fields, obj_buf and obj_buf_sz, were added to the structure, and the
file field was union'ed with obj_name so that one can open an object like this:

struct bpf_object_open_attr attr = {
.obj_name   = name,
.obj_buf= obj_buf,
.obj_buf_sz = obj_buf_sz,
.prog_type  = BPF_PROG_TYPE_UNSPEC,
};
return bpf_object__open_xattr(&attr);

while still being able to use the file semantics:

struct bpf_object_open_attr attr = {
.file   = path,
.prog_type  = BPF_PROG_TYPE_UNSPEC,
};
return bpf_object__open_xattr(&attr);

Another thing to note is that since commit c034a177d3c8 ("bpf: bpftool, add
flag to allow non-compat map definitions"), which introduced the
MAPS_RELAX_COMPAT flag to load objects with non-compat map definitions,
bpf_object__open_buffer has been called with this flag enabled (it was passed
as the boolean true value in the flags argument to __bpf_object__open).  The
default behaviour for all open functions is to clear this flag, and this patch
changes bpf_object__open_buffer to clear it as well.  It can be enabled, if
needed, by opening an object from a buffer using __bpf_object__open_xattr.
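The buffer-versus-file selection described above can be sketched as a small stand-alone model. This is not libbpf code: struct model_open_attr, enum model_source and model_classify() are hypothetical names that only mirror the validation order in the patch below (buffer first, then file), with prog_type left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the extended attribute structure. */
struct model_open_attr {
	union {
		const char *file;     /* path, when opening from a file */
		const char *obj_name; /* label, when opening from a buffer */
	};
	void *obj_buf;
	size_t obj_buf_sz;
};

enum model_source { MODEL_INVALID, MODEL_FROM_BUFFER, MODEL_FROM_FILE };

/*
 * Decide where the object comes from. The caller supplies the scratch
 * buffer for a generated name, so the pointer stored in attr stays valid
 * after this function returns.
 */
static enum model_source model_classify(struct model_open_attr *attr,
					char *tmp_name, size_t tmp_sz)
{
	if (!attr)
		return MODEL_INVALID;

	if (attr->obj_buf) {
		if (attr->obj_buf_sz == 0)
			return MODEL_INVALID;
		if (!attr->obj_name) {
			/* No name given: derive one from address and size. */
			snprintf(tmp_name, tmp_sz, "%lx-%lx",
				 (unsigned long)(uintptr_t)attr->obj_buf,
				 (unsigned long)attr->obj_buf_sz);
			attr->obj_name = tmp_name;
		}
		return MODEL_FROM_BUFFER;
	}

	if (!attr->file)
		return MODEL_INVALID;

	attr->obj_buf_sz = 0; /* size is meaningless without a buffer */
	return MODEL_FROM_FILE;
}
```

Because file and obj_name share storage, a non-NULL obj_buf is the only thing that distinguishes the two modes; the same string is a path in one case and a label in the other.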

Signed-off-by: Anton Protopopov 
---
 tools/lib/bpf/libbpf.c | 45 ++
 tools/lib/bpf/libbpf.h |  7 ++-
 2 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2233f919dd88..7c8054afd901 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -3630,13 +3630,31 @@ __bpf_object__open(const char *path, void *obj_buf, 
size_t obj_buf_sz,
 struct bpf_object *__bpf_object__open_xattr(struct bpf_object_open_attr *attr,
int flags)
 {
+   char tmp_name[64];
+
/* param validation */
-   if (!attr->file)
+   if (!attr)
return NULL;
 
-   pr_debug("loading %s\n", attr->file);
+   if (attr->obj_buf) {
+   if (attr->obj_buf_sz <= 0)
+   return NULL;
+   if (!attr->file) {
+   snprintf(tmp_name, sizeof(tmp_name), "%lx-%lx",
+(unsigned long)attr->obj_buf,
+(unsigned long)attr->obj_buf_sz);
+   attr->obj_name = tmp_name;
+   }
+   pr_debug("loading object '%s' from buffer\n", attr->obj_name);
+   } else if (!attr->file) {
+   return NULL;
+   } else {
+   attr->obj_buf_sz = 0;
 
-   return __bpf_object__open(attr->file, NULL, 0,
+   pr_debug("loading object file '%s'\n", attr->file);
+   }
+
+   return __bpf_object__open(attr->file, attr->obj_buf, attr->obj_buf_sz,
  bpf_prog_type__needs_kver(attr->prog_type),
  flags);
 }
@@ -3660,21 +3678,14 @@ struct bpf_object *bpf_object__open_buffer(void 
*obj_buf,
   size_t obj_buf_sz,
   const char *name)
 {
-   char tmp_name[64];
-
-   /* param validation */
-   if (!obj_buf || obj_buf_sz <= 0)
-   return NULL;
-
-   if (!name) {
-   snprintf(tmp_name, sizeof(tmp_name), "%lx-%lx",
-(unsigned long)obj_buf,
-(unsigned long)obj_buf_sz);
-   name = tmp_name;
-   }
-   pr_debug("loading object '%s' from buffer\n", name);
+   struct bpf_object_open_attr attr = {
+   .obj_name   = name,
+   .obj_buf= obj_buf,
+   .obj_buf_sz = obj_buf_sz,
+   .prog_type  = BPF_PROG_TYPE_UNSPEC,
+   };
 
-   return __bpf_object__open(name, obj_buf, obj_buf_sz, true, true);
+   return bpf_object__open_xattr(&attr);
 }
 
 int bpf_object__unload(struct bpf_object *obj)
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index e8f70977d137..634f278578dd 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -63,8 +63,13 @@ LIBBPF_API libbpf_print_fn_t 
libbpf_set_print(libbpf_print_fn_t fn);
 struct bpf_object;
 
 struct bpf_object_open_attr {
-   const char *file;
+   union {
+   const char *file;
+   const char *obj_name;
+   };
enum bpf_prog_type prog_type;
+   void *obj_buf;
+   size_t obj_buf_sz;
 };
 
 LIBBPF_API struct bpf_object *bpf_object__open(const char

Re: [PATCH v5 1/4] clk: core: link consumer with clock driver

2019-08-14 Thread Stephen Boyd

Quoting Miquel Raynal (2019-05-21 05:51:10)
> One major concern when, for instance, suspending/resuming a platform
> is to never access registers before the underlying clock has been
> resumed, otherwise most of the time the kernel will just crash. One
> solution is to use syscore operations when registering clock drivers
> suspend/resume callbacks. One problem of using syscore_ops is that the
> suspend/resume scheduling will depend on the order of the
> registrations, which brings (unacceptable) randomness in the process.
> 
> A feature called device links has been introduced to handle such
> situation. It creates dependencies between consumers and providers,
> enforcing e.g. the suspend/resume order when needed. Such feature is
> already in use for regulators.
> 
> Add device links support in the clock subsystem by creating/deleting
> the links at get/put time.
> 
> Example of a boot (ESPRESSObin, A3700 SoC) with devices linked to clocks:
> 
> marvell-armada-3700-tbg-clock d0013200.tbg: Linked as a consumer to 
> d0013800.pinctrl:xtal-clk
> marvell-armada-3700-tbg-clock d0013200.tbg: Dropping the link to 
> d0013800.pinctrl:xtal-clk
> marvell-armada-3700-tbg-clock d0013200.tbg: Linked as a consumer to 
> d0013800.pinctrl:xtal-clk
> marvell-armada-3700-periph-clock d0013000.nb-periph-clk: Linked as a consumer 
> to d0013200.tbg
> marvell-armada-3700-periph-clock d0013000.nb-periph-clk: Linked as a consumer 
> to d0013800.pinctrl:xtal-clk
> marvell-armada-3700-periph-clock d0018000.sb-periph-clk: Linked as a consumer 
> to d0013200.tbg
> mvneta d003.ethernet: Linked as a consumer to d0018000.sb-periph-clk
> xhci-hcd d0058000.usb: Linked as a consumer to d0018000.sb-periph-clk
> xenon-sdhci d00d.sdhci: Linked as a consumer to d0013000.nb-periph-clk
> xenon-sdhci d00d.sdhci: Dropping the link to d0013000.nb-periph-clk
> mvebu-uart d0012000.serial: Linked as a consumer to d0013800.pinctrl:xtal-clk
> advk-pcie d007.pcie: Linked as a consumer to d0018000.sb-periph-clk
> xenon-sdhci d00d.sdhci: Linked as a consumer to d0013000.nb-periph-clk
> xenon-sdhci d00d.sdhci: Linked as a consumer to regulator.1
> cpu cpu0: Linked as a consumer to d0013000.nb-periph-clk
> cpu cpu0: Dropping the link to d0013000.nb-periph-clk
> cpu cpu0: Linked as a consumer to d0013000.nb-periph-clk
> 
> Signed-off-by: Miquel Raynal 
> ---

This patch doesn't apply. Things have changed upstream.

> 
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index ec6f04dcf5e6..e6b84ab43f9f 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -1676,6 +1710,8 @@ static void clk_reparent(struct clk_core *core, struct 
> clk_core *new_parent)
>  
> if (was_orphan != becomes_orphan)
> clk_core_update_orphan_status(core, becomes_orphan);
> +
> +   clk_link_hierarchy(core, new_parent);

This isn't going to work.

 BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:909
 in_atomic(): 1, irqs_disabled(): 128, pid: 1, name: swapper/0
 3 locks held by swapper/0/1:
  #0: (ptrval) (>mutex){}, at: __device_driver_lock+0x40/0x4c
  #1: (ptrval) (prepare_lock){+.+.}, at: clk_prepare_lock+0x18/0x94
  #2: (ptrval) (enable_lock){}, at: clk_enable_lock+0x34/0xdc
 irq event stamp: 311516
 hardirqs last  enabled at (311515): [] 
_raw_spin_unlock_irqrestore+0x54/0x90
 hardirqs last disabled at (311516): [] 
clk_enable_lock+0x28/0xdc
 softirqs last  enabled at (311348): [] 
__do_softirq+0x4cc/0x514
 softirqs last disabled at (311341): [] irq_exit+0xd8/0xf8
 CPU: 4 PID: 1 Comm: swapper/0 Tainted: GW 
5.3.0-rc4-5-g6be06bbec80ef #10
 Hardware name: Google Cheza (rev3+) (DT)
 Call trace:
  dump_backtrace+0x0/0x13c
  show_stack+0x20/0x2c
  dump_stack+0xc4/0x12c
  ___might_sleep+0x1b4/0x1c4
  __might_sleep+0x50/0x88
  __mutex_lock_common+0x5c/0xbfc
  mutex_lock_nested+0x40/0x50
  device_link_add+0x88/0x3ac
  clk_reparent+0xc4/0x114
  __clk_set_parent_before+0x74/0x90
  clk_change_rate+0x98/0x854
  clk_core_set_rate_nolock+0x1b0/0x21c
  clk_set_rate+0x3c/0x6c
  of_clk_set_defaults+0x29c/0x364
  platform_drv_probe+0x28/0xb0
  really_probe+0x130/0x2b4
  driver_probe_device+0x64/0xfc
  device_driver_attach+0x4c/0x6c
  __driver_attach+0xb0/0xc4
  bus_for_each_dev+0x84/0xcc
  driver_attach+0x2c/0x38
  bus_add_driver+0xfc/0x1d0
  driver_register+0x64/0xf0
  __platform_driver_register+0x4c/0x58
  msm_drm_register+0x5c/0x60
  do_one_initcall+0x1e0/0x478
  do_initcall_level+0x21c/0x25c
  do_basic_setup+0x60/0x78
  kernel_init_freeable+0x128/0x1b0
  kernel_init+0x14/0x100
  ret_from_fork+0x10/0x18
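The trace shows a mutex being taken (device_link_add) while enable_lock, a spinlock, is held with interrupts disabled. A minimal user-space model of that constraint follows. Everything here is hypothetical: the model_* names are invented, the "spinlock" is just a depth counter, and the deferred variant shows one generic pattern only, not a fix proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

static int model_atomic_depth; /* > 0 while a modelled spinlock is held */
static int model_violations;   /* sleeping calls made in atomic context */

static void model_spin_lock(void)   { model_atomic_depth++; }
static void model_spin_unlock(void) { model_atomic_depth--; }

/* Stand-in for a call that may sleep, such as one that takes a mutex. */
static void model_sleeping_call(void)
{
	if (model_atomic_depth > 0)
		model_violations++; /* where the kernel would report the BUG */
}

/* Mirrors the failing path: the sleeping call runs under the spinlock. */
static void model_reparent_inline(void)
{
	model_spin_lock();
	model_sleeping_call();
	model_spin_unlock();
}

/* Generic alternative: note the work under the lock, do it after unlock. */
static void model_reparent_deferred(void)
{
	bool need_link;

	model_spin_lock();
	need_link = true; /* only record that linking is needed */
	model_spin_unlock();

	if (need_link)
		model_sleeping_call();
}
```

Running model_reparent_inline() increments model_violations, while model_reparent_deferred() does not, which is the distinction the might_sleep() check in the trace enforces.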

> } else {
> hlist_add_head(&core->child_node, &clk_orphan_list);
> if (!was_orphan)
> @@ -2402,6 +2438,8 @@ __clk_init_parent(struct clk_core *core, bool 
> update_orphan)
> if (!parent_hw)
> return NULL;
>  
> +   clk_link_hierarchy(core, parent_hw->core);
> +

This is

Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

2019-08-14 Thread John Hubbard


On 8/14/19 4:50 PM, Ira Weiny wrote:

On Tue, Aug 13, 2019 at 05:56:31PM -0700, John Hubbard wrote:

On 8/13/19 5:51 PM, John Hubbard wrote:

On 8/13/19 2:08 PM, Ira Weiny wrote:

On Mon, Aug 12, 2019 at 05:07:32PM -0700, John Hubbard wrote:

On 8/12/19 4:49 PM, Ira Weiny wrote:

On Sun, Aug 11, 2019 at 06:50:44PM -0700, john.hubb...@gmail.com wrote:

From: John Hubbard 

...

Finally, I struggle with converting everyone to a new call.  It is more
overhead to use vaddr_pin in the call above because now the GUP code is going
to associate a file pin object with that file when in ODP we don't need that
because the pages can move around.


What if the pages in ODP are file-backed?



oops, strike that, you're right: in that case, even the file system case is 
covered.
Don't mind me. :)


Ok so are we agreed we will drop the patch to the ODP code?  I'm going to keep
the FOLL_PIN flag and addition in the vaddr_pin_pages.



Yes. I hope I'm not overlooking anything, but it all seems to make sense to
let ODP just rely on the MMU notifiers.

thanks,
--
John Hubbard
NVIDIA
