Re: [Patch v2 13/15] CIFS: Add support for direct I/O read

2018-06-01 Thread kbuild test robot
Hi Long,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v4.17-rc6]
[cannot apply to cifs/for-next next-20180601]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Long-Li/CIFS-Add-direct-I-O-support/20180602-130240
config: i386-randconfig-x000-201821 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:14:0,
from include/linux/list.h:9,
from include/linux/wait.h:7,
from include/linux/wait_bit.h:8,
from include/linux/fs.h:6,
from fs/cifs/file.c:24:
   fs/cifs/file.c: In function 'cifs_direct_readv':
   include/linux/kern_levels.h:5:18: warning: format '%lu' expects argument of 
type 'long unsigned int', but argument 4 has type 'size_t {aka unsigned int}' 
[-Wformat=]
#define KERN_SOH "\001"  /* ASCII Start Of Header */
 ^
   include/linux/printk.h:136:10: note: in definition of macro 'no_printk'
  printk(fmt, ##__VA_ARGS__);  \
 ^~~
   include/linux/kern_levels.h:15:20: note: in expansion of macro 'KERN_SOH'
#define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
   ^~~~
   include/linux/printk.h:342:12: note: in expansion of macro 'KERN_DEBUG'
 no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
   ^~
>> fs/cifs/cifs_debug.h:82:3: note: in expansion of macro 'pr_debug'
  pr_debug(fmt, ##__VA_ARGS__);\
  ^~~~
   fs/cifs/file.c:3346:4: note: in expansion of macro 'cifs_dbg'
   cifs_dbg(VFS,
   ^~~~
   fs/cifs/file.c:3348:20: note: format string is defined here
" iov_offset %lu count %lu\n",
 ~~^
 %u
   In file included from include/linux/kernel.h:14:0,
from include/linux/list.h:9,
from include/linux/wait.h:7,
from include/linux/wait_bit.h:8,
from include/linux/fs.h:6,
from fs/cifs/file.c:24:
   include/linux/kern_levels.h:5:18: warning: format '%lu' expects argument of 
type 'long unsigned int', but argument 5 has type 'size_t {aka unsigned int}' 
[-Wformat=]
#define KERN_SOH "\001"  /* ASCII Start Of Header */
 ^
   include/linux/printk.h:136:10: note: in definition of macro 'no_printk'
  printk(fmt, ##__VA_ARGS__);  \
 ^~~
   include/linux/kern_levels.h:15:20: note: in expansion of macro 'KERN_SOH'
#define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
   ^~~~
   include/linux/printk.h:342:12: note: in expansion of macro 'KERN_DEBUG'
 no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
   ^~
>> fs/cifs/cifs_debug.h:82:3: note: in expansion of macro 'pr_debug'
  pr_debug(fmt, ##__VA_ARGS__);\
  ^~~~
   fs/cifs/file.c:3346:4: note: in expansion of macro 'cifs_dbg'
   cifs_dbg(VFS,
   ^~~~
   fs/cifs/file.c:3348:30: note: format string is defined here
" iov_offset %lu count %lu\n",
   ~~^
   %u

vim +/pr_debug +82 fs/cifs/cifs_debug.h

f2f176b4 Aurelien Aptel  2018-04-10  73  
^1da177e Linus Torvalds  2005-04-16  74  /*
^1da177e Linus Torvalds  2005-04-16  75   * debug OFF
^1da177e Linus Torvalds  2005-04-16  76   * -
^1da177e Linus Torvalds  2005-04-16  77   */
^1da177e Linus Torvalds  2005-04-16  78  #else  /* _CIFS_DEBUG */
f96637be Joe Perches 2013-05-04  79  #define cifs_dbg(type, fmt, ...)   
\
bde98197 Joe Perches 2012-12-05  80  do {   
\
bde98197 Joe Perches 2012-12-05  81 if (0)  
\
0b456f04 Andy Shevchenko 2014-08-27 @82 pr_debug(fmt, 
##__VA_ARGS__);   \
bde98197 Joe Perches 2012-12-05  83  } while (0)
f96637be Joe Perches 2013-05-04  84  #endif
^1da177e Linus Torvalds  2005-04-16  85  

:: The code at line 82 was first introduced by commit
:: 0b456f04bcdf5b1151136adaada158bfc26f1be7 cifs: convert printk(LEVEL...) 
to pr_

:: TO: Andy Shevchenko 
:: CC: Steve French 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [Patch v2 13/15] CIFS: Add support for direct I/O read

2018-06-01 Thread kbuild test robot
Hi Long,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v4.17-rc6]
[cannot apply to cifs/for-next next-20180601]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Long-Li/CIFS-Add-direct-I-O-support/20180602-130240
config: i386-randconfig-x000-201821 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:14:0,
from include/linux/list.h:9,
from include/linux/wait.h:7,
from include/linux/wait_bit.h:8,
from include/linux/fs.h:6,
from fs/cifs/file.c:24:
   fs/cifs/file.c: In function 'cifs_direct_readv':
   include/linux/kern_levels.h:5:18: warning: format '%lu' expects argument of 
type 'long unsigned int', but argument 4 has type 'size_t {aka unsigned int}' 
[-Wformat=]
#define KERN_SOH "\001"  /* ASCII Start Of Header */
 ^
   include/linux/printk.h:136:10: note: in definition of macro 'no_printk'
  printk(fmt, ##__VA_ARGS__);  \
 ^~~
   include/linux/kern_levels.h:15:20: note: in expansion of macro 'KERN_SOH'
#define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
   ^~~~
   include/linux/printk.h:342:12: note: in expansion of macro 'KERN_DEBUG'
 no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
   ^~
>> fs/cifs/cifs_debug.h:82:3: note: in expansion of macro 'pr_debug'
  pr_debug(fmt, ##__VA_ARGS__);\
  ^~~~
   fs/cifs/file.c:3346:4: note: in expansion of macro 'cifs_dbg'
   cifs_dbg(VFS,
   ^~~~
   fs/cifs/file.c:3348:20: note: format string is defined here
" iov_offset %lu count %lu\n",
 ~~^
 %u
   In file included from include/linux/kernel.h:14:0,
from include/linux/list.h:9,
from include/linux/wait.h:7,
from include/linux/wait_bit.h:8,
from include/linux/fs.h:6,
from fs/cifs/file.c:24:
   include/linux/kern_levels.h:5:18: warning: format '%lu' expects argument of 
type 'long unsigned int', but argument 5 has type 'size_t {aka unsigned int}' 
[-Wformat=]
#define KERN_SOH "\001"  /* ASCII Start Of Header */
 ^
   include/linux/printk.h:136:10: note: in definition of macro 'no_printk'
  printk(fmt, ##__VA_ARGS__);  \
 ^~~
   include/linux/kern_levels.h:15:20: note: in expansion of macro 'KERN_SOH'
#define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
   ^~~~
   include/linux/printk.h:342:12: note: in expansion of macro 'KERN_DEBUG'
 no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
   ^~
>> fs/cifs/cifs_debug.h:82:3: note: in expansion of macro 'pr_debug'
  pr_debug(fmt, ##__VA_ARGS__);\
  ^~~~
   fs/cifs/file.c:3346:4: note: in expansion of macro 'cifs_dbg'
   cifs_dbg(VFS,
   ^~~~
   fs/cifs/file.c:3348:30: note: format string is defined here
" iov_offset %lu count %lu\n",
   ~~^
   %u

vim +/pr_debug +82 fs/cifs/cifs_debug.h

f2f176b4 Aurelien Aptel  2018-04-10  73  
^1da177e Linus Torvalds  2005-04-16  74  /*
^1da177e Linus Torvalds  2005-04-16  75   * debug OFF
^1da177e Linus Torvalds  2005-04-16  76   * -
^1da177e Linus Torvalds  2005-04-16  77   */
^1da177e Linus Torvalds  2005-04-16  78  #else  /* _CIFS_DEBUG */
f96637be Joe Perches 2013-05-04  79  #define cifs_dbg(type, fmt, ...)   
\
bde98197 Joe Perches 2012-12-05  80  do {   
\
bde98197 Joe Perches 2012-12-05  81 if (0)  
\
0b456f04 Andy Shevchenko 2014-08-27 @82 pr_debug(fmt, 
##__VA_ARGS__);   \
bde98197 Joe Perches 2012-12-05  83  } while (0)
f96637be Joe Perches 2013-05-04  84  #endif
^1da177e Linus Torvalds  2005-04-16  85  

:: The code at line 82 was first introduced by commit
:: 0b456f04bcdf5b1151136adaada158bfc26f1be7 cifs: convert printk(LEVEL...) 
to pr_

:: TO: Andy Shevchenko 
:: CC: Steve French 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[PATCH v1 0/2] PCI: shpchp: Minor fixes

2018-06-01 Thread Bjorn Helgaas
An AMD POGO check in shpchp incorrectly included any device with device ID
0x7458, no matter what the vendor.

Also, use pci_info() instead of dbg() for messages related to OSHP.
They're important enough that they should appear in dmesg.

I intend to include these along with Mika's shpchp updates.

---

Bjorn Helgaas (2):
  PCI: shpchp: Use dev_printk() for OSHP-related messages
  PCI: shpchp: Fix AMD POGO identification


 drivers/pci/hotplug/acpi_pcihp.c  |   13 ++---
 drivers/pci/hotplug/shpchp_ctrl.c |8 
 2 files changed, 10 insertions(+), 11 deletions(-)


[PATCH v1 1/2] PCI: shpchp: Use dev_printk() for OSHP-related messages

2018-06-01 Thread Bjorn Helgaas
From: Bjorn Helgaas 

Use dev_printk() for messages related to requesting control of SHPC hotplug
via the OSHP method.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/hotplug/acpi_pcihp.c |   13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/hotplug/acpi_pcihp.c b/drivers/pci/hotplug/acpi_pcihp.c
index 7cc50cfef9c6..597d22aeefc1 100644
--- a/drivers/pci/hotplug/acpi_pcihp.c
+++ b/drivers/pci/hotplug/acpi_pcihp.c
@@ -96,7 +96,7 @@ int acpi_get_hp_hw_control_from_firmware(struct pci_dev *pdev)
if (!handle) {
/*
 * This hotplug controller was not listed in the ACPI name
-* space at all. Try to get acpi handle of parent pci bus.
+* space at all. Try to get ACPI handle of parent PCI bus.
 */
struct pci_bus *pbus;
for (pbus = pdev->bus; pbus; pbus = pbus->parent) {
@@ -108,8 +108,8 @@ int acpi_get_hp_hw_control_from_firmware(struct pci_dev 
*pdev)
 
while (handle) {
acpi_get_name(handle, ACPI_FULL_PATHNAME, );
-   dbg("Trying to get hotplug control for %s\n",
-   (char *)string.pointer);
+   pci_info(pdev, "Requesting control of SHPC hotplug via OSHP 
(%s)\n",
+(char *)string.pointer);
status = acpi_run_oshp(handle);
if (ACPI_SUCCESS(status))
goto got_one;
@@ -121,13 +121,12 @@ int acpi_get_hp_hw_control_from_firmware(struct pci_dev 
*pdev)
break;
}
 no_control:
-   dbg("Cannot get control of hotplug hardware for pci %s\n",
-   pci_name(pdev));
+   pci_info(pdev, "Cannot get control of SHPC hotplug\n");
kfree(string.pointer);
return -ENODEV;
 got_one:
-   dbg("Gained control for hotplug HW for pci %s (%s)\n",
-   pci_name(pdev), (char *)string.pointer);
+   pci_info(pdev, "Gained control of SHPC hotplug (%s)\n",
+(char *)string.pointer);
kfree(string.pointer);
return 0;
 }



[PATCH v1 2/2] PCI: shpchp: Fix AMD POGO identification

2018-06-01 Thread Bjorn Helgaas
From: Bjorn Helgaas 

The fix for an AMD POGO erratum related to SHPC incorrectly identified the
device.  The workaround should be applied only for AMD POGO devices, but it
was instead applied to:

  - all AMD bridges, and
  - all devices from any vendor with device ID 0x7458

Fixes: 53044f357448 ("[PATCH] PCI Hotplug: shpchp: AMD POGO errata fix")
Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/hotplug/shpchp_ctrl.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp_ctrl.c 
b/drivers/pci/hotplug/shpchp_ctrl.c
index bedda5bda910..1047b56e5730 100644
--- a/drivers/pci/hotplug/shpchp_ctrl.c
+++ b/drivers/pci/hotplug/shpchp_ctrl.c
@@ -585,13 +585,13 @@ static int shpchp_enable_slot (struct slot *p_slot)
ctrl_dbg(ctrl, "%s: p_slot->pwr_save %x\n", __func__, p_slot->pwr_save);
p_slot->hpc_ops->get_latch_status(p_slot, );
 
-   if (((p_slot->ctrl->pci_dev->vendor == PCI_VENDOR_ID_AMD) ||
-   (p_slot->ctrl->pci_dev->device == PCI_DEVICE_ID_AMD_POGO_7458))
+   if ((p_slot->ctrl->pci_dev->vendor == PCI_VENDOR_ID_AMD &&
+p_slot->ctrl->pci_dev->device == PCI_DEVICE_ID_AMD_POGO_7458)
 && p_slot->ctrl->num_slots == 1) {
-   /* handle amd pogo errata; this must be done before enable  */
+   /* handle AMD POGO errata; this must be done before enable  */
amd_pogo_errata_save_misc_reg(p_slot);
retval = board_added(p_slot);
-   /* handle amd pogo errata; this must be done after enable  */
+   /* handle AMD POGO errata; this must be done after enable  */
amd_pogo_errata_restore_misc_reg(p_slot);
} else
retval = board_added(p_slot);



[PATCH v1 0/2] PCI: shpchp: Minor fixes

2018-06-01 Thread Bjorn Helgaas
An AMD POGO check in shpchp incorrectly included any device with device ID
0x7458, no matter what the vendor.

Also, use pci_info() instead of dbg() for messages related to OSHP.
They're important enough that they should appear in dmesg.

I intend to include these along with Mika's shpchp updates.

---

Bjorn Helgaas (2):
  PCI: shpchp: Use dev_printk() for OSHP-related messages
  PCI: shpchp: Fix AMD POGO identification


 drivers/pci/hotplug/acpi_pcihp.c  |   13 ++---
 drivers/pci/hotplug/shpchp_ctrl.c |8 
 2 files changed, 10 insertions(+), 11 deletions(-)


[PATCH v1 1/2] PCI: shpchp: Use dev_printk() for OSHP-related messages

2018-06-01 Thread Bjorn Helgaas
From: Bjorn Helgaas 

Use dev_printk() for messages related to requesting control of SHPC hotplug
via the OSHP method.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/hotplug/acpi_pcihp.c |   13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/hotplug/acpi_pcihp.c b/drivers/pci/hotplug/acpi_pcihp.c
index 7cc50cfef9c6..597d22aeefc1 100644
--- a/drivers/pci/hotplug/acpi_pcihp.c
+++ b/drivers/pci/hotplug/acpi_pcihp.c
@@ -96,7 +96,7 @@ int acpi_get_hp_hw_control_from_firmware(struct pci_dev *pdev)
if (!handle) {
/*
 * This hotplug controller was not listed in the ACPI name
-* space at all. Try to get acpi handle of parent pci bus.
+* space at all. Try to get ACPI handle of parent PCI bus.
 */
struct pci_bus *pbus;
for (pbus = pdev->bus; pbus; pbus = pbus->parent) {
@@ -108,8 +108,8 @@ int acpi_get_hp_hw_control_from_firmware(struct pci_dev 
*pdev)
 
while (handle) {
acpi_get_name(handle, ACPI_FULL_PATHNAME, );
-   dbg("Trying to get hotplug control for %s\n",
-   (char *)string.pointer);
+   pci_info(pdev, "Requesting control of SHPC hotplug via OSHP 
(%s)\n",
+(char *)string.pointer);
status = acpi_run_oshp(handle);
if (ACPI_SUCCESS(status))
goto got_one;
@@ -121,13 +121,12 @@ int acpi_get_hp_hw_control_from_firmware(struct pci_dev 
*pdev)
break;
}
 no_control:
-   dbg("Cannot get control of hotplug hardware for pci %s\n",
-   pci_name(pdev));
+   pci_info(pdev, "Cannot get control of SHPC hotplug\n");
kfree(string.pointer);
return -ENODEV;
 got_one:
-   dbg("Gained control for hotplug HW for pci %s (%s)\n",
-   pci_name(pdev), (char *)string.pointer);
+   pci_info(pdev, "Gained control of SHPC hotplug (%s)\n",
+(char *)string.pointer);
kfree(string.pointer);
return 0;
 }



[PATCH v1 2/2] PCI: shpchp: Fix AMD POGO identification

2018-06-01 Thread Bjorn Helgaas
From: Bjorn Helgaas 

The fix for an AMD POGO erratum related to SHPC incorrectly identified the
device.  The workaround should be applied only for AMD POGO devices, but it
was instead applied to:

  - all AMD bridges, and
  - all devices from any vendor with device ID 0x7458

Fixes: 53044f357448 ("[PATCH] PCI Hotplug: shpchp: AMD POGO errata fix")
Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/hotplug/shpchp_ctrl.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/shpchp_ctrl.c 
b/drivers/pci/hotplug/shpchp_ctrl.c
index bedda5bda910..1047b56e5730 100644
--- a/drivers/pci/hotplug/shpchp_ctrl.c
+++ b/drivers/pci/hotplug/shpchp_ctrl.c
@@ -585,13 +585,13 @@ static int shpchp_enable_slot (struct slot *p_slot)
ctrl_dbg(ctrl, "%s: p_slot->pwr_save %x\n", __func__, p_slot->pwr_save);
p_slot->hpc_ops->get_latch_status(p_slot, );
 
-   if (((p_slot->ctrl->pci_dev->vendor == PCI_VENDOR_ID_AMD) ||
-   (p_slot->ctrl->pci_dev->device == PCI_DEVICE_ID_AMD_POGO_7458))
+   if ((p_slot->ctrl->pci_dev->vendor == PCI_VENDOR_ID_AMD &&
+p_slot->ctrl->pci_dev->device == PCI_DEVICE_ID_AMD_POGO_7458)
 && p_slot->ctrl->num_slots == 1) {
-   /* handle amd pogo errata; this must be done before enable  */
+   /* handle AMD POGO errata; this must be done before enable  */
amd_pogo_errata_save_misc_reg(p_slot);
retval = board_added(p_slot);
-   /* handle amd pogo errata; this must be done after enable  */
+   /* handle AMD POGO errata; this must be done after enable  */
amd_pogo_errata_restore_misc_reg(p_slot);
} else
retval = board_added(p_slot);



Re: [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation

2018-06-01 Thread Davidlohr Bueso

On Sat, 02 Jun 2018, Herbert Xu wrote:

tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
-   if (tbl == NULL)
-   return -ENOMEM;
+   if (unlikely(tbl == NULL)) {
+   size = min_t(u16, ht->p.min_size, HASH_MIN_SIZE);


You mean max_t?


Not really. I considered some of the users to set quite a large min_size
(such as 1024 buckets). The min() makes sense to me in that it's the smallest
possible value. If memory later becomes available and the hashtable is resized
to a more appropriate value, couldn't any issues regarding collisions not be 
dealt
with organically? And we've agreed that allocating a tiny table is the
least of our problems.

Thanks,
Davidlohr


Re: [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation

2018-06-01 Thread Davidlohr Bueso

On Sat, 02 Jun 2018, Herbert Xu wrote:

tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
-   if (tbl == NULL)
-   return -ENOMEM;
+   if (unlikely(tbl == NULL)) {
+   size = min_t(u16, ht->p.min_size, HASH_MIN_SIZE);


You mean max_t?


Not really. I considered some of the users to set quite a large min_size
(such as 1024 buckets). The min() makes sense to me in that it's the smallest
possible value. If memory later becomes available and the hashtable is resized
to a more appropriate value, couldn't any issues regarding collisions not be 
dealt
with organically? And we've agreed that allocating a tiny table is the
least of our problems.

Thanks,
Davidlohr


[PATCH] clk: Return void from debug_init op

2018-06-01 Thread sboyd
From: Stephen Boyd 

We only have two users of the debug_init hook, and we recently stopped
caring about the return value from that op. Finish that off by changing
the clk_op to return void instead of int because it doesn't matter if
debugfs fails or not.

Cc: Eric Anholt 
Cc: David Lechner 
Cc: Sekhar Nori 
Cc: Greg Kroah-Hartman 
Signed-off-by: Stephen Boyd 
---
 Documentation/clk.txt |  2 +-
 drivers/clk/bcm/clk-bcm2835.c | 25 +++--
 drivers/clk/davinci/pll.c |  8 +++-
 include/linux/clk-provider.h  |  2 +-
 4 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/Documentation/clk.txt b/Documentation/clk.txt
index 511628bb3d3a..593cca5058b1 100644
--- a/Documentation/clk.txt
+++ b/Documentation/clk.txt
@@ -96,7 +96,7 @@ the operations defined in clk-provider.h::
int (*get_phase)(struct clk_hw *hw);
int (*set_phase)(struct clk_hw *hw, int degrees);
void(*init)(struct clk_hw *hw);
-   int (*debug_init)(struct clk_hw *hw,
+   void(*debug_init)(struct clk_hw *hw,
  struct dentry *dentry);
};
 
diff --git a/drivers/clk/bcm/clk-bcm2835.c b/drivers/clk/bcm/clk-bcm2835.c
index 1329440af59f..0bd62efc07f8 100644
--- a/drivers/clk/bcm/clk-bcm2835.c
+++ b/drivers/clk/bcm/clk-bcm2835.c
@@ -394,7 +394,7 @@ static unsigned long bcm2835_measure_tcnt_mux(struct 
bcm2835_cprman *cprman,
return count * 1000;
 }
 
-static int bcm2835_debugfs_regset(struct bcm2835_cprman *cprman, u32 base,
+static void bcm2835_debugfs_regset(struct bcm2835_cprman *cprman, u32 base,
  struct debugfs_reg32 *regs, size_t nregs,
  struct dentry *dentry)
 {
@@ -402,15 +402,13 @@ static int bcm2835_debugfs_regset(struct bcm2835_cprman 
*cprman, u32 base,
 
regset = devm_kzalloc(cprman->dev, sizeof(*regset), GFP_KERNEL);
if (!regset)
-   return -ENOMEM;
+   return;
 
regset->regs = regs;
regset->nregs = nregs;
regset->base = cprman->regs + base;
 
debugfs_create_regset32("regdump", S_IRUGO, dentry, regset);
-
-   return 0;
 }
 
 struct bcm2835_pll_data {
@@ -728,7 +726,7 @@ static int bcm2835_pll_set_rate(struct clk_hw *hw,
return 0;
 }
 
-static int bcm2835_pll_debug_init(struct clk_hw *hw,
+static void bcm2835_pll_debug_init(struct clk_hw *hw,
  struct dentry *dentry)
 {
struct bcm2835_pll *pll = container_of(hw, struct bcm2835_pll, hw);
@@ -738,7 +736,7 @@ static int bcm2835_pll_debug_init(struct clk_hw *hw,
 
regs = devm_kzalloc(cprman->dev, 7 * sizeof(*regs), GFP_KERNEL);
if (!regs)
-   return -ENOMEM;
+   return;
 
regs[0].name = "cm_ctrl";
regs[0].offset = data->cm_ctrl_reg;
@@ -755,7 +753,7 @@ static int bcm2835_pll_debug_init(struct clk_hw *hw,
regs[6].name = "ana3";
regs[6].offset = data->ana_reg_base + 3 * 4;
 
-   return bcm2835_debugfs_regset(cprman, 0, regs, 7, dentry);
+   bcm2835_debugfs_regset(cprman, 0, regs, 7, dentry);
 }
 
 static const struct clk_ops bcm2835_pll_clk_ops = {
@@ -859,8 +857,8 @@ static int bcm2835_pll_divider_set_rate(struct clk_hw *hw,
return 0;
 }
 
-static int bcm2835_pll_divider_debug_init(struct clk_hw *hw,
- struct dentry *dentry)
+static void bcm2835_pll_divider_debug_init(struct clk_hw *hw,
+  struct dentry *dentry)
 {
struct bcm2835_pll_divider *divider = bcm2835_pll_divider_from_hw(hw);
struct bcm2835_cprman *cprman = divider->cprman;
@@ -869,14 +867,14 @@ static int bcm2835_pll_divider_debug_init(struct clk_hw 
*hw,
 
regs = devm_kzalloc(cprman->dev, 7 * sizeof(*regs), GFP_KERNEL);
if (!regs)
-   return -ENOMEM;
+   return;
 
regs[0].name = "cm";
regs[0].offset = data->cm_reg;
regs[1].name = "a2w";
regs[1].offset = data->a2w_reg;
 
-   return bcm2835_debugfs_regset(cprman, 0, regs, 2, dentry);
+   bcm2835_debugfs_regset(cprman, 0, regs, 2, dentry);
 }
 
 static const struct clk_ops bcm2835_pll_divider_clk_ops = {
@@ -1252,15 +1250,14 @@ static struct debugfs_reg32 
bcm2835_debugfs_clock_reg32[] = {
},
 };
 
-static int bcm2835_clock_debug_init(struct clk_hw *hw,
+static void bcm2835_clock_debug_init(struct clk_hw *hw,
struct dentry *dentry)
 {
struct bcm2835_clock *clock = bcm2835_clock_from_hw(hw);
struct bcm2835_cprman *cprman = clock->cprman;
const struct bcm2835_clock_data *data = clock->data;
 
-   return bcm2835_debugfs_regset(
-   cprman, data->ctl_reg,
+   bcm2835_debugfs_regset(cprman, data->ctl_reg,

[PATCH] clk: Return void from debug_init op

2018-06-01 Thread sboyd
From: Stephen Boyd 

We only have two users of the debug_init hook, and we recently stopped
caring about the return value from that op. Finish that off by changing
the clk_op to return void instead of int because it doesn't matter if
debugfs fails or not.

Cc: Eric Anholt 
Cc: David Lechner 
Cc: Sekhar Nori 
Cc: Greg Kroah-Hartman 
Signed-off-by: Stephen Boyd 
---
 Documentation/clk.txt |  2 +-
 drivers/clk/bcm/clk-bcm2835.c | 25 +++--
 drivers/clk/davinci/pll.c |  8 +++-
 include/linux/clk-provider.h  |  2 +-
 4 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/Documentation/clk.txt b/Documentation/clk.txt
index 511628bb3d3a..593cca5058b1 100644
--- a/Documentation/clk.txt
+++ b/Documentation/clk.txt
@@ -96,7 +96,7 @@ the operations defined in clk-provider.h::
int (*get_phase)(struct clk_hw *hw);
int (*set_phase)(struct clk_hw *hw, int degrees);
void(*init)(struct clk_hw *hw);
-   int (*debug_init)(struct clk_hw *hw,
+   void(*debug_init)(struct clk_hw *hw,
  struct dentry *dentry);
};
 
diff --git a/drivers/clk/bcm/clk-bcm2835.c b/drivers/clk/bcm/clk-bcm2835.c
index 1329440af59f..0bd62efc07f8 100644
--- a/drivers/clk/bcm/clk-bcm2835.c
+++ b/drivers/clk/bcm/clk-bcm2835.c
@@ -394,7 +394,7 @@ static unsigned long bcm2835_measure_tcnt_mux(struct 
bcm2835_cprman *cprman,
return count * 1000;
 }
 
-static int bcm2835_debugfs_regset(struct bcm2835_cprman *cprman, u32 base,
+static void bcm2835_debugfs_regset(struct bcm2835_cprman *cprman, u32 base,
  struct debugfs_reg32 *regs, size_t nregs,
  struct dentry *dentry)
 {
@@ -402,15 +402,13 @@ static int bcm2835_debugfs_regset(struct bcm2835_cprman 
*cprman, u32 base,
 
regset = devm_kzalloc(cprman->dev, sizeof(*regset), GFP_KERNEL);
if (!regset)
-   return -ENOMEM;
+   return;
 
regset->regs = regs;
regset->nregs = nregs;
regset->base = cprman->regs + base;
 
debugfs_create_regset32("regdump", S_IRUGO, dentry, regset);
-
-   return 0;
 }
 
 struct bcm2835_pll_data {
@@ -728,7 +726,7 @@ static int bcm2835_pll_set_rate(struct clk_hw *hw,
return 0;
 }
 
-static int bcm2835_pll_debug_init(struct clk_hw *hw,
+static void bcm2835_pll_debug_init(struct clk_hw *hw,
  struct dentry *dentry)
 {
struct bcm2835_pll *pll = container_of(hw, struct bcm2835_pll, hw);
@@ -738,7 +736,7 @@ static int bcm2835_pll_debug_init(struct clk_hw *hw,
 
regs = devm_kzalloc(cprman->dev, 7 * sizeof(*regs), GFP_KERNEL);
if (!regs)
-   return -ENOMEM;
+   return;
 
regs[0].name = "cm_ctrl";
regs[0].offset = data->cm_ctrl_reg;
@@ -755,7 +753,7 @@ static int bcm2835_pll_debug_init(struct clk_hw *hw,
regs[6].name = "ana3";
regs[6].offset = data->ana_reg_base + 3 * 4;
 
-   return bcm2835_debugfs_regset(cprman, 0, regs, 7, dentry);
+   bcm2835_debugfs_regset(cprman, 0, regs, 7, dentry);
 }
 
 static const struct clk_ops bcm2835_pll_clk_ops = {
@@ -859,8 +857,8 @@ static int bcm2835_pll_divider_set_rate(struct clk_hw *hw,
return 0;
 }
 
-static int bcm2835_pll_divider_debug_init(struct clk_hw *hw,
- struct dentry *dentry)
+static void bcm2835_pll_divider_debug_init(struct clk_hw *hw,
+  struct dentry *dentry)
 {
struct bcm2835_pll_divider *divider = bcm2835_pll_divider_from_hw(hw);
struct bcm2835_cprman *cprman = divider->cprman;
@@ -869,14 +867,14 @@ static int bcm2835_pll_divider_debug_init(struct clk_hw 
*hw,
 
regs = devm_kzalloc(cprman->dev, 7 * sizeof(*regs), GFP_KERNEL);
if (!regs)
-   return -ENOMEM;
+   return;
 
regs[0].name = "cm";
regs[0].offset = data->cm_reg;
regs[1].name = "a2w";
regs[1].offset = data->a2w_reg;
 
-   return bcm2835_debugfs_regset(cprman, 0, regs, 2, dentry);
+   bcm2835_debugfs_regset(cprman, 0, regs, 2, dentry);
 }
 
 static const struct clk_ops bcm2835_pll_divider_clk_ops = {
@@ -1252,15 +1250,14 @@ static struct debugfs_reg32 
bcm2835_debugfs_clock_reg32[] = {
},
 };
 
-static int bcm2835_clock_debug_init(struct clk_hw *hw,
+static void bcm2835_clock_debug_init(struct clk_hw *hw,
struct dentry *dentry)
 {
struct bcm2835_clock *clock = bcm2835_clock_from_hw(hw);
struct bcm2835_cprman *cprman = clock->cprman;
const struct bcm2835_clock_data *data = clock->data;
 
-   return bcm2835_debugfs_regset(
-   cprman, data->ctl_reg,
+   bcm2835_debugfs_regset(cprman, data->ctl_reg,

Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Rik van Riel
On Fri, 2018-06-01 at 20:35 -0700, Andy Lutomirski wrote:
> On Fri, Jun 1, 2018 at 3:13 PM Rik van Riel  wrote:
> > 
> > On Fri, 1 Jun 2018 14:21:58 -0700
> > Andy Lutomirski  wrote:
> > 
> > > Hmm.  I wonder if there's a more clever data structure than a
> > > bitmap
> > > that we could be using here.  Each CPU only ever needs to be in
> > > one
> > > mm's cpumask, and each cpu only ever changes its own state in the
> > > bitmask.  And writes are much less common than reads for most
> > > workloads.
> > 
> > It would be easy enough to add an mm_struct pointer to the
> > per-cpu tlbstate struct, and iterate over those.
> > 
> > However, that would be an orthogonal change to optimizing
> > lazy TLB mode.
> > 
> > Does the (untested) patch below make sense as a potential
> > improvement to the lazy TLB heuristic?
> > 
> > ---8<---
> > Subject: x86,tlb: workload dependent per CPU lazy TLB switch
> > 
> > Lazy TLB mode is a tradeoff between flushing the TLB and touching
> > the mm_cpumask(_mm) at context switch time, versus potentially
> > incurring a remote TLB flush IPI while in lazy TLB mode.
> > 
> > Whether this pays off is likely to be workload dependent more than
> > anything else. However, the current heuristic keys off hardware
> > type.
> > 
> > This patch changes the lazy TLB mode heuristic to a dynamic, per-
> > CPU
> > decision, dependent on whether we recently received a remote TLB
> > shootdown while in lazy TLB mode.
> > 
> > This is a very simple heuristic. When a CPU receives a remote TLB
> > shootdown IPI while in lazy TLB mode, a counter in the same cache
> > line is set to 16. Every time we skip lazy TLB mode, the counter
> > is decremented.
> > 
> > While the counter is zero (no recent TLB flush IPIs), allow lazy
> > TLB mode.
> 
> Hmm, cute.  That's not a bad idea at all.  It would be nice to get
> some kind of real benchmark on both PCID and !PCID.  If nothing else,
> I would expect the threshold (16 in your patch) to want to be lower
> on
> PCID systems.

That depends on how well we manage to get rid of
the cpumask manipulation overhead. On the PCID
system we first found this issue, the atomic
accesses to the mm_cpumask took about 4x as much
CPU time as the TLB invalidation itself.

That kinda limits how much the cost of cheaper
TLB flushes actually help :)

I agree this code should get some testing.

-- 
All Rights Reversed.

signature.asc
Description: This is a digitally signed message part


Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Rik van Riel
On Fri, 2018-06-01 at 20:35 -0700, Andy Lutomirski wrote:
> On Fri, Jun 1, 2018 at 3:13 PM Rik van Riel  wrote:
> > 
> > On Fri, 1 Jun 2018 14:21:58 -0700
> > Andy Lutomirski  wrote:
> > 
> > > Hmm.  I wonder if there's a more clever data structure than a
> > > bitmap
> > > that we could be using here.  Each CPU only ever needs to be in
> > > one
> > > mm's cpumask, and each cpu only ever changes its own state in the
> > > bitmask.  And writes are much less common than reads for most
> > > workloads.
> > 
> > It would be easy enough to add an mm_struct pointer to the
> > per-cpu tlbstate struct, and iterate over those.
> > 
> > However, that would be an orthogonal change to optimizing
> > lazy TLB mode.
> > 
> > Does the (untested) patch below make sense as a potential
> > improvement to the lazy TLB heuristic?
> > 
> > ---8<---
> > Subject: x86,tlb: workload dependent per CPU lazy TLB switch
> > 
> > Lazy TLB mode is a tradeoff between flushing the TLB and touching
> > the mm_cpumask(_mm) at context switch time, versus potentially
> > incurring a remote TLB flush IPI while in lazy TLB mode.
> > 
> > Whether this pays off is likely to be workload dependent more than
> > anything else. However, the current heuristic keys off hardware
> > type.
> > 
> > This patch changes the lazy TLB mode heuristic to a dynamic, per-
> > CPU
> > decision, dependent on whether we recently received a remote TLB
> > shootdown while in lazy TLB mode.
> > 
> > This is a very simple heuristic. When a CPU receives a remote TLB
> > shootdown IPI while in lazy TLB mode, a counter in the same cache
> > line is set to 16. Every time we skip lazy TLB mode, the counter
> > is decremented.
> > 
> > While the counter is zero (no recent TLB flush IPIs), allow lazy
> > TLB mode.
> 
> Hmm, cute.  That's not a bad idea at all.  It would be nice to get
> some kind of real benchmark on both PCID and !PCID.  If nothing else,
> I would expect the threshold (16 in your patch) to want to be lower
> on
> PCID systems.

That depends on how well we manage to get rid of
the cpumask manipulation overhead. On the PCID
system we first found this issue, the atomic
accesses to the mm_cpumask took about 4x as much
CPU time as the TLB invalidation itself.

That kinda limits how much the cost of cheaper
TLB flushes actually help :)

I agree this code should get some testing.

-- 
All Rights Reversed.

signature.asc
Description: This is a digitally signed message part


Re: [PATCH] clkdev: Remove duplicated negative index check from __of_clk_get()

2018-06-01 Thread Stephen Boyd
Quoting Geert Uytterhoeven (2018-06-01 12:22:33)
> Hi Stephen,
> 
> On Fri, Jun 1, 2018 at 9:20 PM, Stephen Boyd  wrote:
> > Quoting Geert Uytterhoeven (2018-05-18 03:58:40)
> >> __of_clk_get() calls of_parse_phandle_with_args(), which rejects
> >> negative indices since commit bd69f73f2c81eed9 ("of: Create function for
> >> counting number of phandles in a property").
> >>
> >> Signed-off-by: Geert Uytterhoeven 
> >> ---
> >> Commit bd69f73f2c81eed9 is in v3.9.
> >
> > Did you send this to Russell's patch tracker? Otherwise I can pick it up
> 
> Not yet. The patch tracker is for reviewed patches, AFAIK.
> 
> > to clk-next.
> 
> Thanks!
> 

Ok. If you need my reviewed-by to add it to the tracker feel free to
have:

Reviewed-by: Stephen Boyd 



Re: [PATCH] clkdev: Remove duplicated negative index check from __of_clk_get()

2018-06-01 Thread Stephen Boyd
Quoting Geert Uytterhoeven (2018-06-01 12:22:33)
> Hi Stephen,
> 
> On Fri, Jun 1, 2018 at 9:20 PM, Stephen Boyd  wrote:
> > Quoting Geert Uytterhoeven (2018-05-18 03:58:40)
> >> __of_clk_get() calls of_parse_phandle_with_args(), which rejects
> >> negative indices since commit bd69f73f2c81eed9 ("of: Create function for
> >> counting number of phandles in a property").
> >>
> >> Signed-off-by: Geert Uytterhoeven 
> >> ---
> >> Commit bd69f73f2c81eed9 is in v3.9.
> >
> > Did you send this to Russell's patch tracker? Otherwise I can pick it up
> 
> Not yet. The patch tracker is for reviewed patches, AFAIK.
> 
> > to clk-next.
> 
> Thanks!
> 

Ok. If you need my reviewed-by to add it to the tracker feel free to
have:

Reviewed-by: Stephen Boyd 



Re: [PATCH 5/5] lib/test_rhashtable: rhashtable_init() can no longer fail

2018-06-01 Thread Herbert Xu
On Fri, Jun 01, 2018 at 09:01:25AM -0700, Davidlohr Bueso wrote:
> Update the test module as such.
> 
> Signed-off-by: Davidlohr Bueso 

Please drop this patch.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 5/5] lib/test_rhashtable: rhashtable_init() can no longer fail

2018-06-01 Thread Herbert Xu
On Fri, Jun 01, 2018 at 09:01:25AM -0700, Davidlohr Bueso wrote:
> Update the test module as such.
> 
> Signed-off-by: Davidlohr Bueso 

Please drop this patch.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation

2018-06-01 Thread Herbert Xu
On Fri, Jun 01, 2018 at 09:01:22AM -0700, Davidlohr Bueso wrote:
>
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 05a4b1b8b8ce..ae17da6f0c75 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -175,7 +175,7 @@ static struct bucket_table *bucket_table_alloc(struct 
> rhashtable *ht,
>   int i;
>  
>   size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);
> - if (gfp != GFP_KERNEL)
> + if ((gfp & ~__GFP_NOFAIL) != GFP_KERNEL)
>   tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);
>   else
>   tbl = kvzalloc(size, gfp);

There's another GFP_KERNEL check in this function that needs the
same treatment.

> @@ -1067,9 +1067,16 @@ int rhashtable_init(struct rhashtable *ht,
>   }
>   }
>  
> + /*
> +  * This is api initialization and thus we need to guarantee the
> +  * initial rhashtable allocation. Upon failure, retry with the
> +  * smallest possible size with __GFP_NOFAIL semantics.
> +  */
>   tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
> - if (tbl == NULL)
> - return -ENOMEM;
> + if (unlikely(tbl == NULL)) {
> + size = min_t(u16, ht->p.min_size, HASH_MIN_SIZE);

You mean max_t?

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation

2018-06-01 Thread Herbert Xu
On Fri, Jun 01, 2018 at 09:01:22AM -0700, Davidlohr Bueso wrote:
>
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 05a4b1b8b8ce..ae17da6f0c75 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -175,7 +175,7 @@ static struct bucket_table *bucket_table_alloc(struct 
> rhashtable *ht,
>   int i;
>  
>   size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);
> - if (gfp != GFP_KERNEL)
> + if ((gfp & ~__GFP_NOFAIL) != GFP_KERNEL)
>   tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);
>   else
>   tbl = kvzalloc(size, gfp);

There's another GFP_KERNEL check in this function that needs the
same treatment.

> @@ -1067,9 +1067,16 @@ int rhashtable_init(struct rhashtable *ht,
>   }
>   }
>  
> + /*
> +  * This is api initialization and thus we need to guarantee the
> +  * initial rhashtable allocation. Upon failure, retry with the
> +  * smallest possible size with __GFP_NOFAIL semantics.
> +  */
>   tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
> - if (tbl == NULL)
> - return -ENOMEM;
> + if (unlikely(tbl == NULL)) {
> + size = min_t(u16, ht->p.min_size, HASH_MIN_SIZE);

You mean max_t?

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: Slab out of bounds in setxattr

2018-06-01 Thread shankarapailoor
Hi,

Looking at the crash some more, it seems that if value_len > PAGE_SIZE
then e_buf->max_size is rounded up nearest page size [1]. If a new
attribute is added with value_len < e_buf->max_size - EA_SIZE(ea) then
no new space is allocated for the attiribute list [2] and this
triggers the KASAN slab out of bounds error. This is the case in the C
repro I provided.


1. https://elixir.bootlin.com/linux/v4.17-rc7/source/fs/jfs/xattr.c#L501
2. https://elixir.bootlin.com/linux/v4.17-rc7/source/fs/jfs/xattr.c#L723

On Fri, Jun 1, 2018 at 1:52 PM, shankarapailoor
 wrote:
> Hi Dave et al,
>
> I have been fuzzing linux 4.17-rc4 with JFS using Syzkaller  KASAN:
> slab-out-of-bounds in jfs_xattr.
>
> Attached are my kernel configs and a C reproducer. In the first
> setxattr call it appears that length is much larger than the name. In
> __jfs_setxattr, I don't see where the length is checked against the
> actual value length.
>
> Regards,
> Shankara Pailoor



-- 
Regards,
Shankara Pailoor


Re: Slab out of bounds in setxattr

2018-06-01 Thread shankarapailoor
Hi,

Looking at the crash some more, it seems that if value_len > PAGE_SIZE
then e_buf->max_size is rounded up nearest page size [1]. If a new
attribute is added with value_len < e_buf->max_size - EA_SIZE(ea) then
no new space is allocated for the attiribute list [2] and this
triggers the KASAN slab out of bounds error. This is the case in the C
repro I provided.


1. https://elixir.bootlin.com/linux/v4.17-rc7/source/fs/jfs/xattr.c#L501
2. https://elixir.bootlin.com/linux/v4.17-rc7/source/fs/jfs/xattr.c#L723

On Fri, Jun 1, 2018 at 1:52 PM, shankarapailoor
 wrote:
> Hi Dave et al,
>
> I have been fuzzing linux 4.17-rc4 with JFS using Syzkaller  KASAN:
> slab-out-of-bounds in jfs_xattr.
>
> Attached are my kernel configs and a C reproducer. In the first
> setxattr call it appears that length is much larger than the name. In
> __jfs_setxattr, I don't see where the length is checked against the
> actual value length.
>
> Regards,
> Shankara Pailoor



-- 
Regards,
Shankara Pailoor


Re: [PATCH 30/32] vfs: Allow cloning of a mount tree with open(O_PATH|O_CLONE_MOUNT) [ver #8]

2018-06-01 Thread Al Viro
On Sat, Jun 02, 2018 at 04:42:56AM +0100, Al Viro wrote:
> _If_ I'm interpreting that correctly, that should be something like a bitmap
> of attributes to modify + values to set for each.  Let's see -
>   propagation 1 + 2 bits
>   nodev   1 + 1
>   noexec  1 + 1
>   nosuid  1 + 1
>   ro  1 + 1
>   atime   1 + 3
> That's 15 bits.  On top of that, we have 1 bit for "clone or original"
> and 1 bit for "recursive or single-mount".  As well as AT_EMPTY_PATH,
> and AT_NO_AUTOMOUNT (inconvenient, since these are fixed bits).  In
> principle, that does fit into int, with some space to spare...
> 
> Is that what you have in mind?

TBH, I would probably prefer separate mount_setattr(2) for that kind
of work, with something like
int mount_setattr(int dirfd, const char *path, int flags, int attr)
*not* opening any files.
flags:
AT_EMPTY_PATH, AT_NO_AUTOMOUNT, AT_RECURSIVE
attr:
MOUNT_SETATTR_DEV   (1<<0)
MOUNT_SETATTR_NODEV (1<<0)|(1<<1)
MOUNT_SETATTR_EXEC  (1<<2)
MOUNT_SETATTR_NOEXEC(1<<2)|(1<<3)
MOUNT_SETATTR_SUID  (1<<4)
MOUNT_SETATTR_NOSUID(1<<4)|(1<<5)
MOUNT_SETATTR_RW(1<<6)
MOUNT_SETATTR_RO(1<<6)|(1<<7)
MOUNT_SETATTR_RELATIME  (1<<8)
MOUNT_SETATTR_NOATIME   (1<<8)|(1<<9)
MOUNT_SETATTR_NODIRATIME(1<<8)|(2<<9)
MOUNT_SETATTR_STRICTATIME   (1<<8)|(3<<9)
MOUNT_SETATTR_PRIVATE   (1<<11)
MOUNT_SETATTR_SHARED(1<<11)|(1<<12)
MOUNT_SETATTR_SLAVE (1<<11)|(2<<12)
MOUNT_SETATTR_UNBINDABLE(1<<11)|(3<<12)

With either openat() used as in this series, or explicit
int open_tree(int dirfd, const char *path, int flags)
returning a descriptor, with flags being
AT_EMPTY_PATH, AT_NO_AUTOMOUNT, AT_RECURSIVE, AT_CLONE
with AT_RECURSIVE without AT_CLONE being an error.  Hell, might
even add AT_UMOUNT for "open root and detach, to be dissolved on
close", incompatible with AT_CLONE.

Comments?


Re: [PATCH 30/32] vfs: Allow cloning of a mount tree with open(O_PATH|O_CLONE_MOUNT) [ver #8]

2018-06-01 Thread Al Viro
On Sat, Jun 02, 2018 at 04:42:56AM +0100, Al Viro wrote:
> _If_ I'm interpreting that correctly, that should be something like a bitmap
> of attributes to modify + values to set for each.  Let's see -
>   propagation 1 + 2 bits
>   nodev   1 + 1
>   noexec  1 + 1
>   nosuid  1 + 1
>   ro  1 + 1
>   atime   1 + 3
> That's 15 bits.  On top of that, we have 1 bit for "clone or original"
> and 1 bit for "recursive or single-mount".  As well as AT_EMPTY_PATH,
> and AT_NO_AUTOMOUNT (inconvenient, since these are fixed bits).  In
> principle, that does fit into int, with some space to spare...
> 
> Is that what you have in mind?

TBH, I would probably prefer separate mount_setattr(2) for that kind
of work, with something like
int mount_setattr(int dirfd, const char *path, int flags, int attr)
*not* opening any files.
flags:
AT_EMPTY_PATH, AT_NO_AUTOMOUNT, AT_RECURSIVE
attr:
MOUNT_SETATTR_DEV   (1<<0)
MOUNT_SETATTR_NODEV (1<<0)|(1<<1)
MOUNT_SETATTR_EXEC  (1<<2)
MOUNT_SETATTR_NOEXEC(1<<2)|(1<<3)
MOUNT_SETATTR_SUID  (1<<4)
MOUNT_SETATTR_NOSUID(1<<4)|(1<<5)
MOUNT_SETATTR_RW(1<<6)
MOUNT_SETATTR_RO(1<<6)|(1<<7)
MOUNT_SETATTR_RELATIME  (1<<8)
MOUNT_SETATTR_NOATIME   (1<<8)|(1<<9)
MOUNT_SETATTR_NODIRATIME(1<<8)|(2<<9)
MOUNT_SETATTR_STRICTATIME   (1<<8)|(3<<9)
MOUNT_SETATTR_PRIVATE   (1<<11)
MOUNT_SETATTR_SHARED(1<<11)|(1<<12)
MOUNT_SETATTR_SLAVE (1<<11)|(2<<12)
MOUNT_SETATTR_UNBINDABLE(1<<11)|(3<<12)

With either openat() used as in this series, or explicit
int open_tree(int dirfd, const char *path, int flags)
returning a descriptor, with flags being
AT_EMPTY_PATH, AT_NO_AUTOMOUNT, AT_RECURSIVE, AT_CLONE
with AT_RECURSIVE without AT_CLONE being an error.  Hell, might
even add AT_UMOUNT for "open root and detach, to be dissolved on
close", incompatible with AT_CLONE.

Comments?


Re: [PATCH 30/32] vfs: Allow cloning of a mount tree with open(O_PATH|O_CLONE_MOUNT) [ver #8]

2018-06-01 Thread Al Viro
On Sat, Jun 02, 2018 at 04:09:14AM +0100, Al Viro wrote:
> On Fri, Jun 01, 2018 at 09:27:43AM +0100, David Howells wrote:
> > Al Viro  wrote:
> > 
> > > > Instead of overloading this on open having a specific syscalls just
> > > > seems like a much saner idea.
> > > 
> > > It's not just mount API; these can be used independently of that.
> > > Think of the uses where you pass those to ...at() and you'll see
> > > a bunch of applications of that thing.
> > 
> > I kind of agree with Christoph on this point.  Yes, you can use the 
> > resultant
> > fd for other things, but that doesn't mean it has to be obtained initially
> > through open() or openat() rather than, say, a new pick_mount() syscall.
> > 
> > Further, having more parameters available gives us the opportunity to change
> > the settings on any mounts we create at the point of creation.
> 
> open_subtree(int dirfd, const char *pathname, int flags), then?  How would
> flags be interpreted?  What I see mapping at that thing is
>   * equivalent of O_PATH open
>   * clone subtree, O_PATH open root
>   * clone one mount, O_PATH open root
> and apparently you want to add (orthogonal to that)
>   * make shared/slave/private/unbindable
>   * ditto with recursion?
>   * same for nodev/nosuid/noexec/noatime/nodiratime/relatime/ro/?
> as well as usual AT_... flags (empty path, follow)
> 
> Choose the encoding...

_If_ I'm interpreting that correctly, that should be something like a bitmap
of attributes to modify + values to set for each.  Let's see -
propagation 1 + 2 bits
nodev   1 + 1
noexec  1 + 1
nosuid  1 + 1
ro  1 + 1
atime   1 + 3
That's 15 bits.  On top of that, we have 1 bit for "clone or original"
and 1 bit for "recursive or single-mount".  As well as AT_EMPTY_PATH,
and AT_NO_AUTOMOUNT (inconvenient, since these are fixed bits).  In
principle, that does fit into int, with some space to spare...

Is that what you have in mind?


Re: [PATCH 30/32] vfs: Allow cloning of a mount tree with open(O_PATH|O_CLONE_MOUNT) [ver #8]

2018-06-01 Thread Al Viro
On Sat, Jun 02, 2018 at 04:09:14AM +0100, Al Viro wrote:
> On Fri, Jun 01, 2018 at 09:27:43AM +0100, David Howells wrote:
> > Al Viro  wrote:
> > 
> > > > Instead of overloading this on open having a specific syscalls just
> > > > seems like a much saner idea.
> > > 
> > > It's not just mount API; these can be used independently of that.
> > > Think of the uses where you pass those to ...at() and you'll see
> > > a bunch of applications of that thing.
> > 
> > I kind of agree with Christoph on this point.  Yes, you can use the 
> > resultant
> > fd for other things, but that doesn't mean it has to be obtained initially
> > through open() or openat() rather than, say, a new pick_mount() syscall.
> > 
> > Further, having more parameters available gives us the opportunity to change
> > the settings on any mounts we create at the point of creation.
> 
> open_subtree(int dirfd, const char *pathname, int flags), then?  How would
> flags be interpreted?  What I see mapping at that thing is
>   * equivalent of O_PATH open
>   * clone subtree, O_PATH open root
>   * clone one mount, O_PATH open root
> and apparently you want to add (orthogonal to that)
>   * make shared/slave/private/unbindable
>   * ditto with recursion?
>   * same for nodev/nosuid/noexec/noatime/nodiratime/relatime/ro/?
> as well as usual AT_... flags (empty path, follow)
> 
> Choose the encoding...

_If_ I'm interpreting that correctly, that should be something like a bitmap
of attributes to modify + values to set for each.  Let's see -
propagation 1 + 2 bits
nodev   1 + 1
noexec  1 + 1
nosuid  1 + 1
ro  1 + 1
atime   1 + 3
That's 15 bits.  On top of that, we have 1 bit for "clone or original"
and 1 bit for "recursive or single-mount".  As well as AT_EMPTY_PATH,
and AT_NO_AUTOMOUNT (inconvenient, since these are fixed bits).  In
principle, that does fit into int, with some space to spare...

Is that what you have in mind?


Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Mike Galbraith
On Fri, 2018-06-01 at 13:03 -0700, Andy Lutomirski wrote:
> 
> Mike, you never did say: do you have PCID on your CPU?

Yes.

>   Also, what is
> your workload doing to cause so many switches back and forth between
> init_mm and a task.

pipe-test measures pipe round trip, does nearly nothing but schedule.  

-Mike


Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Mike Galbraith
On Fri, 2018-06-01 at 13:03 -0700, Andy Lutomirski wrote:
> 
> Mike, you never did say: do you have PCID on your CPU?

Yes.

>   Also, what is
> your workload doing to cause so many switches back and forth between
> init_mm and a task.

pipe-test measures pipe round trip, does nearly nothing but schedule.  

-Mike


Re: linux-next: manual merge of the nvdimm tree with the xfs tree

2018-06-01 Thread Dan Williams
On Fri, Jun 1, 2018 at 6:33 PM, Stephen Rothwell  wrote:
> Hi Darrick,
>
> On Fri, 1 Jun 2018 17:59:48 -0700 "Darrick J. Wong"  
> wrote:
>>
>> > +   if (!dax_enabled) {
>> >  -  pr_debug("VFS (%s): error: dax support not enabled\n",
>> >  -  sb->s_id);
>> >  +  pr_debug("%s: error: dax support not enabled\n",
>> >  +  bdevname(bdev, buf));
>> > -   return false;
>> > +   return -EOPNOTSUPP;
>>
>> Hang on a sec, the changes in the xfs tree make this function return a
>> boolean (true for dax-is-supported, false for dax-not-supported), but
>> this change partially reverts the boolean return values.
>
> OK, weird, that is what I though I had done.  Thanks for pointing it
> out (I guess it was getting late :-().
>
>> > }
>> > -
>> > -   return true;
>> > +   return 0;

Good catch Darrick, I missed that too...


Re: linux-next: manual merge of the nvdimm tree with the xfs tree

2018-06-01 Thread Dan Williams
On Fri, Jun 1, 2018 at 6:33 PM, Stephen Rothwell  wrote:
> Hi Darrick,
>
> On Fri, 1 Jun 2018 17:59:48 -0700 "Darrick J. Wong"  
> wrote:
>>
>> > +   if (!dax_enabled) {
>> >  -  pr_debug("VFS (%s): error: dax support not enabled\n",
>> >  -  sb->s_id);
>> >  +  pr_debug("%s: error: dax support not enabled\n",
>> >  +  bdevname(bdev, buf));
>> > -   return false;
>> > +   return -EOPNOTSUPP;
>>
>> Hang on a sec, the changes in the xfs tree make this function return a
>> boolean (true for dax-is-supported, false for dax-not-supported), but
>> this change partially reverts the boolean return values.
>
> OK, weird, that is what I though I had done.  Thanks for pointing it
> out (I guess it was getting late :-().
>
>> > }
>> > -
>> > -   return true;
>> > +   return 0;

Good catch Darrick, I missed that too...


Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Andy Lutomirski
On Fri, Jun 1, 2018 at 3:13 PM Rik van Riel  wrote:
>
> On Fri, 1 Jun 2018 14:21:58 -0700
> Andy Lutomirski  wrote:
>
> > Hmm.  I wonder if there's a more clever data structure than a bitmap
> > that we could be using here.  Each CPU only ever needs to be in one
> > mm's cpumask, and each cpu only ever changes its own state in the
> > bitmask.  And writes are much less common than reads for most
> > workloads.
>
> It would be easy enough to add an mm_struct pointer to the
> per-cpu tlbstate struct, and iterate over those.
>
> However, that would be an orthogonal change to optimizing
> lazy TLB mode.
>
> Does the (untested) patch below make sense as a potential
> improvement to the lazy TLB heuristic?
>
> ---8<---
> Subject: x86,tlb: workload dependent per CPU lazy TLB switch
>
> Lazy TLB mode is a tradeoff between flushing the TLB and touching
> the mm_cpumask(_mm) at context switch time, versus potentially
> incurring a remote TLB flush IPI while in lazy TLB mode.
>
> Whether this pays off is likely to be workload dependent more than
> anything else. However, the current heuristic keys off hardware type.
>
> This patch changes the lazy TLB mode heuristic to a dynamic, per-CPU
> decision, dependent on whether we recently received a remote TLB
> shootdown while in lazy TLB mode.
>
> This is a very simple heuristic. When a CPU receives a remote TLB
> shootdown IPI while in lazy TLB mode, a counter in the same cache
> line is set to 16. Every time we skip lazy TLB mode, the counter
> is decremented.
>
> While the counter is zero (no recent TLB flush IPIs), allow lazy TLB mode.

Hmm, cute.  That's not a bad idea at all.  It would be nice to get
some kind of real benchmark on both PCID and !PCID.  If nothing else,
I would expect the threshold (16 in your patch) to want to be lower on
PCID systems.

--Andy


Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Andy Lutomirski
On Fri, Jun 1, 2018 at 3:13 PM Rik van Riel  wrote:
>
> On Fri, 1 Jun 2018 14:21:58 -0700
> Andy Lutomirski  wrote:
>
> > Hmm.  I wonder if there's a more clever data structure than a bitmap
> > that we could be using here.  Each CPU only ever needs to be in one
> > mm's cpumask, and each cpu only ever changes its own state in the
> > bitmask.  And writes are much less common than reads for most
> > workloads.
>
> It would be easy enough to add an mm_struct pointer to the
> per-cpu tlbstate struct, and iterate over those.
>
> However, that would be an orthogonal change to optimizing
> lazy TLB mode.
>
> Does the (untested) patch below make sense as a potential
> improvement to the lazy TLB heuristic?
>
> ---8<---
> Subject: x86,tlb: workload dependent per CPU lazy TLB switch
>
> Lazy TLB mode is a tradeoff between flushing the TLB and touching
> the mm_cpumask(_mm) at context switch time, versus potentially
> incurring a remote TLB flush IPI while in lazy TLB mode.
>
> Whether this pays off is likely to be workload dependent more than
> anything else. However, the current heuristic keys off hardware type.
>
> This patch changes the lazy TLB mode heuristic to a dynamic, per-CPU
> decision, dependent on whether we recently received a remote TLB
> shootdown while in lazy TLB mode.
>
> This is a very simple heuristic. When a CPU receives a remote TLB
> shootdown IPI while in lazy TLB mode, a counter in the same cache
> line is set to 16. Every time we skip lazy TLB mode, the counter
> is decremented.
>
> While the counter is zero (no recent TLB flush IPIs), allow lazy TLB mode.

Hmm, cute.  That's not a bad idea at all.  It would be nice to get
some kind of real benchmark on both PCID and !PCID.  If nothing else,
I would expect the threshold (16 in your patch) to want to be lower on
PCID systems.

--Andy


Re: [PATCH 30/32] vfs: Allow cloning of a mount tree with open(O_PATH|O_CLONE_MOUNT) [ver #8]

2018-06-01 Thread Al Viro
On Fri, Jun 01, 2018 at 09:27:43AM +0100, David Howells wrote:
> Al Viro  wrote:
> 
> > > Instead of overloading this on open having a specific syscalls just
> > > seems like a much saner idea.
> > 
> > It's not just mount API; these can be used independently of that.
> > Think of the uses where you pass those to ...at() and you'll see
> > a bunch of applications of that thing.
> 
> I kind of agree with Christoph on this point.  Yes, you can use the resultant
> fd for other things, but that doesn't mean it has to be obtained initially
> through open() or openat() rather than, say, a new pick_mount() syscall.
> 
> Further, having more parameters available gives us the opportunity to change
> the settings on any mounts we create at the point of creation.

open_subtree(int dirfd, const char *pathname, int flags), then?  How would
flags be interpreted?  What I see mapping at that thing is
* equivalent of O_PATH open
* clone subtree, O_PATH open root
* clone one mount, O_PATH open root
and apparently you want to add (orthogonal to that)
* make shared/slave/private/unbindable
* ditto with recursion?
* same for nodev/nosuid/noexec/noatime/nodiratime/relatime/ro/?
as well as usual AT_... flags (empty path, follow)

Choose the encoding...


Re: [PATCH 30/32] vfs: Allow cloning of a mount tree with open(O_PATH|O_CLONE_MOUNT) [ver #8]

2018-06-01 Thread Al Viro
On Fri, Jun 01, 2018 at 09:27:43AM +0100, David Howells wrote:
> Al Viro  wrote:
> 
> > > Instead of overloading this on open having a specific syscalls just
> > > seems like a much saner idea.
> > 
> > It's not just mount API; these can be used independently of that.
> > Think of the uses where you pass those to ...at() and you'll see
> > a bunch of applications of that thing.
> 
> I kind of agree with Christoph on this point.  Yes, you can use the resultant
> fd for other things, but that doesn't mean it has to be obtained initially
> through open() or openat() rather than, say, a new pick_mount() syscall.
> 
> Further, having more parameters available gives us the opportunity to change
> the settings on any mounts we create at the point of creation.

open_subtree(int dirfd, const char *pathname, int flags), then?  How would
flags be interpreted?  What I see mapping at that thing is
* equivalent of O_PATH open
* clone subtree, O_PATH open root
* clone one mount, O_PATH open root
and apparently you want to add (orthogonal to that)
* make shared/slave/private/unbindable
* ditto with recursion?
* same for nodev/nosuid/noexec/noatime/nodiratime/relatime/ro/?
as well as usual AT_... flags (empty path, follow)

Choose the encoding...


Re: [PATCH 2/3] ARM: imx: add cpu idle support for i.MX6SLL

2018-06-01 Thread kbuild test robot
Hi Anson,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on shawnguo/for-next]
[also build test ERROR on v4.17-rc7 next-20180601]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Anson-Huang/ARM-imx-add-L2-page-power-control-for-GPC/20180602-080503
base:   https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git 
for-next
config: arm-arm67 (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm 

All errors (new ones prefixed by >>):

   arch/arm/mach-imx/mach-imx6sl.o: In function `imx6sl_init_late':
>> mach-imx6sl.c:(.init.text+0x2c): undefined reference to `imx6sx_cpuidle_init'

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 2/3] ARM: imx: add cpu idle support for i.MX6SLL

2018-06-01 Thread kbuild test robot
Hi Anson,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on shawnguo/for-next]
[also build test ERROR on v4.17-rc7 next-20180601]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Anson-Huang/ARM-imx-add-L2-page-power-control-for-GPC/20180602-080503
base:   https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git 
for-next
config: arm-arm67 (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm 

All errors (new ones prefixed by >>):

   arch/arm/mach-imx/mach-imx6sl.o: In function `imx6sl_init_late':
>> mach-imx6sl.c:(.init.text+0x2c): undefined reference to `imx6sx_cpuidle_init'

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[PATCH] power: Print wakeup_count instead of event_count in the sysfs attribute.

2018-06-01 Thread Ravi Chandra Sadineni
Currently we show event_count instead of wakeup_count as part of per
device wakeup_count sysfs attribute. Change it to wakeup_count to make
it more meaningful.

Signed-off-by: Ravi Chandra Sadineni 
---
 drivers/base/power/sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
index 0f651efc58a1a..d713738ce7967 100644
--- a/drivers/base/power/sysfs.c
+++ b/drivers/base/power/sysfs.c
@@ -353,7 +353,7 @@ static ssize_t wakeup_count_show(struct device *dev,
 
spin_lock_irq(>power.lock);
if (dev->power.wakeup) {
-   count = dev->power.wakeup->event_count;
+   count = dev->power.wakeup->wakeup_count;
enabled = true;
}
spin_unlock_irq(>power.lock);
-- 
2.17.1.1185.g55be947832-goog



[PATCH] power: Print wakeup_count instead of event_count in the sysfs attribute.

2018-06-01 Thread Ravi Chandra Sadineni
Currently we show event_count instead of wakeup_count as part of per
device wakeup_count sysfs attribute. Change it to wakeup_count to make
it more meaningful.

Signed-off-by: Ravi Chandra Sadineni 
---
 drivers/base/power/sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
index 0f651efc58a1a..d713738ce7967 100644
--- a/drivers/base/power/sysfs.c
+++ b/drivers/base/power/sysfs.c
@@ -353,7 +353,7 @@ static ssize_t wakeup_count_show(struct device *dev,
 
spin_lock_irq(>power.lock);
if (dev->power.wakeup) {
-   count = dev->power.wakeup->event_count;
+   count = dev->power.wakeup->wakeup_count;
enabled = true;
}
spin_unlock_irq(>power.lock);
-- 
2.17.1.1185.g55be947832-goog



Re: [PATCH V2] platform/x86: intel_pmc_core: Add CNP SLPS0 debug registers

2018-06-01 Thread David E. Box
On Wed, 2018-05-30 at 17:03 +0530, Rajneesh Bhardwaj wrote:
> On Wed, May 30, 2018 at 03:53:12AM -0700, David E. Box wrote:
> 
> Hi Dave,
> 
> > Hi Rajneesh,
> > 
> > On Mon, 2018-05-28 at 12:30 +0530, Rajneesh Bhardwaj wrote:
> > > On Thu, May 24, 2018 at 06:10:56PM -0700, David E. Box wrote:
> > > 
> > > Thanks for sending this, Dave. Few comments below.
> > > 
> > > > Adds debugfs access to registers in the Cannon Point PCH PMC
> > > > that
> > > > are
> > > 
> > > Please use Cannonlake PCH.
> > > 
> > > > useful for debugging #SLP_S0 signal assertion and other low
> > > > power
> > > > related
> > > 
> > > assertion failures and other low power debug.
> > > 
> > > > activities. Device pm states are latched in these registers
> > > > whenever the
> > > > package enters C10 and can be read from slp_s0_debug_status.
> > > > The pm
> > > > states may also be latched by writing 1 to slp_s0_dbg_latch
> > > > which
> > > > will
> > > > immediately capture the current state on the next read of
> > > > slp_s0_debug_status. Also while in intel_pmc_core.h clean up
> > > > spacing.
> > > > 
> > > 
> > > Initially, unless the latch bit is not set the debugfs file does
> > > not
> > > show
> > > any information as expected but once you write Y to latch file,
> > > the
> > > debugfs
> > > file continues to show blockers even though you write N back
> > > there or
> > > unload
> > > - reload the driver. Please fix this.
> > 
> > See comments below
> > 
> > > 
> > > > Signed-off-by: David E. Box 
> > > > ---
> > > > 
> > > > V2:
> > > > Per Andy Shevchenko:
> > > > - Clear latch bit after use
> > > > - Pass pmc_dev as parameter
> > > > - Use DEFINE_SHOW_ATTRIBUTE macro
> > > > 
> > > >  drivers/platform/x86/intel_pmc_core.c | 112
> > > > ++
> > > >  drivers/platform/x86/intel_pmc_core.h |  27 +---
> > > >  2 files changed, 132 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/drivers/platform/x86/intel_pmc_core.c
> > > > b/drivers/platform/x86/intel_pmc_core.c
> > > > index 43bbe74..ed40999 100644
> > > > --- a/drivers/platform/x86/intel_pmc_core.c
> > > > +++ b/drivers/platform/x86/intel_pmc_core.c
> > > > @@ -196,9 +196,64 @@ static const struct pmc_bit_map
> > > > cnp_pfear_map[] = {
> > > > {}
> > > >  };
> > > >  
> > > > +static const struct pmc_bit_map cnp_slps0_dbg0_map[] = {
> > > > +   {"AUDIO_D3",BIT(0)},
> > > > +   {"OTG_D3",  BIT(1)},
> > > > +   {"XHCI_D3", BIT(2)},
> > > > +   {"LPIO_D3", BIT(3)},
> > > > +   {"SDX_D3",  BIT(4)},
> > > > +   {"SATA_D3", BIT(5)},
> > > > +   {"UFS0_D3", BIT(6)},
> > > > +   {"UFS1_D3", BIT(7)},
> > > > +   {"EMMC_D3", BIT(8)},
> > > > +};
> > > > +
> > > > +static const struct pmc_bit_map cnp_slps0_dbg1_map[] = {
> > > > +   {"SDIO_PLL_OFF",BIT(0)},
> > > > +   {"USB2_PLL_OFF",BIT(1)},
> > > > +   {"AUDIO_PLL_OFF",   BIT(2)},
> > > > +   {"OC_PLL_OFF",  BIT(3)},
> > > 
> > > Please rename to ISCLK_OC as it is the preferred nomenclature for
> > > debug.
> > 
> > Okay
> > 
> > > 
> > > > +   {"MAIN_PLL_OFF",BIT(4)},
> > > 
> > > Ditto, please use ISCLK_MAIN.
> > > 
> > > > +   {"XOSC_OFF",BIT(5)},
> > > > +   {"LPC_CLKS_GATED",  BIT(6)},
> > > > +   {"PCIE_CLKREQS_IDLE",   BIT(7)},
> > > > +   {"AUDIO_ROSC_OFF",  BIT(8)},
> > > > +   {"HPET_XOSC_CLK_REQ",   BIT(9)},
> > > > +   {"PMC_ROSC_SLOW_CLK",   BIT(10)},
> > > > +   {"AON2_ROSC_GATED", BIT(11)},
> > > > +   {"CLKACKS_DEASSERTED",  BIT(12)},
> > > > +};
> > > > +
> > > > +static const struct pmc_bit_map cnp_slps0_dbg2_map[] = {
> > > > +   {"MPHY_CORE_GATED", BIT(0)},
> > > > +   {"CSME_GATED",  BIT(1)},
> > > > +   {"USB2_SUS_GATED",  BIT(2)},
> > > > +   {"DYN_FLEX_IO_IDLE",BIT(3)},
> > > > +   {"GBE_NO_LINK", BIT(4)},
> > > > +   {"THERM_SEN_DISABLED",  BIT(5)},
> > > > +   {"PCIE_LOW_POWER",  BIT(6)},
> > > > +   {"ISH_VNNAON_REQ_ACT",  BIT(7)},
> > > > +   {"ISH_VNN_REQ_ACT", BIT(8)},
> > > > +   {"CNV_VNNAON_REQ_ACT",  BIT(9)},
> > > > +   {"CNV_VNN_REQ_ACT", BIT(10)},
> > > > +   {"NPK_VNNON_REQ_ACT",   BIT(11)},
> > > > +   {"PMSYNC_STATE_IDLE",   BIT(12)},
> > > > +   {"ALST_GT_THRES",   BIT(13)},
> > > > +   {"PMC_ARC_PG_READY",BIT(14)},
> > > > +};
> > > > +
> > > > +static const struct slps0_dbg_map cnp_slps0_dbg_maps[] = {
> > > > +   {cnp_slps0_dbg0_map, ARRAY_SIZE(cnp_slps0_dbg0_map)},
> > > > +   {cnp_slps0_dbg1_map, ARRAY_SIZE(cnp_slps0_dbg1_map)},
> > > > +   {cnp_slps0_dbg2_map, ARRAY_SIZE(cnp_slps0_dbg2_map)},
> > > > +};
> > > > +
> > > >  static const struct pmc_reg_map cnp_reg_map = {
> > > > .pfear_sts = cnp_pfear_map,
> > > >  

Re: [PATCH V2] platform/x86: intel_pmc_core: Add CNP SLPS0 debug registers

2018-06-01 Thread David E. Box
On Wed, 2018-05-30 at 17:03 +0530, Rajneesh Bhardwaj wrote:
> On Wed, May 30, 2018 at 03:53:12AM -0700, David E. Box wrote:
> 
> Hi Dave,
> 
> > Hi Rajneesh,
> > 
> > On Mon, 2018-05-28 at 12:30 +0530, Rajneesh Bhardwaj wrote:
> > > On Thu, May 24, 2018 at 06:10:56PM -0700, David E. Box wrote:
> > > 
> > > Thanks for sending this, Dave. Few comments below.
> > > 
> > > > Adds debugfs access to registers in the Cannon Point PCH PMC
> > > > that
> > > > are
> > > 
> > > Please use Cannonlake PCH.
> > > 
> > > > useful for debugging #SLP_S0 signal assertion and other low
> > > > power
> > > > related
> > > 
> > > assertion failures and other low power debug.
> > > 
> > > > activities. Device pm states are latched in these registers
> > > > whenever the
> > > > package enters C10 and can be read from slp_s0_debug_status.
> > > > The pm
> > > > states may also be latched by writing 1 to slp_s0_dbg_latch
> > > > which
> > > > will
> > > > immediately capture the current state on the next read of
> > > > slp_s0_debug_status. Also while in intel_pmc_core.h clean up
> > > > spacing.
> > > > 
> > > 
> > > Initially, unless the latch bit is not set the debugfs file does
> > > not
> > > show
> > > any information as expected but once you write Y to latch file,
> > > the
> > > debugfs
> > > file continues to show blockers even though you write N back
> > > there or
> > > unload
> > > - reload the driver. Please fix this.
> > 
> > See comments below
> > 
> > > 
> > > > Signed-off-by: David E. Box 
> > > > ---
> > > > 
> > > > V2:
> > > > Per Andy Shevchenko:
> > > > - Clear latch bit after use
> > > > - Pass pmc_dev as parameter
> > > > - Use DEFINE_SHOW_ATTRIBUTE macro
> > > > 
> > > >  drivers/platform/x86/intel_pmc_core.c | 112
> > > > ++
> > > >  drivers/platform/x86/intel_pmc_core.h |  27 +---
> > > >  2 files changed, 132 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/drivers/platform/x86/intel_pmc_core.c
> > > > b/drivers/platform/x86/intel_pmc_core.c
> > > > index 43bbe74..ed40999 100644
> > > > --- a/drivers/platform/x86/intel_pmc_core.c
> > > > +++ b/drivers/platform/x86/intel_pmc_core.c
> > > > @@ -196,9 +196,64 @@ static const struct pmc_bit_map
> > > > cnp_pfear_map[] = {
> > > > {}
> > > >  };
> > > >  
> > > > +static const struct pmc_bit_map cnp_slps0_dbg0_map[] = {
> > > > +   {"AUDIO_D3",BIT(0)},
> > > > +   {"OTG_D3",  BIT(1)},
> > > > +   {"XHCI_D3", BIT(2)},
> > > > +   {"LPIO_D3", BIT(3)},
> > > > +   {"SDX_D3",  BIT(4)},
> > > > +   {"SATA_D3", BIT(5)},
> > > > +   {"UFS0_D3", BIT(6)},
> > > > +   {"UFS1_D3", BIT(7)},
> > > > +   {"EMMC_D3", BIT(8)},
> > > > +};
> > > > +
> > > > +static const struct pmc_bit_map cnp_slps0_dbg1_map[] = {
> > > > +   {"SDIO_PLL_OFF",BIT(0)},
> > > > +   {"USB2_PLL_OFF",BIT(1)},
> > > > +   {"AUDIO_PLL_OFF",   BIT(2)},
> > > > +   {"OC_PLL_OFF",  BIT(3)},
> > > 
> > > Please rename to ISCLK_OC as it is the preferred nomenclature for
> > > debug.
> > 
> > Okay
> > 
> > > 
> > > > +   {"MAIN_PLL_OFF",BIT(4)},
> > > 
> > > Ditto, please use ISCLK_MAIN.
> > > 
> > > > +   {"XOSC_OFF",BIT(5)},
> > > > +   {"LPC_CLKS_GATED",  BIT(6)},
> > > > +   {"PCIE_CLKREQS_IDLE",   BIT(7)},
> > > > +   {"AUDIO_ROSC_OFF",  BIT(8)},
> > > > +   {"HPET_XOSC_CLK_REQ",   BIT(9)},
> > > > +   {"PMC_ROSC_SLOW_CLK",   BIT(10)},
> > > > +   {"AON2_ROSC_GATED", BIT(11)},
> > > > +   {"CLKACKS_DEASSERTED",  BIT(12)},
> > > > +};
> > > > +
> > > > +static const struct pmc_bit_map cnp_slps0_dbg2_map[] = {
> > > > +   {"MPHY_CORE_GATED", BIT(0)},
> > > > +   {"CSME_GATED",  BIT(1)},
> > > > +   {"USB2_SUS_GATED",  BIT(2)},
> > > > +   {"DYN_FLEX_IO_IDLE",BIT(3)},
> > > > +   {"GBE_NO_LINK", BIT(4)},
> > > > +   {"THERM_SEN_DISABLED",  BIT(5)},
> > > > +   {"PCIE_LOW_POWER",  BIT(6)},
> > > > +   {"ISH_VNNAON_REQ_ACT",  BIT(7)},
> > > > +   {"ISH_VNN_REQ_ACT", BIT(8)},
> > > > +   {"CNV_VNNAON_REQ_ACT",  BIT(9)},
> > > > +   {"CNV_VNN_REQ_ACT", BIT(10)},
> > > > +   {"NPK_VNNON_REQ_ACT",   BIT(11)},
> > > > +   {"PMSYNC_STATE_IDLE",   BIT(12)},
> > > > +   {"ALST_GT_THRES",   BIT(13)},
> > > > +   {"PMC_ARC_PG_READY",BIT(14)},
> > > > +};
> > > > +
> > > > +static const struct slps0_dbg_map cnp_slps0_dbg_maps[] = {
> > > > +   {cnp_slps0_dbg0_map, ARRAY_SIZE(cnp_slps0_dbg0_map)},
> > > > +   {cnp_slps0_dbg1_map, ARRAY_SIZE(cnp_slps0_dbg1_map)},
> > > > +   {cnp_slps0_dbg2_map, ARRAY_SIZE(cnp_slps0_dbg2_map)},
> > > > +};
> > > > +
> > > >  static const struct pmc_reg_map cnp_reg_map = {
> > > > .pfear_sts = cnp_pfear_map,
> > > >  

[PATCH v5 2/2] regulator: add QCOM RPMh regulator driver

2018-06-01 Thread David Collins
Add the QCOM RPMh regulator driver to manage PMIC regulators
which are controlled via RPMh on some Qualcomm Technologies, Inc.
SoCs.  RPMh is a hardware block which contains several
accelerators which are used to manage various hardware resources
that are shared between the processors of the SoC.  The final
hardware state of a regulator is determined within RPMh by
performing max aggregation of the requests made by all of the
processors.

Add support for PMIC regulator control via the voltage regulator
manager (VRM) and oscillator buffer (XOB) RPMh accelerators.
VRM supports manipulation of enable state, voltage, and mode.
XOB supports manipulation of enable state.

Signed-off-by: David Collins 
---
 drivers/regulator/Kconfig   |   9 +
 drivers/regulator/Makefile  |   1 +
 drivers/regulator/qcom-rpmh-regulator.c | 770 
 3 files changed, 780 insertions(+)
 create mode 100644 drivers/regulator/qcom-rpmh-regulator.c

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 5dbccf5..96b701f 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -682,6 +682,15 @@ config REGULATOR_QCOM_RPM
  Qualcomm RPM as a module. The module will be named
  "qcom_rpm-regulator".
 
+config REGULATOR_QCOM_RPMH
+   tristate "Qualcomm Technologies, Inc. RPMh regulator driver"
+   depends on QCOM_RPMH || COMPILE_TEST
+   help
+ This driver supports control of PMIC regulators via the RPMh hardware
+ block found on Qualcomm Technologies Inc. SoCs.  RPMh regulator
+ control allows for voting on regulator state between multiple
+ processors within the SoC.
+
 config REGULATOR_QCOM_SMD_RPM
tristate "Qualcomm SMD based RPM regulator driver"
depends on QCOM_SMD_RPM
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index bd818ce..06e76a6 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -78,6 +78,7 @@ obj-$(CONFIG_REGULATOR_MT6323)+= mt6323-regulator.o
 obj-$(CONFIG_REGULATOR_MT6380) += mt6380-regulator.o
 obj-$(CONFIG_REGULATOR_MT6397) += mt6397-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_RPM) += qcom_rpm-regulator.o
+obj-$(CONFIG_REGULATOR_QCOM_RPMH) += qcom-rpmh-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_SMD_RPM) += qcom_smd-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_SPMI) += qcom_spmi-regulator.o
 obj-$(CONFIG_REGULATOR_PALMAS) += palmas-regulator.o
diff --git a/drivers/regulator/qcom-rpmh-regulator.c 
b/drivers/regulator/qcom-rpmh-regulator.c
new file mode 100644
index 000..b2af35a
--- /dev/null
+++ b/drivers/regulator/qcom-rpmh-regulator.c
@@ -0,0 +1,770 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved. */
+
+#define pr_fmt(fmt) "%s: " fmt, __func__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+
+/**
+ * enum rpmh_regulator_type - supported RPMh accelerator types
+ * %VRM:   RPMh VRM accelerator which supports voting on enable, voltage,
+ * and mode of LDO, SMPS, and BOB type PMIC regulators.
+ * %XOB:   RPMh XOB accelerator which supports voting on the enable state
+ * of PMIC regulators.
+ */
+enum rpmh_regulator_type {
+   VRM,
+   XOB,
+};
+
+#define RPMH_VRM_HEADROOM_MAX_UV   511000
+
+#define RPMH_REGULATOR_REG_VRM_VOLTAGE 0x0
+#define RPMH_REGULATOR_REG_ENABLE  0x4
+#define RPMH_REGULATOR_REG_VRM_MODE0x8
+#define RPMH_REGULATOR_REG_VRM_HEADROOM0xC
+
+#define RPMH_REGULATOR_MODE_COUNT  4
+
+#define PMIC4_LDO_MODE_RETENTION   4
+#define PMIC4_LDO_MODE_LPM 5
+#define PMIC4_LDO_MODE_HPM 7
+
+#define PMIC4_SMPS_MODE_RETENTION  4
+#define PMIC4_SMPS_MODE_PFM5
+#define PMIC4_SMPS_MODE_AUTO   6
+#define PMIC4_SMPS_MODE_PWM7
+
+#define PMIC4_BOB_MODE_PASS0
+#define PMIC4_BOB_MODE_PFM 1
+#define PMIC4_BOB_MODE_AUTO2
+#define PMIC4_BOB_MODE_PWM 3
+
+/**
+ * struct rpmh_vreg_hw_data - RPMh regulator hardware configurations
+ * @regulator_type:RPMh accelerator type used to manage this
+ * regulator
+ * @ops:   Pointer to regulator ops callback structure
+ * @voltage_range: The single range of voltages supported by this
+ * PMIC regulator type
+ * @n_voltages:The number of unique voltage set points 
defined
+ * by voltage_range
+ * @hpm_min_load_uA:   Minimum load current in microamps that requires
+ * high power mode (HPM) operation.  This is used
+ *

[PATCH v5 2/2] regulator: add QCOM RPMh regulator driver

2018-06-01 Thread David Collins
Add the QCOM RPMh regulator driver to manage PMIC regulators
which are controlled via RPMh on some Qualcomm Technologies, Inc.
SoCs.  RPMh is a hardware block which contains several
accelerators which are used to manage various hardware resources
that are shared between the processors of the SoC.  The final
hardware state of a regulator is determined within RPMh by
performing max aggregation of the requests made by all of the
processors.

Add support for PMIC regulator control via the voltage regulator
manager (VRM) and oscillator buffer (XOB) RPMh accelerators.
VRM supports manipulation of enable state, voltage, and mode.
XOB supports manipulation of enable state.

Signed-off-by: David Collins 
---
 drivers/regulator/Kconfig   |   9 +
 drivers/regulator/Makefile  |   1 +
 drivers/regulator/qcom-rpmh-regulator.c | 770 
 3 files changed, 780 insertions(+)
 create mode 100644 drivers/regulator/qcom-rpmh-regulator.c

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 5dbccf5..96b701f 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -682,6 +682,15 @@ config REGULATOR_QCOM_RPM
  Qualcomm RPM as a module. The module will be named
  "qcom_rpm-regulator".
 
+config REGULATOR_QCOM_RPMH
+   tristate "Qualcomm Technologies, Inc. RPMh regulator driver"
+   depends on QCOM_RPMH || COMPILE_TEST
+   help
+ This driver supports control of PMIC regulators via the RPMh hardware
+ block found on Qualcomm Technologies Inc. SoCs.  RPMh regulator
+ control allows for voting on regulator state between multiple
+ processors within the SoC.
+
 config REGULATOR_QCOM_SMD_RPM
tristate "Qualcomm SMD based RPM regulator driver"
depends on QCOM_SMD_RPM
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index bd818ce..06e76a6 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -78,6 +78,7 @@ obj-$(CONFIG_REGULATOR_MT6323)+= mt6323-regulator.o
 obj-$(CONFIG_REGULATOR_MT6380) += mt6380-regulator.o
 obj-$(CONFIG_REGULATOR_MT6397) += mt6397-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_RPM) += qcom_rpm-regulator.o
+obj-$(CONFIG_REGULATOR_QCOM_RPMH) += qcom-rpmh-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_SMD_RPM) += qcom_smd-regulator.o
 obj-$(CONFIG_REGULATOR_QCOM_SPMI) += qcom_spmi-regulator.o
 obj-$(CONFIG_REGULATOR_PALMAS) += palmas-regulator.o
diff --git a/drivers/regulator/qcom-rpmh-regulator.c 
b/drivers/regulator/qcom-rpmh-regulator.c
new file mode 100644
index 000..b2af35a
--- /dev/null
+++ b/drivers/regulator/qcom-rpmh-regulator.c
@@ -0,0 +1,770 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved. */
+
+#define pr_fmt(fmt) "%s: " fmt, __func__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+
+/**
+ * enum rpmh_regulator_type - supported RPMh accelerator types
+ * %VRM:   RPMh VRM accelerator which supports voting on enable, voltage,
+ * and mode of LDO, SMPS, and BOB type PMIC regulators.
+ * %XOB:   RPMh XOB accelerator which supports voting on the enable state
+ * of PMIC regulators.
+ */
+enum rpmh_regulator_type {
+   VRM,
+   XOB,
+};
+
+#define RPMH_VRM_HEADROOM_MAX_UV   511000
+
+#define RPMH_REGULATOR_REG_VRM_VOLTAGE 0x0
+#define RPMH_REGULATOR_REG_ENABLE  0x4
+#define RPMH_REGULATOR_REG_VRM_MODE0x8
+#define RPMH_REGULATOR_REG_VRM_HEADROOM0xC
+
+#define RPMH_REGULATOR_MODE_COUNT  4
+
+#define PMIC4_LDO_MODE_RETENTION   4
+#define PMIC4_LDO_MODE_LPM 5
+#define PMIC4_LDO_MODE_HPM 7
+
+#define PMIC4_SMPS_MODE_RETENTION  4
+#define PMIC4_SMPS_MODE_PFM5
+#define PMIC4_SMPS_MODE_AUTO   6
+#define PMIC4_SMPS_MODE_PWM7
+
+#define PMIC4_BOB_MODE_PASS0
+#define PMIC4_BOB_MODE_PFM 1
+#define PMIC4_BOB_MODE_AUTO2
+#define PMIC4_BOB_MODE_PWM 3
+
+/**
+ * struct rpmh_vreg_hw_data - RPMh regulator hardware configurations
+ * @regulator_type:RPMh accelerator type used to manage this
+ * regulator
+ * @ops:   Pointer to regulator ops callback structure
+ * @voltage_range: The single range of voltages supported by this
+ * PMIC regulator type
+ * @n_voltages:The number of unique voltage set points 
defined
+ * by voltage_range
+ * @hpm_min_load_uA:   Minimum load current in microamps that requires
+ * high power mode (HPM) operation.  This is used
+ *

[PATCH v5 0/2] regulator: add QCOM RPMh regulator driver

2018-06-01 Thread David Collins
This patch series adds a driver and device tree binding documentation for
PMIC regulator control via Resource Power Manager-hardened (RPMh) on some
Qualcomm Technologies, Inc. SoCs such as SDM845.  RPMh is a hardware block
which contains several accelerators which are used to manage various
hardware resources that are shared between the processors of the SoC.  The
final hardware state of a regulator is determined within RPMh by performing
max aggregation of the requests made by all of the processors.

The RPMh regulator driver depends upon the RPMh driver [1] and command DB
driver [2] which are both still undergoing review.  It also depends upon
three recent regulator changes: [3], [4], and [5].

Changes since v4 [6]:
 - Removed support for DT properties qcom,regulator-drms-modes and
   qcom,drms-mode-max-microamps
 - Specified fixed DRMS high power mode minimum limits for LDO type
   regulators
 - Removed DRMS support for SMPS and BOB type regulators
 - Simplified voltage caching logic

Changes since v3 [7]:
 - Removed support for DT properties qcom,regulator-initial-microvolt
   and qcom,headroom-microvolt
 - Renamed DT property qcom,allowed-drms-modes to be
   qcom,regulator-drms-modes
 - Updated DT binding documentation to mention which common regulator
   bindings can be used for qcom-rpmh-regulator devices
 - Added voltage caching so that voltage requests are only sent to RPMh
   after the regulator has been enabled at least once
 - Changed 'voltage_selector' default value to be -ENOTRECOVERABLE to
   interact with [5]
 - Initialized 'enabled' to -EINVAL so that unused regulators are
   disabled by regulator_late_cleanup()
 - Removed rpmh_regulator_load_default_parameters() as it is no longer
   needed
 - Updated the mode usage description in qcom,rpmh-regulator.h

Changes since v2 [8]:
 - Replaced '_' with '-' in device tree supply property names
 - Renamed qcom_rpmh-regulator.c to be qcom-rpmh-regulator.c
 - Updated various DT property names to use "microvolt" and "microamp"
 - Moved allowed modes constraint specification out of the driver [4]
 - Replaced rpmh_client with device pointer to match new RPMh API [1]
 - Corrected drms mode threshold checking
 - Initialized voltage_selector to -EINVAL when not specified in DT
 - Added constants for PMIC regulator hardware modes
 - Corrected type sign of mode mapping tables
 - Made variable names for mode arrays plural
 - Simplified Kconfig depends on
 - Removed unnecessary constants and struct fields
 - Added some descriptive comments

Changes since v1 [9]:
 - Addressed review feedback from Doug, Mark, and Stephen
 - Replaced set_voltage()/get_voltage() callbacks with set_voltage_sel()/
   get_voltage_sel()
 - Added set_bypass()/get_bypass() callbacks for BOB pass-through mode
   control
 - Removed top-level PMIC data structures
 - Removed initialization variables from structs and passed them as
   function parameters
 - Removed various comments and error messages
 - Simplified mode handling
 - Refactored per-PMIC rpmh-regulator data specification
 - Simplified probe function
 - Moved header into DT patch
 - Removed redundant property listings from DT binding documentation

[1]: https://lkml.org/lkml/2018/5/9/729
[2]: https://lkml.org/lkml/2018/4/10/714
[3]: https://lkml.org/lkml/2018/4/18/556
[4]: https://lkml.org/lkml/2018/5/11/696
[5]: https://lkml.org/lkml/2018/5/15/1005
[6]: https://lkml.org/lkml/2018/5/22/1168
[7]: https://lkml.org/lkml/2018/5/11/701
[8]: https://lkml.org/lkml/2018/4/13/687
[9]: https://lkml.org/lkml/2018/3/16/1431

David Collins (2):
  regulator: dt-bindings: add QCOM RPMh regulator bindings
  regulator: add QCOM RPMh regulator driver

 .../bindings/regulator/qcom,rpmh-regulator.txt | 160 +
 drivers/regulator/Kconfig  |   9 +
 drivers/regulator/Makefile |   1 +
 drivers/regulator/qcom-rpmh-regulator.c| 770 +
 .../dt-bindings/regulator/qcom,rpmh-regulator.h|  36 +
 5 files changed, 976 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
 create mode 100644 drivers/regulator/qcom-rpmh-regulator.c
 create mode 100644 include/dt-bindings/regulator/qcom,rpmh-regulator.h

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v5 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings

2018-06-01 Thread David Collins
Introduce bindings for RPMh regulator devices found on some
Qualcomm Technlogies, Inc. SoCs.  These devices allow a given
processor within the SoC to make PMIC regulator requests which
are aggregated within the RPMh hardware block along with requests
from other processors in the SoC to determine the final PMIC
regulator hardware state.

Signed-off-by: David Collins 
---
 .../bindings/regulator/qcom,rpmh-regulator.txt | 160 +
 .../dt-bindings/regulator/qcom,rpmh-regulator.h|  36 +
 2 files changed, 196 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
 create mode 100644 include/dt-bindings/regulator/qcom,rpmh-regulator.h

diff --git 
a/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt 
b/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
new file mode 100644
index 000..7ef2dbe
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
@@ -0,0 +1,160 @@
+Qualcomm Technologies, Inc. RPMh Regulators
+
+rpmh-regulator devices support PMIC regulator management via the Voltage
+Regulator Manager (VRM) and Oscillator Buffer (XOB) RPMh accelerators.  The 
APPS
+processor communicates with these hardware blocks via a Resource State
+Coordinator (RSC) using command packets.  The VRM allows changing three
+parameters for a given regulator: enable state, output voltage, and operating
+mode.  The XOB allows changing only a single parameter for a given regulator:
+its enable state.  Despite its name, the XOB is capable of controlling the
+enable state of any PMIC peripheral.  It is used for clock buffers, low-voltage
+switches, and LDO/SMPS regulators which have a fixed voltage and mode.
+
+===
+Required Node Structure
+===
+
+RPMh regulators must be described in two levels of device nodes.  The first
+level describes the PMIC containing the regulators and must reside within an
+RPMh device node.  The second level describes each regulator within the PMIC
+which is to be used on the board.  Each of these regulators maps to a single
+RPMh resource.
+
+The names used for regulator nodes must match those supported by a given PMIC.
+Supported regulator node names:
+   PM8998: smps1 - smps13, ldo1 - ldo28, lvs1 - lvs2
+   PMI8998:bob
+   PM8005: smps1 - smps4
+
+
+First Level Nodes - PMIC
+
+
+- compatible
+   Usage:  required
+   Value type: 
+   Definition: Must be one of: "qcom,pm8998-rpmh-regulators",
+   "qcom,pmi8998-rpmh-regulators" or
+   "qcom,pm8005-rpmh-regulators".
+
+- qcom,pmic-id
+   Usage:  required
+   Value type: 
+   Definition: RPMh resource name suffix used for the regulators found on
+   this PMIC.  Typical values: "a", "b", "c", "d", "e", "f".
+
+- vdd-s1-supply
+- vdd-s2-supply
+- vdd-s3-supply
+- vdd-s4-supply
+   Usage:  optional (PM8998 and PM8005 only)
+   Value type: 
+   Definition: phandle of the parent supply regulator of one or more of the
+   regulators for this PMIC.
+
+- vdd-s5-supply
+- vdd-s6-supply
+- vdd-s7-supply
+- vdd-s8-supply
+- vdd-s9-supply
+- vdd-s10-supply
+- vdd-s11-supply
+- vdd-s12-supply
+- vdd-s13-supply
+- vdd-l1-l27-supply
+- vdd-l2-l8-l17-supply
+- vdd-l3-l11-supply
+- vdd-l4-l5-supply
+- vdd-l6-supply
+- vdd-l7-l12-l14-l15-supply
+- vdd-l9-supply
+- vdd-l10-l23-l25-supply
+- vdd-l13-l19-l21-supply
+- vdd-l16-l28-supply
+- vdd-l18-l22-supply
+- vdd-l20-l24-supply
+- vdd-l26-supply
+- vin-lvs-1-2-supply
+   Usage:  optional (PM8998 only)
+   Value type: 
+   Definition: phandle of the parent supply regulator of one or more of the
+   regulators for this PMIC.
+
+- vdd-bob-supply
+   Usage:  optional (PMI8998 only)
+   Value type: 
+   Definition: BOB regulator parent supply phandle
+
+===
+Second Level Nodes - Regulators
+===
+
+- qcom,always-wait-for-ack
+   Usage:  optional
+   Value type: 
+   Definition: Boolean flag which indicates that the application processor
+   must wait for an ACK or a NACK from RPMh for every request
+   sent for this regulator including those which are for a
+   strictly lower power state.
+
+Other properties defined in Documentation/devicetree/bindings/regulator.txt
+may also be used.  regulator-initial-mode and regulator-allowed-modes may be
+specified for VRM regulators using mode values from
+include/dt-bindings/regulator/qcom,rpmh-regulator.h.  regulator-allow-bypass
+may be specified for BOB type regulators managed via VRM.
+regulator-allow-set-load may be specified for LDO type regulators managed via
+VRM.
+
+
+Examples
+
+
+#include 
+
+_rsc {
+   

[PATCH v5 0/2] regulator: add QCOM RPMh regulator driver

2018-06-01 Thread David Collins
This patch series adds a driver and device tree binding documentation for
PMIC regulator control via Resource Power Manager-hardened (RPMh) on some
Qualcomm Technologies, Inc. SoCs such as SDM845.  RPMh is a hardware block
which contains several accelerators which are used to manage various
hardware resources that are shared between the processors of the SoC.  The
final hardware state of a regulator is determined within RPMh by performing
max aggregation of the requests made by all of the processors.

The RPMh regulator driver depends upon the RPMh driver [1] and command DB
driver [2] which are both still undergoing review.  It also depends upon
three recent regulator changes: [3], [4], and [5].

Changes since v4 [6]:
 - Removed support for DT properties qcom,regulator-drms-modes and
   qcom,drms-mode-max-microamps
 - Specified fixed DRMS high power mode minimum limits for LDO type
   regulators
 - Removed DRMS support for SMPS and BOB type regulators
 - Simplified voltage caching logic

Changes since v3 [7]:
 - Removed support for DT properties qcom,regulator-initial-microvolt
   and qcom,headroom-microvolt
 - Renamed DT property qcom,allowed-drms-modes to be
   qcom,regulator-drms-modes
 - Updated DT binding documentation to mention which common regulator
   bindings can be used for qcom-rpmh-regulator devices
 - Added voltage caching so that voltage requests are only sent to RPMh
   after the regulator has been enabled at least once
 - Changed 'voltage_selector' default value to be -ENOTRECOVERABLE to
   interact with [5]
 - Initialized 'enabled' to -EINVAL so that unused regulators are
   disabled by regulator_late_cleanup()
 - Removed rpmh_regulator_load_default_parameters() as it is no longer
   needed
 - Updated the mode usage description in qcom,rpmh-regulator.h

Changes since v2 [8]:
 - Replaced '_' with '-' in device tree supply property names
 - Renamed qcom_rpmh-regulator.c to be qcom-rpmh-regulator.c
 - Updated various DT property names to use "microvolt" and "microamp"
 - Moved allowed modes constraint specification out of the driver [4]
 - Replaced rpmh_client with device pointer to match new RPMh API [1]
 - Corrected drms mode threshold checking
 - Initialized voltage_selector to -EINVAL when not specified in DT
 - Added constants for PMIC regulator hardware modes
 - Corrected type sign of mode mapping tables
 - Made variable names for mode arrays plural
 - Simplified Kconfig depends on
 - Removed unnecessary constants and struct fields
 - Added some descriptive comments

Changes since v1 [9]:
 - Addressed review feedback from Doug, Mark, and Stephen
 - Replaced set_voltage()/get_voltage() callbacks with set_voltage_sel()/
   get_voltage_sel()
 - Added set_bypass()/get_bypass() callbacks for BOB pass-through mode
   control
 - Removed top-level PMIC data structures
 - Removed initialization variables from structs and passed them as
   function parameters
 - Removed various comments and error messages
 - Simplified mode handling
 - Refactored per-PMIC rpmh-regulator data specification
 - Simplified probe function
 - Moved header into DT patch
 - Removed redundant property listings from DT binding documentation

[1]: https://lkml.org/lkml/2018/5/9/729
[2]: https://lkml.org/lkml/2018/4/10/714
[3]: https://lkml.org/lkml/2018/4/18/556
[4]: https://lkml.org/lkml/2018/5/11/696
[5]: https://lkml.org/lkml/2018/5/15/1005
[6]: https://lkml.org/lkml/2018/5/22/1168
[7]: https://lkml.org/lkml/2018/5/11/701
[8]: https://lkml.org/lkml/2018/4/13/687
[9]: https://lkml.org/lkml/2018/3/16/1431

David Collins (2):
  regulator: dt-bindings: add QCOM RPMh regulator bindings
  regulator: add QCOM RPMh regulator driver

 .../bindings/regulator/qcom,rpmh-regulator.txt | 160 +
 drivers/regulator/Kconfig  |   9 +
 drivers/regulator/Makefile |   1 +
 drivers/regulator/qcom-rpmh-regulator.c| 770 +
 .../dt-bindings/regulator/qcom,rpmh-regulator.h|  36 +
 5 files changed, 976 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
 create mode 100644 drivers/regulator/qcom-rpmh-regulator.c
 create mode 100644 include/dt-bindings/regulator/qcom,rpmh-regulator.h

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v5 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings

2018-06-01 Thread David Collins
Introduce bindings for RPMh regulator devices found on some
Qualcomm Technlogies, Inc. SoCs.  These devices allow a given
processor within the SoC to make PMIC regulator requests which
are aggregated within the RPMh hardware block along with requests
from other processors in the SoC to determine the final PMIC
regulator hardware state.

Signed-off-by: David Collins 
---
 .../bindings/regulator/qcom,rpmh-regulator.txt | 160 +
 .../dt-bindings/regulator/qcom,rpmh-regulator.h|  36 +
 2 files changed, 196 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
 create mode 100644 include/dt-bindings/regulator/qcom,rpmh-regulator.h

diff --git 
a/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt 
b/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
new file mode 100644
index 000..7ef2dbe
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
@@ -0,0 +1,160 @@
+Qualcomm Technologies, Inc. RPMh Regulators
+
+rpmh-regulator devices support PMIC regulator management via the Voltage
+Regulator Manager (VRM) and Oscillator Buffer (XOB) RPMh accelerators.  The 
APPS
+processor communicates with these hardware blocks via a Resource State
+Coordinator (RSC) using command packets.  The VRM allows changing three
+parameters for a given regulator: enable state, output voltage, and operating
+mode.  The XOB allows changing only a single parameter for a given regulator:
+its enable state.  Despite its name, the XOB is capable of controlling the
+enable state of any PMIC peripheral.  It is used for clock buffers, low-voltage
+switches, and LDO/SMPS regulators which have a fixed voltage and mode.
+
+===
+Required Node Structure
+===
+
+RPMh regulators must be described in two levels of device nodes.  The first
+level describes the PMIC containing the regulators and must reside within an
+RPMh device node.  The second level describes each regulator within the PMIC
+which is to be used on the board.  Each of these regulators maps to a single
+RPMh resource.
+
+The names used for regulator nodes must match those supported by a given PMIC.
+Supported regulator node names:
+   PM8998: smps1 - smps13, ldo1 - ldo28, lvs1 - lvs2
+   PMI8998:bob
+   PM8005: smps1 - smps4
+
+
+First Level Nodes - PMIC
+
+
+- compatible
+   Usage:  required
+   Value type: 
+   Definition: Must be one of: "qcom,pm8998-rpmh-regulators",
+   "qcom,pmi8998-rpmh-regulators" or
+   "qcom,pm8005-rpmh-regulators".
+
+- qcom,pmic-id
+   Usage:  required
+   Value type: 
+   Definition: RPMh resource name suffix used for the regulators found on
+   this PMIC.  Typical values: "a", "b", "c", "d", "e", "f".
+
+- vdd-s1-supply
+- vdd-s2-supply
+- vdd-s3-supply
+- vdd-s4-supply
+   Usage:  optional (PM8998 and PM8005 only)
+   Value type: 
+   Definition: phandle of the parent supply regulator of one or more of the
+   regulators for this PMIC.
+
+- vdd-s5-supply
+- vdd-s6-supply
+- vdd-s7-supply
+- vdd-s8-supply
+- vdd-s9-supply
+- vdd-s10-supply
+- vdd-s11-supply
+- vdd-s12-supply
+- vdd-s13-supply
+- vdd-l1-l27-supply
+- vdd-l2-l8-l17-supply
+- vdd-l3-l11-supply
+- vdd-l4-l5-supply
+- vdd-l6-supply
+- vdd-l7-l12-l14-l15-supply
+- vdd-l9-supply
+- vdd-l10-l23-l25-supply
+- vdd-l13-l19-l21-supply
+- vdd-l16-l28-supply
+- vdd-l18-l22-supply
+- vdd-l20-l24-supply
+- vdd-l26-supply
+- vin-lvs-1-2-supply
+   Usage:  optional (PM8998 only)
+   Value type: 
+   Definition: phandle of the parent supply regulator of one or more of the
+   regulators for this PMIC.
+
+- vdd-bob-supply
+   Usage:  optional (PMI8998 only)
+   Value type: 
+   Definition: BOB regulator parent supply phandle
+
+===
+Second Level Nodes - Regulators
+===
+
+- qcom,always-wait-for-ack
+   Usage:  optional
+   Value type: 
+   Definition: Boolean flag which indicates that the application processor
+   must wait for an ACK or a NACK from RPMh for every request
+   sent for this regulator including those which are for a
+   strictly lower power state.
+
+Other properties defined in Documentation/devicetree/bindings/regulator.txt
+may also be used.  regulator-initial-mode and regulator-allowed-modes may be
+specified for VRM regulators using mode values from
+include/dt-bindings/regulator/qcom,rpmh-regulator.h.  regulator-allow-bypass
+may be specified for BOB type regulators managed via VRM.
+regulator-allow-set-load may be specified for LDO type regulators managed via
+VRM.
+
+
+Examples
+
+
+#include 
+
+_rsc {
+   

Re: linux-next: manual merge of the nvdimm tree with the xfs tree

2018-06-01 Thread Stephen Rothwell
Hi Darrick,

On Fri, 1 Jun 2018 17:59:48 -0700 "Darrick J. Wong"  
wrote:
>
> > +   if (!dax_enabled) {
> >  -  pr_debug("VFS (%s): error: dax support not enabled\n",
> >  -  sb->s_id);
> >  +  pr_debug("%s: error: dax support not enabled\n",
> >  +  bdevname(bdev, buf));
> > -   return false;
> > +   return -EOPNOTSUPP;  
> 
> Hang on a sec, the changes in the xfs tree make this function return a
> boolean (true for dax-is-supported, false for dax-not-supported), but
> this change partially reverts the boolean return values.

OK, weird, that is what I though I had done.  Thanks for pointing it
out (I guess it was getting late :-().

> > }
> > - 
> > -   return true;
> > +   return 0;  
> 
> The merge should retain the 'return false' above and the 'return true'
> here... or possibly just return dax_enabled:
> 
> if (!dax_enabled) {
>   pr_debug(...);
>   pr_debug(...);
> }
> 
> return dax_enabled;

OK, I have fixed up my merge resolution.  The end of that function will
now look like this:

if (IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn)) {
/*
 * An arch that has enabled the pmem api should also
 * have its drivers support pfn_t_devmap()
 *
 * This is a developer warning and should not trigger in
 * production. dax_flush() will crash since it depends
 * on being able to do (page_address(pfn_to_page())).
 */
WARN_ON(IS_ENABLED(CONFIG_ARCH_HAS_PMEM_API));
dax_enabled = true;
} else if (pfn_t_devmap(pfn)) {
struct dev_pagemap *pgmap;

pgmap = get_dev_pagemap(pfn_t_to_pfn(pfn), NULL);
if (pgmap && pgmap->type == MEMORY_DEVICE_FS_DAX)
dax_enabled = true;
put_dev_pagemap(pgmap);
}

if (!dax_enabled)
pr_debug("%s: error: dax support not enabled\n",
bdevname(bdev, buf));

return dax_enabled;
}
EXPORT_SYMBOL_GPL(__bdev_dax_supported);

-- 
Cheers,
Stephen Rothwell


pgpQM1JB8VUDJ.pgp
Description: OpenPGP digital signature


Re: linux-next: manual merge of the nvdimm tree with the xfs tree

2018-06-01 Thread Stephen Rothwell
Hi Darrick,

On Fri, 1 Jun 2018 17:59:48 -0700 "Darrick J. Wong"  
wrote:
>
> > +   if (!dax_enabled) {
> >  -  pr_debug("VFS (%s): error: dax support not enabled\n",
> >  -  sb->s_id);
> >  +  pr_debug("%s: error: dax support not enabled\n",
> >  +  bdevname(bdev, buf));
> > -   return false;
> > +   return -EOPNOTSUPP;  
> 
> Hang on a sec, the changes in the xfs tree make this function return a
> boolean (true for dax-is-supported, false for dax-not-supported), but
> this change partially reverts the boolean return values.

OK, weird, that is what I though I had done.  Thanks for pointing it
out (I guess it was getting late :-().

> > }
> > - 
> > -   return true;
> > +   return 0;  
> 
> The merge should retain the 'return false' above and the 'return true'
> here... or possibly just return dax_enabled:
> 
> if (!dax_enabled) {
>   pr_debug(...);
>   pr_debug(...);
> }
> 
> return dax_enabled;

OK, I have fixed up my merge resolution.  The end of that function will
now look like this:

if (IS_ENABLED(CONFIG_FS_DAX_LIMITED) && pfn_t_special(pfn)) {
/*
 * An arch that has enabled the pmem api should also
 * have its drivers support pfn_t_devmap()
 *
 * This is a developer warning and should not trigger in
 * production. dax_flush() will crash since it depends
 * on being able to do (page_address(pfn_to_page())).
 */
WARN_ON(IS_ENABLED(CONFIG_ARCH_HAS_PMEM_API));
dax_enabled = true;
} else if (pfn_t_devmap(pfn)) {
struct dev_pagemap *pgmap;

pgmap = get_dev_pagemap(pfn_t_to_pfn(pfn), NULL);
if (pgmap && pgmap->type == MEMORY_DEVICE_FS_DAX)
dax_enabled = true;
put_dev_pagemap(pgmap);
}

if (!dax_enabled)
pr_debug("%s: error: dax support not enabled\n",
bdevname(bdev, buf));

return dax_enabled;
}
EXPORT_SYMBOL_GPL(__bdev_dax_supported);

-- 
Cheers,
Stephen Rothwell


pgpQM1JB8VUDJ.pgp
Description: OpenPGP digital signature


[PATCH v3 2/2] dt-bindings: power: sbs-battery: re-document "ti,bq20z75"

2018-06-01 Thread Brian Norris
This compatible property was documented before the driver was renamed to
"SBS" (see commit e57f1b68c406 ("devicetree-bindings: Propagate
bq20z75->sbs rename to dt bindings")). The driver has continued to
support this property as an alternative to "sbs,sbs-battery", and
because we've noticed there are some lingering TI specifics (in the
manufacturer-specific portion of the SBS spec), we'd like to start using
this property again to differentiate.

Signed-off-by: Brian Norris 
Acked-by: Rhyland Klein 
---
v2: add Rhyland's Acked-by
v3: no change
---
 .../devicetree/bindings/power/supply/sbs_sbs-battery.txt| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/power/supply/sbs_sbs-battery.txt 
b/Documentation/devicetree/bindings/power/supply/sbs_sbs-battery.txt
index c40e8926facf..a7a9c3366f82 100644
--- a/Documentation/devicetree/bindings/power/supply/sbs_sbs-battery.txt
+++ b/Documentation/devicetree/bindings/power/supply/sbs_sbs-battery.txt
@@ -2,7 +2,7 @@ SBS sbs-battery
 ~~
 
 Required properties :
- - compatible : "sbs,sbs-battery"
+ - compatible : "sbs,sbs-battery" or "ti,bq20z75"
 
 Optional properties :
  - sbs,i2c-retry-count : The number of times to retry i2c transactions on i2c
-- 
2.17.1.1185.g55be947832-goog



[PATCH v3 1/2] power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats

2018-06-01 Thread Brian Norris
This driver was originally submitted for the TI BQ20Z75 battery IC
(commit a7640bfa10c5 ("power_supply: Add driver for TI BQ20Z75 gas gauge
IC")) and later renamed to express generic SBS support. While it's
mostly true that this driver implemented a standard SBS command set, it
takes liberties with the REG_MANUFACTURER_DATA register. This register
is specified in the SBS spec, but it doesn't make any mention of what
its actual contents are.

We've sort of noticed this optionality previously, with commit
17c6d3979e5b ("sbs-battery: make writes to ManufacturerAccess
optional"), where we found that some batteries NAK writes to this
register.

What this really means is that so far, we've just been lucky that most
batteries have either been compatible with the TI chip, or else at least
haven't reported highly-unexpected values.

For instance, one battery I have here seems to report either 0x or
0x0100 to the MANUFACTURER_ACCESS_STATUS command -- while this seems to
match either Wake Up (bits[11:8] = b) or Normal Discharge
(bits[11:8] = 0001b) status for the TI part [1], they don't seem to
actually correspond to real states (for instance, I never see 0101b =
Charge, even when charging).

On other batteries, I'm getting apparently random data in return, which
means that occasionally, we interpret this as "battery not present" or
"battery is not healthy".

All in all, it seems to be a really bad idea to make assumptions about
REG_MANUFACTURER_DATA, unless we already know what battery we're using.
Therefore, this patch reimplements the "present" and "health" checks to
the following on most SBS batteries:

1. HEALTH: report "unknown" -- I couldn't find a standard SBS command
   that gives us much useful here
2. PRESENT: just send a REG_STATUS command; if it succeeds, then the
   battery is present

Also, we stop sending MANUFACTURER_ACCESS_SLEEP to non-TI parts. I have
no proof that this is useful and supported.

If someone explicitly provided a 'ti,bq20z75' compatible property, then
we retain the existing TI command behaviors.

[1] http://www.ti.com/lit/er/sluu265a/sluu265a.pdf

Signed-off-by: Brian Norris 
Reviewed-by: Guenter Roeck 
Acked-by: Rhyland Klein 
---
v2:
 * don't stub out POWER_SUPPLY_PROP_PRESENT from sbs_data[]
 * use if/else instead of switch/case

v3:
 * pull 'return 0' out of if/else, to satisfy braindead tooling
---
 drivers/power/supply/sbs-battery.c | 54 +-
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/drivers/power/supply/sbs-battery.c 
b/drivers/power/supply/sbs-battery.c
index 83d7b4115857..a9691ea42f44 100644
--- a/drivers/power/supply/sbs-battery.c
+++ b/drivers/power/supply/sbs-battery.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -156,6 +157,9 @@ static enum power_supply_property sbs_properties[] = {
POWER_SUPPLY_PROP_MODEL_NAME
 };
 
+/* Supports special manufacturer commands from TI BQ20Z75 IC. */
+#define SBS_FLAGS_TI_BQ20Z75   BIT(0)
+
 struct sbs_info {
struct i2c_client   *client;
struct power_supply *power_supply;
@@ -168,6 +172,7 @@ struct sbs_info {
u32 poll_retry_count;
struct delayed_work work;
struct mutexmode_lock;
+   u32 flags;
 };
 
 static char model_name[I2C_SMBUS_BLOCK_MAX + 1];
@@ -315,6 +320,27 @@ static int sbs_status_correct(struct i2c_client *client, 
int *intval)
 static int sbs_get_battery_presence_and_health(
struct i2c_client *client, enum power_supply_property psp,
union power_supply_propval *val)
+{
+   int ret;
+
+   if (psp == POWER_SUPPLY_PROP_PRESENT) {
+   /* Dummy command; if it succeeds, battery is present. */
+   ret = sbs_read_word_data(client, sbs_data[REG_STATUS].addr);
+   if (ret < 0)
+   val->intval = 0; /* battery disconnected */
+   else
+   val->intval = 1; /* battery present */
+   } else { /* POWER_SUPPLY_PROP_HEALTH */
+   /* SBS spec doesn't have a general health command. */
+   val->intval = POWER_SUPPLY_HEALTH_UNKNOWN;
+   }
+
+   return 0;
+}
+
+static int sbs_get_ti_battery_presence_and_health(
+   struct i2c_client *client, enum power_supply_property psp,
+   union power_supply_propval *val)
 {
s32 ret;
 
@@ -600,7 +626,12 @@ static int sbs_get_property(struct power_supply *psy,
switch (psp) {
case POWER_SUPPLY_PROP_PRESENT:
case POWER_SUPPLY_PROP_HEALTH:
-   ret = sbs_get_battery_presence_and_health(client, psp, val);
+   if (client->flags & SBS_FLAGS_TI_BQ20Z75)
+   ret = sbs_get_ti_battery_presence_and_health(client,
+psp, val);
+   else
+  

[PATCH v3 2/2] dt-bindings: power: sbs-battery: re-document "ti,bq20z75"

2018-06-01 Thread Brian Norris
This compatible property was documented before the driver was renamed to
"SBS" (see commit e57f1b68c406 ("devicetree-bindings: Propagate
bq20z75->sbs rename to dt bindings")). The driver has continued to
support this property as an alternative to "sbs,sbs-battery", and
because we've noticed there are some lingering TI specifics (in the
manufacturer-specific portion of the SBS spec), we'd like to start using
this property again to differentiate.

Signed-off-by: Brian Norris 
Acked-by: Rhyland Klein 
---
v2: add Rhyland's Acked-by
v3: no change
---
 .../devicetree/bindings/power/supply/sbs_sbs-battery.txt| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/power/supply/sbs_sbs-battery.txt 
b/Documentation/devicetree/bindings/power/supply/sbs_sbs-battery.txt
index c40e8926facf..a7a9c3366f82 100644
--- a/Documentation/devicetree/bindings/power/supply/sbs_sbs-battery.txt
+++ b/Documentation/devicetree/bindings/power/supply/sbs_sbs-battery.txt
@@ -2,7 +2,7 @@ SBS sbs-battery
 ~~
 
 Required properties :
- - compatible : "sbs,sbs-battery"
+ - compatible : "sbs,sbs-battery" or "ti,bq20z75"
 
 Optional properties :
  - sbs,i2c-retry-count : The number of times to retry i2c transactions on i2c
-- 
2.17.1.1185.g55be947832-goog



[PATCH v3 1/2] power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats

2018-06-01 Thread Brian Norris
This driver was originally submitted for the TI BQ20Z75 battery IC
(commit a7640bfa10c5 ("power_supply: Add driver for TI BQ20Z75 gas gauge
IC")) and later renamed to express generic SBS support. While it's
mostly true that this driver implemented a standard SBS command set, it
takes liberties with the REG_MANUFACTURER_DATA register. This register
is specified in the SBS spec, but it doesn't make any mention of what
its actual contents are.

We've sort of noticed this optionality previously, with commit
17c6d3979e5b ("sbs-battery: make writes to ManufacturerAccess
optional"), where we found that some batteries NAK writes to this
register.

What this really means is that so far, we've just been lucky that most
batteries have either been compatible with the TI chip, or else at least
haven't reported highly-unexpected values.

For instance, one battery I have here seems to report either 0x or
0x0100 to the MANUFACTURER_ACCESS_STATUS command -- while this seems to
match either Wake Up (bits[11:8] = b) or Normal Discharge
(bits[11:8] = 0001b) status for the TI part [1], they don't seem to
actually correspond to real states (for instance, I never see 0101b =
Charge, even when charging).

On other batteries, I'm getting apparently random data in return, which
means that occasionally, we interpret this as "battery not present" or
"battery is not healthy".

All in all, it seems to be a really bad idea to make assumptions about
REG_MANUFACTURER_DATA, unless we already know what battery we're using.
Therefore, this patch reimplements the "present" and "health" checks to
the following on most SBS batteries:

1. HEALTH: report "unknown" -- I couldn't find a standard SBS command
   that gives us much useful here
2. PRESENT: just send a REG_STATUS command; if it succeeds, then the
   battery is present

Also, we stop sending MANUFACTURER_ACCESS_SLEEP to non-TI parts. I have
no proof that this is useful and supported.

If someone explicitly provided a 'ti,bq20z75' compatible property, then
we retain the existing TI command behaviors.

[1] http://www.ti.com/lit/er/sluu265a/sluu265a.pdf

Signed-off-by: Brian Norris 
Reviewed-by: Guenter Roeck 
Acked-by: Rhyland Klein 
---
v2:
 * don't stub out POWER_SUPPLY_PROP_PRESENT from sbs_data[]
 * use if/else instead of switch/case

v3:
 * pull 'return 0' out of if/else, to satisfy braindead tooling
---
 drivers/power/supply/sbs-battery.c | 54 +-
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/drivers/power/supply/sbs-battery.c 
b/drivers/power/supply/sbs-battery.c
index 83d7b4115857..a9691ea42f44 100644
--- a/drivers/power/supply/sbs-battery.c
+++ b/drivers/power/supply/sbs-battery.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -156,6 +157,9 @@ static enum power_supply_property sbs_properties[] = {
POWER_SUPPLY_PROP_MODEL_NAME
 };
 
+/* Supports special manufacturer commands from TI BQ20Z75 IC. */
+#define SBS_FLAGS_TI_BQ20Z75   BIT(0)
+
 struct sbs_info {
struct i2c_client   *client;
struct power_supply *power_supply;
@@ -168,6 +172,7 @@ struct sbs_info {
u32 poll_retry_count;
struct delayed_work work;
struct mutexmode_lock;
+   u32 flags;
 };
 
 static char model_name[I2C_SMBUS_BLOCK_MAX + 1];
@@ -315,6 +320,27 @@ static int sbs_status_correct(struct i2c_client *client, 
int *intval)
 static int sbs_get_battery_presence_and_health(
struct i2c_client *client, enum power_supply_property psp,
union power_supply_propval *val)
+{
+   int ret;
+
+   if (psp == POWER_SUPPLY_PROP_PRESENT) {
+   /* Dummy command; if it succeeds, battery is present. */
+   ret = sbs_read_word_data(client, sbs_data[REG_STATUS].addr);
+   if (ret < 0)
+   val->intval = 0; /* battery disconnected */
+   else
+   val->intval = 1; /* battery present */
+   } else { /* POWER_SUPPLY_PROP_HEALTH */
+   /* SBS spec doesn't have a general health command. */
+   val->intval = POWER_SUPPLY_HEALTH_UNKNOWN;
+   }
+
+   return 0;
+}
+
+static int sbs_get_ti_battery_presence_and_health(
+   struct i2c_client *client, enum power_supply_property psp,
+   union power_supply_propval *val)
 {
s32 ret;
 
@@ -600,7 +626,12 @@ static int sbs_get_property(struct power_supply *psy,
switch (psp) {
case POWER_SUPPLY_PROP_PRESENT:
case POWER_SUPPLY_PROP_HEALTH:
-   ret = sbs_get_battery_presence_and_health(client, psp, val);
+   if (client->flags & SBS_FLAGS_TI_BQ20Z75)
+   ret = sbs_get_ti_battery_presence_and_health(client,
+psp, val);
+   else
+  

Re: [PATCH v8 5/8] tracepoint: Make rcuidle tracepoint callers use SRCU

2018-06-01 Thread Mathieu Desnoyers
- On May 31, 2018, at 1:51 PM, Joel Fernandes, Google 
j...@joelfernandes.org wrote:

>> I find it odd to have a "return" in a macro that consists of a
>> do { } while (0). I'm tempted to replace "return" by "break" here,
>> to break the macro do/while (0) loop.
> 
> "return;" is also used from "if (!(cond))" above so I prefer be consistent
> and just use return than break as done above, but please let me know if you
> still object.

It's fine by me,

Thanks!

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [PATCH v8 5/8] tracepoint: Make rcuidle tracepoint callers use SRCU

2018-06-01 Thread Mathieu Desnoyers
- On May 31, 2018, at 1:51 PM, Joel Fernandes, Google 
j...@joelfernandes.org wrote:

>> I find it odd to have a "return" in a macro that consists of a
>> do { } while (0). I'm tempted to replace "return" by "break" here,
>> to break the macro do/while (0) loop.
> 
> "return;" is also used from "if (!(cond))" above so I prefer be consistent
> and just use return than break as done above, but please let me know if you
> still object.

It's fine by me,

Thanks!

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


[PATCH V2] i8042: Increment wakeup_count for the respective port.

2018-06-01 Thread Ravi Chandra Sadineni
Call pm_wakeup_event on every irq. This should help us in identifying if
keyboard was a potential wake reason for the last resume.

Signed-off-by: Ravi Chandra Sadineni 
---
V2: Increment the wakeup count only when there is a irq and not when the
method is called internally.

drivers/input/serio/i8042.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/input/serio/i8042.c b/drivers/input/serio/i8042.c
index 824f4c1c1f310..2bd6f2633e29a 100644
--- a/drivers/input/serio/i8042.c
+++ b/drivers/input/serio/i8042.c
@@ -573,6 +573,9 @@ static irqreturn_t i8042_interrupt(int irq, void *dev_id)
port = _ports[port_no];
serio = port->exists ? port->serio : NULL;
 
+   if (irq && serio && device_may_wakeup(>dev))
+   pm_wakeup_event(>dev, 0);
+
filter_dbg(port->driver_bound, data, "<- i8042 (interrupt, %d, 
%d%s%s)\n",
   port_no, irq,
   dfl & SERIO_PARITY ? ", bad parity" : "",
-- 
2.17.1.1185.g55be947832-goog



[PATCH V2] i8042: Increment wakeup_count for the respective port.

2018-06-01 Thread Ravi Chandra Sadineni
Call pm_wakeup_event on every irq. This should help us in identifying if
keyboard was a potential wake reason for the last resume.

Signed-off-by: Ravi Chandra Sadineni 
---
V2: Increment the wakeup count only when there is a irq and not when the
method is called internally.

drivers/input/serio/i8042.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/input/serio/i8042.c b/drivers/input/serio/i8042.c
index 824f4c1c1f310..2bd6f2633e29a 100644
--- a/drivers/input/serio/i8042.c
+++ b/drivers/input/serio/i8042.c
@@ -573,6 +573,9 @@ static irqreturn_t i8042_interrupt(int irq, void *dev_id)
port = _ports[port_no];
serio = port->exists ? port->serio : NULL;
 
+   if (irq && serio && device_may_wakeup(>dev))
+   pm_wakeup_event(>dev, 0);
+
filter_dbg(port->driver_bound, data, "<- i8042 (interrupt, %d, 
%d%s%s)\n",
   port_no, irq,
   dfl & SERIO_PARITY ? ", bad parity" : "",
-- 
2.17.1.1185.g55be947832-goog



Re: [PATCH v1 2/3] x86/bugs: Add AMD's SPEC_CTRL MSR usage

2018-06-01 Thread Tom Lendacky
On 6/1/2018 9:59 AM, Konrad Rzeszutek Wilk wrote:

Hi Konrad,

Thanks for doing this.  It was on my to-do list to get this
support out after everything settled down.

Just some questions/comments below.

> The AMD document outlining the SSBD handling
> 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
> mentions that if CPUID 8000_0008.EBX[24] is set we should be using
> the SPEC_CTRL MSR (0x48) over the VIRT SPEC_CTRL MSR (0xC001_011f)
> for speculative store bypass disable.
> 
> This in effect means we should clear the X86_FEATURE_VIRT_SSBD
> flag so that we would prefer the SPEC_CTRL MSR.
> 
> See the document titled:
> 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
> 
> A copy of this document is available at
>https://bugzilla.kernel.org/show_bug.cgi?id=199889
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> 
> ---
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Paolo Bonzini 
> Cc: "Radim Krčmář" 
> Cc: Joerg Roedel 
> Cc: Borislav Petkov 
> Cc: Konrad Rzeszutek Wilk 
> Cc: David Woodhouse 
> Cc: Tom Lendacky 
> Cc: Janakarajan Natarajan 
> Cc: Kees Cook 
> Cc: KarimAllah Ahmed 
> Cc: Andy Lutomirski 
> ---
>  arch/x86/include/asm/cpufeatures.h |  1 +
>  arch/x86/kernel/cpu/bugs.c | 12 +++-
>  arch/x86/kernel/cpu/common.c   |  6 ++
>  arch/x86/kvm/cpuid.c   | 10 --
>  arch/x86/kvm/svm.c |  8 +---
>  5 files changed, 27 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h 
> b/arch/x86/include/asm/cpufeatures.h
> index b6d7ce32927a..5701f5cecd31 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -282,6 +282,7 @@
>  #define X86_FEATURE_AMD_IBPB (13*32+12) /* "" Indirect Branch 
> Prediction Barrier */
>  #define X86_FEATURE_AMD_IBRS (13*32+14) /* "" Indirect Branch 
> Restricted Speculation */
>  #define X86_FEATURE_AMD_STIBP(13*32+15) /* "" Single Thread 
> Indirect Branch Predictors */
> +#define X86_FEATURE_AMD_SSBD (13*32+24) /* "" Speculative Store 
> Bypass Disable */
>  #define X86_FEATURE_VIRT_SSBD(13*32+25) /* Virtualized 
> Speculative Store Bypass Disable */
>  #define X86_FEATURE_AMD_SSB_NO   (13*32+26) /* "" Speculative 
> Store Bypass is fixed in hardware. */
>  
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 7416fc206b4a..6bea81855cdd 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -529,18 +529,20 @@ static enum ssb_mitigation __init 
> __ssb_select_mitigation(void)
>   if (mode == SPEC_STORE_BYPASS_DISABLE) {
>   setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE);
>   /*
> -  * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses
> -  * a completely different MSR and bit dependent on family.
> +  * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD may
> +  * use a completely different MSR and bit dependent on family.
>*/
>   switch (boot_cpu_data.x86_vendor) {
>   case X86_VENDOR_INTEL:
> + case X86_VENDOR_AMD:
> + if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
> + x86_amd_ssb_disable();
> + break;
> + }
>   x86_spec_ctrl_base |= SPEC_CTRL_SSBD;
>   x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
>   wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
>   break;
> - case X86_VENDOR_AMD:
> - x86_amd_ssb_disable();
> - break;
>   }
>   }
>  
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 494735cf63f5..d08a29bd0385 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -783,6 +783,12 @@ static void init_speculation_control(struct cpuinfo_x86 
> *c)
>   set_cpu_cap(c, X86_FEATURE_STIBP);
>   set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
>   }
> +
> + if (cpu_has(c, X86_FEATURE_AMD_SSBD)) {
> + set_cpu_cap(c, X86_FEATURE_SSBD);
> + set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
> + clear_cpu_cap(c, X86_FEATURE_VIRT_SSBD);
> + }
>  }
>  
>  void get_cpu_cap(struct cpuinfo_x86 *c)
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 132f8a58692e..f4f30d0c25c4 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -379,7 +379,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 
> *entry, u32 function,
>  
>   /* cpuid 0x8008.ebx */
>   const u32 kvm_cpuid_8000_0008_ebx_x86_features =
> - F(AMD_IBPB) | F(AMD_IBRS) | F(VIRT_SSBD) | F(AMD_SSB_NO);
> + F(AMD_IBPB) | F(AMD_IBRS) | F(AMD_SSBD) | 

Re: [PATCH v1 2/3] x86/bugs: Add AMD's SPEC_CTRL MSR usage

2018-06-01 Thread Tom Lendacky
On 6/1/2018 9:59 AM, Konrad Rzeszutek Wilk wrote:

Hi Konrad,

Thanks for doing this.  It was on my to-do list to get this
support out after everything settled down.

Just some questions/comments below.

> The AMD document outlining the SSBD handling
> 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
> mentions that if CPUID 8000_0008.EBX[24] is set we should be using
> the SPEC_CTRL MSR (0x48) over the VIRT SPEC_CTRL MSR (0xC001_011f)
> for speculative store bypass disable.
> 
> This in effect means we should clear the X86_FEATURE_VIRT_SSBD
> flag so that we would prefer the SPEC_CTRL MSR.
> 
> See the document titled:
> 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
> 
> A copy of this document is available at
>https://bugzilla.kernel.org/show_bug.cgi?id=199889
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> 
> ---
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: Paolo Bonzini 
> Cc: "Radim Krčmář" 
> Cc: Joerg Roedel 
> Cc: Borislav Petkov 
> Cc: Konrad Rzeszutek Wilk 
> Cc: David Woodhouse 
> Cc: Tom Lendacky 
> Cc: Janakarajan Natarajan 
> Cc: Kees Cook 
> Cc: KarimAllah Ahmed 
> Cc: Andy Lutomirski 
> ---
>  arch/x86/include/asm/cpufeatures.h |  1 +
>  arch/x86/kernel/cpu/bugs.c | 12 +++-
>  arch/x86/kernel/cpu/common.c   |  6 ++
>  arch/x86/kvm/cpuid.c   | 10 --
>  arch/x86/kvm/svm.c |  8 +---
>  5 files changed, 27 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h 
> b/arch/x86/include/asm/cpufeatures.h
> index b6d7ce32927a..5701f5cecd31 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -282,6 +282,7 @@
>  #define X86_FEATURE_AMD_IBPB (13*32+12) /* "" Indirect Branch 
> Prediction Barrier */
>  #define X86_FEATURE_AMD_IBRS (13*32+14) /* "" Indirect Branch 
> Restricted Speculation */
>  #define X86_FEATURE_AMD_STIBP(13*32+15) /* "" Single Thread 
> Indirect Branch Predictors */
> +#define X86_FEATURE_AMD_SSBD (13*32+24) /* "" Speculative Store 
> Bypass Disable */
>  #define X86_FEATURE_VIRT_SSBD(13*32+25) /* Virtualized 
> Speculative Store Bypass Disable */
>  #define X86_FEATURE_AMD_SSB_NO   (13*32+26) /* "" Speculative 
> Store Bypass is fixed in hardware. */
>  
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 7416fc206b4a..6bea81855cdd 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -529,18 +529,20 @@ static enum ssb_mitigation __init 
> __ssb_select_mitigation(void)
>   if (mode == SPEC_STORE_BYPASS_DISABLE) {
>   setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE);
>   /*
> -  * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses
> -  * a completely different MSR and bit dependent on family.
> +  * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD may
> +  * use a completely different MSR and bit dependent on family.
>*/
>   switch (boot_cpu_data.x86_vendor) {
>   case X86_VENDOR_INTEL:
> + case X86_VENDOR_AMD:
> + if (!static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
> + x86_amd_ssb_disable();
> + break;
> + }
>   x86_spec_ctrl_base |= SPEC_CTRL_SSBD;
>   x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
>   wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
>   break;
> - case X86_VENDOR_AMD:
> - x86_amd_ssb_disable();
> - break;
>   }
>   }
>  
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 494735cf63f5..d08a29bd0385 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -783,6 +783,12 @@ static void init_speculation_control(struct cpuinfo_x86 
> *c)
>   set_cpu_cap(c, X86_FEATURE_STIBP);
>   set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
>   }
> +
> + if (cpu_has(c, X86_FEATURE_AMD_SSBD)) {
> + set_cpu_cap(c, X86_FEATURE_SSBD);
> + set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
> + clear_cpu_cap(c, X86_FEATURE_VIRT_SSBD);
> + }
>  }
>  
>  void get_cpu_cap(struct cpuinfo_x86 *c)
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 132f8a58692e..f4f30d0c25c4 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -379,7 +379,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 
> *entry, u32 function,
>  
>   /* cpuid 0x8008.ebx */
>   const u32 kvm_cpuid_8000_0008_ebx_x86_features =
> - F(AMD_IBPB) | F(AMD_IBRS) | F(VIRT_SSBD) | F(AMD_SSB_NO);
> + F(AMD_IBPB) | F(AMD_IBRS) | F(AMD_SSBD) | 

Re: linux-next: manual merge of the nvdimm tree with the xfs tree

2018-06-01 Thread Darrick J. Wong
On Fri, Jun 01, 2018 at 06:58:46PM +1000, Stephen Rothwell wrote:
> Hi Dan,
> 
> Today's linux-next merge of the nvdimm tree got a conflict in:
> 
>   drivers/dax/super.c
> 
> between commits:
> 
>   ba23cba9b3bd ("fs: allow per-device dax status checking for filesystems")
>   80660f20252d ("dax: change bdev_dax_supported() to support boolean returns")
> 
> from the xfs tree and commit:
> 
>   e76384884344 ("mm: introduce MEMORY_DEVICE_FS_DAX and 
> CONFIG_DEV_PAGEMAP_OPS")
> 
> from the nvdimm tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> -- 
> Cheers,
> Stephen Rothwell
> diff --cc drivers/dax/super.c
> index 1d7bd96511f0,88672b6f6252..
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@@ -80,11 -80,13 +80,12 @@@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev)
>* This is a library function for filesystems to check if the block device
>* can be mounted with dax option.
>*
>  - * Return: negative errno if unsupported, 0 if supported.
>  + * Return: true if supported, false if unsupported
>*/
>  -int __bdev_dax_supported(struct super_block *sb, int blocksize)
>  +bool __bdev_dax_supported(struct block_device *bdev, int blocksize)
>   {
>  -struct block_device *bdev = sb->s_bdev;
>   struct dax_device *dax_dev;
> + bool dax_enabled = false;
>   pgoff_t pgoff;
>   int err, id;
>   void *kaddr;
> @@@ -134,15 -135,22 +135,22 @@@
>* on being able to do (page_address(pfn_to_page())).
>*/
>   WARN_ON(IS_ENABLED(CONFIG_ARCH_HAS_PMEM_API));
> + dax_enabled = true;
>   } else if (pfn_t_devmap(pfn)) {
> - /* pass */;
> - } else {
> + struct dev_pagemap *pgmap;
> + 
> + pgmap = get_dev_pagemap(pfn_t_to_pfn(pfn), NULL);
> + if (pgmap && pgmap->type == MEMORY_DEVICE_FS_DAX)
> + dax_enabled = true;
> + put_dev_pagemap(pgmap);
> + }
> + 
> + if (!dax_enabled) {
>  -pr_debug("VFS (%s): error: dax support not enabled\n",
>  -sb->s_id);
>  +pr_debug("%s: error: dax support not enabled\n",
>  +bdevname(bdev, buf));
> - return false;
> + return -EOPNOTSUPP;

Hang on a sec, the changes in the xfs tree make this function return a
boolean (true for dax-is-supported, false for dax-not-supported), but
this change partially reverts the boolean return values.

>   }
> - 
> - return true;
> + return 0;

The merge should retain the 'return false' above and the 'return true'
here... or possibly just return dax_enabled:

if (!dax_enabled) {
pr_debug(...);
pr_debug(...);
}

return dax_enabled;

--D

>   }
>   EXPORT_SYMBOL_GPL(__bdev_dax_supported);
>   #endif




Re: linux-next: manual merge of the nvdimm tree with the xfs tree

2018-06-01 Thread Darrick J. Wong
On Fri, Jun 01, 2018 at 06:58:46PM +1000, Stephen Rothwell wrote:
> Hi Dan,
> 
> Today's linux-next merge of the nvdimm tree got a conflict in:
> 
>   drivers/dax/super.c
> 
> between commits:
> 
>   ba23cba9b3bd ("fs: allow per-device dax status checking for filesystems")
>   80660f20252d ("dax: change bdev_dax_supported() to support boolean returns")
> 
> from the xfs tree and commit:
> 
>   e76384884344 ("mm: introduce MEMORY_DEVICE_FS_DAX and 
> CONFIG_DEV_PAGEMAP_OPS")
> 
> from the nvdimm tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> -- 
> Cheers,
> Stephen Rothwell
> diff --cc drivers/dax/super.c
> index 1d7bd96511f0,88672b6f6252..
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@@ -80,11 -80,13 +80,12 @@@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev)
>* This is a library function for filesystems to check if the block device
>* can be mounted with dax option.
>*
>  - * Return: negative errno if unsupported, 0 if supported.
>  + * Return: true if supported, false if unsupported
>*/
>  -int __bdev_dax_supported(struct super_block *sb, int blocksize)
>  +bool __bdev_dax_supported(struct block_device *bdev, int blocksize)
>   {
>  -struct block_device *bdev = sb->s_bdev;
>   struct dax_device *dax_dev;
> + bool dax_enabled = false;
>   pgoff_t pgoff;
>   int err, id;
>   void *kaddr;
> @@@ -134,15 -135,22 +135,22 @@@
>* on being able to do (page_address(pfn_to_page())).
>*/
>   WARN_ON(IS_ENABLED(CONFIG_ARCH_HAS_PMEM_API));
> + dax_enabled = true;
>   } else if (pfn_t_devmap(pfn)) {
> - /* pass */;
> - } else {
> + struct dev_pagemap *pgmap;
> + 
> + pgmap = get_dev_pagemap(pfn_t_to_pfn(pfn), NULL);
> + if (pgmap && pgmap->type == MEMORY_DEVICE_FS_DAX)
> + dax_enabled = true;
> + put_dev_pagemap(pgmap);
> + }
> + 
> + if (!dax_enabled) {
>  -pr_debug("VFS (%s): error: dax support not enabled\n",
>  -sb->s_id);
>  +pr_debug("%s: error: dax support not enabled\n",
>  +bdevname(bdev, buf));
> - return false;
> + return -EOPNOTSUPP;

Hang on a sec, the changes in the xfs tree make this function return a
boolean (true for dax-is-supported, false for dax-not-supported), but
this change partially reverts the boolean return values.

>   }
> - 
> - return true;
> + return 0;

The merge should retain the 'return false' above and the 'return true'
here... or possibly just return dax_enabled:

if (!dax_enabled) {
pr_debug(...);
pr_debug(...);
}

return dax_enabled;

--D

>   }
>   EXPORT_SYMBOL_GPL(__bdev_dax_supported);
>   #endif




[PATCH v3 1/3] dt-bindings: clk: Update Stingray binding doc

2018-06-01 Thread Ray Jui
From: Pramod Kumar 

Update Stingray clock binding document to add additional clock entries
with names matching the latest ASIC datasheet. Also modify a few existing
entries to make their naming more consistent with the rest of the entries

Signed-off-by: Pramod Kumar 
Signed-off-by: Ray Jui 
---
 .../bindings/clock/brcm,iproc-clocks.txt   | 26 --
 include/dt-bindings/clock/bcm-sr.h | 24 ++--
 2 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.txt 
b/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.txt
index f8e4a93..ab730ea 100644
--- a/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.txt
+++ b/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.txt
@@ -276,36 +276,38 @@ These clock IDs are defined in:
 clk_ts_500_ref genpll2 2   BCM_SR_GENPLL2_TS_500_REF_CLK
 clk_125_nitro  genpll2 3   BCM_SR_GENPLL2_125_NITRO_CLK
 clk_chimp  genpll2 4   BCM_SR_GENPLL2_CHIMP_CLK
-clk_nic_flash  genpll2 5   BCM_SR_GENPLL2_NIC_FLASH
+clk_nic_flash  genpll2 5   BCM_SR_GENPLL2_NIC_FLASH_CLK
+clk_fs genpll2 6   BCM_SR_GENPLL2_FS_CLK
 
 genpll3crystal 0   BCM_SR_GENPLL3
 clk_hsls   genpll3 1   BCM_SR_GENPLL3_HSLS_CLK
 clk_sdio   genpll3 2   BCM_SR_GENPLL3_SDIO_CLK
 
 genpll4crystal 0   BCM_SR_GENPLL4
-ccngenpll4 1   BCM_SR_GENPLL4_CCN_CLK
+clk_ccngenpll4 1   BCM_SR_GENPLL4_CCN_CLK
 clk_tpiu_pll   genpll4 2   BCM_SR_GENPLL4_TPIU_PLL_CLK
-noc_clkgenpll4 3   BCM_SR_GENPLL4_NOC_CLK
+clk_nocgenpll4 3   BCM_SR_GENPLL4_NOC_CLK
 clk_chclk_fs4  genpll4 4   BCM_SR_GENPLL4_CHCLK_FS4_CLK
 clk_bridge_fscpu   genpll4 5   BCM_SR_GENPLL4_BRIDGE_FSCPU_CLK
 
-
 genpll5crystal 0   BCM_SR_GENPLL5
-fs4_hf_clk genpll5 1   BCM_SR_GENPLL5_FS4_HF_CLK
-crypto_ae_clk  genpll5 2   BCM_SR_GENPLL5_CRYPTO_AE_CLK
-raid_ae_clkgenpll5 3   
BCM_SR_GENPLL5_RAID_AE_CLK
+clk_fs4_hf genpll5 1   BCM_SR_GENPLL5_FS4_HF_CLK
+clk_crypto_ae  genpll5 2   BCM_SR_GENPLL5_CRYPTO_AE_CLK
+clk_raid_aegenpll5 3   
BCM_SR_GENPLL5_RAID_AE_CLK
 
 genpll6crystal 0   BCM_SR_GENPLL6
-48_usb genpll6 1   BCM_SR_GENPLL6_48_USB_CLK
+clk_48_usb genpll6 1   BCM_SR_GENPLL6_48_USB_CLK
 
 lcpll0 crystal 0   BCM_SR_LCPLL0
 clk_sata_refp  lcpll0  1   BCM_SR_LCPLL0_SATA_REFP_CLK
 clk_sata_refn  lcpll0  2   BCM_SR_LCPLL0_SATA_REFN_CLK
-clk_usb_reflcpll0  3   
BCM_SR_LCPLL0_USB_REF_CLK
-sata_refpn lcpll0  3   BCM_SR_LCPLL0_SATA_REFPN_CLK
+clk_sata_350   lcpll0  3   BCM_SR_LCPLL0_SATA_350_CLK
+clk_sata_500   lcpll0  4   BCM_SR_LCPLL0_SATA_500_CLK
 
 lcpll1 crystal 0   BCM_SR_LCPLL1
-wanlcpll1  1   BCM_SR_LCPLL0_WAN_CLK
+clk_wanlcpll1  1   BCM_SR_LCPLL1_WAN_CLK
+clk_usb_reflcpll1  2   
BCM_SR_LCPLL1_USB_REF_CLK
+clk_crmu_tslcpll1  3   
BCM_SR_LCPLL1_CRMU_TS_CLK
 
 lcpll_pcie crystal 0   BCM_SR_LCPLL_PCIE
-pcie_phy_ref   lcpll1  1   BCM_SR_LCPLL_PCIE_PHY_REF_CLK
+clk_pcie_phy_ref   lcpll1  1   BCM_SR_LCPLL_PCIE_PHY_REF_CLK
diff --git a/include/dt-bindings/clock/bcm-sr.h 
b/include/dt-bindings/clock/bcm-sr.h
index cff6c6f..419011b 100644
--- a/include/dt-bindings/clock/bcm-sr.h
+++ b/include/dt-bindings/clock/bcm-sr.h
@@ -35,7 +35,7 @@
 
 /* GENPLL 0 clock channel ID SCR HSLS FS PCIE */
 #define BCM_SR_GENPLL0 0
-#define BCM_SR_GENPLL0_SATA_CLK1
+#define BCM_SR_GENPLL0_125M_CLK1
 #define BCM_SR_GENPLL0_SCR_CLK 2
 #define BCM_SR_GENPLL0_250M_CLK3
 #define BCM_SR_GENPLL0_PCIE_AXI_CLK4
@@ -50,9 +50,11 @@
 /* GENPLL 2 clock channel ID NITRO MHB*/
 #define BCM_SR_GENPLL2 0
 #define BCM_SR_GENPLL2_NIC_CLK 1
-#define BCM_SR_GENPLL2_250_NITRO_CLK   2
+#define BCM_SR_GENPLL2_TS_500_CLK  2
 #define BCM_SR_GENPLL2_125_NITRO_CLK   3
 #define BCM_SR_GENPLL2_CHIMP_CLK   4
+#define BCM_SR_GENPLL2_NIC_FLASH_CLK   5
+#define BCM_SR_GENPLL2_FS4_CLK 6
 
 /* GENPLL 3 HSLS clock channel ID */
 #define BCM_SR_GENPLL3 

[PATCH v3 1/3] dt-bindings: clk: Update Stingray binding doc

2018-06-01 Thread Ray Jui
From: Pramod Kumar 

Update Stingray clock binding document to add additional clock entries
with names matching the latest ASIC datasheet. Also modify a few existing
entries to make their naming more consistent with the rest of the entries

Signed-off-by: Pramod Kumar 
Signed-off-by: Ray Jui 
---
 .../bindings/clock/brcm,iproc-clocks.txt   | 26 --
 include/dt-bindings/clock/bcm-sr.h | 24 ++--
 2 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.txt 
b/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.txt
index f8e4a93..ab730ea 100644
--- a/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.txt
+++ b/Documentation/devicetree/bindings/clock/brcm,iproc-clocks.txt
@@ -276,36 +276,38 @@ These clock IDs are defined in:
 clk_ts_500_ref genpll2 2   BCM_SR_GENPLL2_TS_500_REF_CLK
 clk_125_nitro  genpll2 3   BCM_SR_GENPLL2_125_NITRO_CLK
 clk_chimp  genpll2 4   BCM_SR_GENPLL2_CHIMP_CLK
-clk_nic_flash  genpll2 5   BCM_SR_GENPLL2_NIC_FLASH
+clk_nic_flash  genpll2 5   BCM_SR_GENPLL2_NIC_FLASH_CLK
+clk_fs genpll2 6   BCM_SR_GENPLL2_FS_CLK
 
 genpll3crystal 0   BCM_SR_GENPLL3
 clk_hsls   genpll3 1   BCM_SR_GENPLL3_HSLS_CLK
 clk_sdio   genpll3 2   BCM_SR_GENPLL3_SDIO_CLK
 
 genpll4crystal 0   BCM_SR_GENPLL4
-ccngenpll4 1   BCM_SR_GENPLL4_CCN_CLK
+clk_ccngenpll4 1   BCM_SR_GENPLL4_CCN_CLK
 clk_tpiu_pll   genpll4 2   BCM_SR_GENPLL4_TPIU_PLL_CLK
-noc_clkgenpll4 3   BCM_SR_GENPLL4_NOC_CLK
+clk_nocgenpll4 3   BCM_SR_GENPLL4_NOC_CLK
 clk_chclk_fs4  genpll4 4   BCM_SR_GENPLL4_CHCLK_FS4_CLK
 clk_bridge_fscpu   genpll4 5   BCM_SR_GENPLL4_BRIDGE_FSCPU_CLK
 
-
 genpll5crystal 0   BCM_SR_GENPLL5
-fs4_hf_clk genpll5 1   BCM_SR_GENPLL5_FS4_HF_CLK
-crypto_ae_clk  genpll5 2   BCM_SR_GENPLL5_CRYPTO_AE_CLK
-raid_ae_clkgenpll5 3   
BCM_SR_GENPLL5_RAID_AE_CLK
+clk_fs4_hf genpll5 1   BCM_SR_GENPLL5_FS4_HF_CLK
+clk_crypto_ae  genpll5 2   BCM_SR_GENPLL5_CRYPTO_AE_CLK
+clk_raid_aegenpll5 3   
BCM_SR_GENPLL5_RAID_AE_CLK
 
 genpll6crystal 0   BCM_SR_GENPLL6
-48_usb genpll6 1   BCM_SR_GENPLL6_48_USB_CLK
+clk_48_usb genpll6 1   BCM_SR_GENPLL6_48_USB_CLK
 
 lcpll0 crystal 0   BCM_SR_LCPLL0
 clk_sata_refp  lcpll0  1   BCM_SR_LCPLL0_SATA_REFP_CLK
 clk_sata_refn  lcpll0  2   BCM_SR_LCPLL0_SATA_REFN_CLK
-clk_usb_reflcpll0  3   
BCM_SR_LCPLL0_USB_REF_CLK
-sata_refpn lcpll0  3   BCM_SR_LCPLL0_SATA_REFPN_CLK
+clk_sata_350   lcpll0  3   BCM_SR_LCPLL0_SATA_350_CLK
+clk_sata_500   lcpll0  4   BCM_SR_LCPLL0_SATA_500_CLK
 
 lcpll1 crystal 0   BCM_SR_LCPLL1
-wanlcpll1  1   BCM_SR_LCPLL0_WAN_CLK
+clk_wanlcpll1  1   BCM_SR_LCPLL1_WAN_CLK
+clk_usb_reflcpll1  2   
BCM_SR_LCPLL1_USB_REF_CLK
+clk_crmu_tslcpll1  3   
BCM_SR_LCPLL1_CRMU_TS_CLK
 
 lcpll_pcie crystal 0   BCM_SR_LCPLL_PCIE
-pcie_phy_ref   lcpll1  1   BCM_SR_LCPLL_PCIE_PHY_REF_CLK
+clk_pcie_phy_ref   lcpll1  1   BCM_SR_LCPLL_PCIE_PHY_REF_CLK
diff --git a/include/dt-bindings/clock/bcm-sr.h 
b/include/dt-bindings/clock/bcm-sr.h
index cff6c6f..419011b 100644
--- a/include/dt-bindings/clock/bcm-sr.h
+++ b/include/dt-bindings/clock/bcm-sr.h
@@ -35,7 +35,7 @@
 
 /* GENPLL 0 clock channel ID SCR HSLS FS PCIE */
 #define BCM_SR_GENPLL0 0
-#define BCM_SR_GENPLL0_SATA_CLK1
+#define BCM_SR_GENPLL0_125M_CLK1
 #define BCM_SR_GENPLL0_SCR_CLK 2
 #define BCM_SR_GENPLL0_250M_CLK3
 #define BCM_SR_GENPLL0_PCIE_AXI_CLK4
@@ -50,9 +50,11 @@
 /* GENPLL 2 clock channel ID NITRO MHB*/
 #define BCM_SR_GENPLL2 0
 #define BCM_SR_GENPLL2_NIC_CLK 1
-#define BCM_SR_GENPLL2_250_NITRO_CLK   2
+#define BCM_SR_GENPLL2_TS_500_CLK  2
 #define BCM_SR_GENPLL2_125_NITRO_CLK   3
 #define BCM_SR_GENPLL2_CHIMP_CLK   4
+#define BCM_SR_GENPLL2_NIC_FLASH_CLK   5
+#define BCM_SR_GENPLL2_FS4_CLK 6
 
 /* GENPLL 3 HSLS clock channel ID */
 #define BCM_SR_GENPLL3 

[PATCH v3 0/3] Update Broadcom Stingray clock entries

2018-06-01 Thread Ray Jui
This patch series updates Broadcom Stingray clock entries so they match the
latest ASIC datasheet

This patch series is based off v4.17-rc5 and is available on GIHUB:
repo: https://github.com/Broadcom/arm64-linux.git
branch: sr-clk-v3

Changes since v2:
 - Move dt-binding header change to the same patch with the binding doc
update

Changes since v1:
 - Fix patch author to Pramod Kumar on all 3 patches
 - Fix patch subject spelling error on patch 2/3

Pramod Kumar (3):
  dt-bindings: clk: Update Stingray binding doc
  clk: bcm: Update and add Stingray clock entries
  arm64: dts: Update Stingray clock DT nodes

 .../bindings/clock/brcm,iproc-clocks.txt   |  26 ++--
 .../boot/dts/broadcom/stingray/stingray-clock.dtsi |  26 ++--
 drivers/clk/bcm/clk-sr.c   | 135 ++---
 include/dt-bindings/clock/bcm-sr.h |  24 ++--
 4 files changed, 170 insertions(+), 41 deletions(-)

-- 
2.1.4



[PATCH v3 3/3] arm64: dts: Update Stingray clock DT nodes

2018-06-01 Thread Ray Jui
From: Pramod Kumar 

Update clock output names in the Stingray clock DT nodes so they match
the binding document and the latest ASIC datasheet. Also add entries
for LCPLL2

Signed-off-by: Pramod Kumar 
Signed-off-by: Ray Jui 
---
 .../boot/dts/broadcom/stingray/stingray-clock.dtsi | 26 --
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/boot/dts/broadcom/stingray/stingray-clock.dtsi 
b/arch/arm64/boot/dts/broadcom/stingray/stingray-clock.dtsi
index 3a4d452..10a106a 100644
--- a/arch/arm64/boot/dts/broadcom/stingray/stingray-clock.dtsi
+++ b/arch/arm64/boot/dts/broadcom/stingray/stingray-clock.dtsi
@@ -52,12 +52,24 @@
reg = <0x0001d104 0x32>,
  <0x0001c854 0x4>;
clocks = <>;
-   clock-output-names = "genpll0", "clk_125", "clk_scr",
+   clock-output-names = "genpll0", "clk_125m", "clk_scr",
 "clk_250", "clk_pcie_axi",
 "clk_paxc_axi_x2",
 "clk_paxc_axi";
};
 
+   genpll2: genpll2@1d1ac {
+   #clock-cells = <1>;
+   compatible = "brcm,sr-genpll2";
+   reg = <0x0001d1ac 0x32>,
+ <0x0001c854 0x4>;
+   clocks = <>;
+   clock-output-names = "genpll2", "clk_nic",
+"clk_ts_500_ref", "clk_125_nitro",
+"clk_chimp", "clk_nic_flash",
+"clk_fs";
+   };
+
genpll3: genpll3@1d1e0 {
#clock-cells = <1>;
compatible = "brcm,sr-genpll3";
@@ -75,8 +87,8 @@
  <0x0001c854 0x4>;
clocks = <>;
clock-output-names = "genpll4", "clk_ccn",
-"clk_tpiu_pll", "noc_clk",
-"pll_chclk_fs4",
+"clk_tpiu_pll", "clk_noc",
+"clk_chclk_fs4",
 "clk_bridge_fscpu";
};
 
@@ -86,8 +98,8 @@
reg = <0x0001d248 0x32>,
  <0x0001c870 0x4>;
clocks = <>;
-   clock-output-names = "genpll5", "fs4_hf_clk",
-"crypto_ae_clk", "raid_ae_clk";
+   clock-output-names = "genpll5", "clk_fs4_hf",
+"clk_crypto_ae", "clk_raid_ae";
};
 
lcpll0: lcpll0@1d0c4 {
@@ -107,9 +119,9 @@
reg = <0x0001d138 0x3c>,
  <0x0001c870 0x4>;
clocks = <>;
-   clock-output-names = "lcpll1", "clk_wanpn",
+   clock-output-names = "lcpll1", "clk_wan",
 "clk_usb_ref",
-"timesync_evt_clk";
+"clk_crmu_ts";
};
 
hsls_clk: hsls_clk {
-- 
2.1.4



[PATCH v3 2/3] clk: bcm: Update and add Stingray clock entries

2018-06-01 Thread Ray Jui
From: Pramod Kumar 

Update and add Stingray clock definitions and tables so they match the
binding document and the latest ASIC datasheet

Signed-off-by: Pramod Kumar 
Signed-off-by: Ray Jui 
---
 drivers/clk/bcm/clk-sr.c | 135 +--
 1 file changed, 120 insertions(+), 15 deletions(-)

diff --git a/drivers/clk/bcm/clk-sr.c b/drivers/clk/bcm/clk-sr.c
index adc74f4..7b9efc0 100644
--- a/drivers/clk/bcm/clk-sr.c
+++ b/drivers/clk/bcm/clk-sr.c
@@ -56,8 +56,8 @@ static const struct iproc_pll_ctrl sr_genpll0 = {
 };
 
 static const struct iproc_clk_ctrl sr_genpll0_clk[] = {
-   [BCM_SR_GENPLL0_SATA_CLK] = {
-   .channel = BCM_SR_GENPLL0_SATA_CLK,
+   [BCM_SR_GENPLL0_125M_CLK] = {
+   .channel = BCM_SR_GENPLL0_125M_CLK,
.flags = IPROC_CLK_AON,
.enable = ENABLE_VAL(0x4, 6, 0, 12),
.mdiv = REG_VAL(0x18, 0, 9),
@@ -102,6 +102,65 @@ static int sr_genpll0_clk_init(struct platform_device 
*pdev)
return 0;
 }
 
+static const struct iproc_pll_ctrl sr_genpll2 = {
+   .flags = IPROC_CLK_AON | IPROC_CLK_PLL_HAS_NDIV_FRAC |
+   IPROC_CLK_PLL_NEEDS_SW_CFG,
+   .aon = AON_VAL(0x0, 1, 13, 12),
+   .reset = RESET_VAL(0x0, 12, 11),
+   .dig_filter = DF_VAL(0x0, 4, 3, 0, 4, 7, 3),
+   .sw_ctrl = SW_CTRL_VAL(0x10, 31),
+   .ndiv_int = REG_VAL(0x10, 20, 10),
+   .ndiv_frac = REG_VAL(0x10, 0, 20),
+   .pdiv = REG_VAL(0x14, 0, 4),
+   .status = REG_VAL(0x30, 12, 1),
+};
+
+static const struct iproc_clk_ctrl sr_genpll2_clk[] = {
+   [BCM_SR_GENPLL2_NIC_CLK] = {
+   .channel = BCM_SR_GENPLL2_NIC_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 6, 0, 12),
+   .mdiv = REG_VAL(0x18, 0, 9),
+   },
+   [BCM_SR_GENPLL2_TS_500_CLK] = {
+   .channel = BCM_SR_GENPLL2_TS_500_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 7, 1, 13),
+   .mdiv = REG_VAL(0x18, 10, 9),
+   },
+   [BCM_SR_GENPLL2_125_NITRO_CLK] = {
+   .channel = BCM_SR_GENPLL2_125_NITRO_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 8, 2, 14),
+   .mdiv = REG_VAL(0x18, 20, 9),
+   },
+   [BCM_SR_GENPLL2_CHIMP_CLK] = {
+   .channel = BCM_SR_GENPLL2_CHIMP_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 9, 3, 15),
+   .mdiv = REG_VAL(0x1c, 0, 9),
+   },
+   [BCM_SR_GENPLL2_NIC_FLASH_CLK] = {
+   .channel = BCM_SR_GENPLL2_NIC_FLASH_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 10, 4, 16),
+   .mdiv = REG_VAL(0x1c, 10, 9),
+   },
+   [BCM_SR_GENPLL2_FS4_CLK] = {
+   .channel = BCM_SR_GENPLL2_FS4_CLK,
+   .enable = ENABLE_VAL(0x4, 11, 5, 17),
+   .mdiv = REG_VAL(0x1c, 20, 9),
+   },
+};
+
+static int sr_genpll2_clk_init(struct platform_device *pdev)
+{
+   iproc_pll_clk_setup(pdev->dev.of_node,
+   _genpll2, NULL, 0, sr_genpll2_clk,
+   ARRAY_SIZE(sr_genpll2_clk));
+   return 0;
+}
+
 static const struct iproc_pll_ctrl sr_genpll3 = {
.flags = IPROC_CLK_AON | IPROC_CLK_PLL_HAS_NDIV_FRAC |
IPROC_CLK_PLL_NEEDS_SW_CFG,
@@ -157,6 +216,30 @@ static const struct iproc_clk_ctrl sr_genpll4_clk[] = {
.enable = ENABLE_VAL(0x4, 6, 0, 12),
.mdiv = REG_VAL(0x18, 0, 9),
},
+   [BCM_SR_GENPLL4_TPIU_PLL_CLK] = {
+   .channel = BCM_SR_GENPLL4_TPIU_PLL_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 7, 1, 13),
+   .mdiv = REG_VAL(0x18, 10, 9),
+   },
+   [BCM_SR_GENPLL4_NOC_CLK] = {
+   .channel = BCM_SR_GENPLL4_NOC_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 8, 2, 14),
+   .mdiv = REG_VAL(0x18, 20, 9),
+   },
+   [BCM_SR_GENPLL4_CHCLK_FS4_CLK] = {
+   .channel = BCM_SR_GENPLL4_CHCLK_FS4_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 9, 3, 15),
+   .mdiv = REG_VAL(0x1c, 0, 9),
+   },
+   [BCM_SR_GENPLL4_BRIDGE_FSCPU_CLK] = {
+   .channel = BCM_SR_GENPLL4_BRIDGE_FSCPU_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 10, 4, 16),
+   .mdiv = REG_VAL(0x1c, 10, 9),
+   },
 };
 
 static int sr_genpll4_clk_init(struct platform_device *pdev)
@@ -181,18 +264,21 @@ static const struct iproc_pll_ctrl sr_genpll5 = {
 };
 
 static const struct iproc_clk_ctrl sr_genpll5_clk[] = {
-   [BCM_SR_GENPLL5_FS_CLK] = {
-   .channel = BCM_SR_GENPLL5_FS_CLK,
-   .flags = IPROC_CLK_AON,
+   [BCM_SR_GENPLL5_FS4_HF_CLK] = {
+

[PATCH v3 0/3] Update Broadcom Stingray clock entries

2018-06-01 Thread Ray Jui
This patch series updates Broadcom Stingray clock entries so they match the
latest ASIC datasheet

This patch series is based off v4.17-rc5 and is available on GIHUB:
repo: https://github.com/Broadcom/arm64-linux.git
branch: sr-clk-v3

Changes since v2:
 - Move dt-binding header change to the same patch with the binding doc
update

Changes since v1:
 - Fix patch author to Pramod Kumar on all 3 patches
 - Fix patch subject spelling error on patch 2/3

Pramod Kumar (3):
  dt-bindings: clk: Update Stingray binding doc
  clk: bcm: Update and add Stingray clock entries
  arm64: dts: Update Stingray clock DT nodes

 .../bindings/clock/brcm,iproc-clocks.txt   |  26 ++--
 .../boot/dts/broadcom/stingray/stingray-clock.dtsi |  26 ++--
 drivers/clk/bcm/clk-sr.c   | 135 ++---
 include/dt-bindings/clock/bcm-sr.h |  24 ++--
 4 files changed, 170 insertions(+), 41 deletions(-)

-- 
2.1.4



[PATCH v3 3/3] arm64: dts: Update Stingray clock DT nodes

2018-06-01 Thread Ray Jui
From: Pramod Kumar 

Update clock output names in the Stingray clock DT nodes so they match
the binding document and the latest ASIC datasheet. Also add entries
for LCPLL2

Signed-off-by: Pramod Kumar 
Signed-off-by: Ray Jui 
---
 .../boot/dts/broadcom/stingray/stingray-clock.dtsi | 26 --
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/boot/dts/broadcom/stingray/stingray-clock.dtsi 
b/arch/arm64/boot/dts/broadcom/stingray/stingray-clock.dtsi
index 3a4d452..10a106a 100644
--- a/arch/arm64/boot/dts/broadcom/stingray/stingray-clock.dtsi
+++ b/arch/arm64/boot/dts/broadcom/stingray/stingray-clock.dtsi
@@ -52,12 +52,24 @@
reg = <0x0001d104 0x32>,
  <0x0001c854 0x4>;
clocks = <>;
-   clock-output-names = "genpll0", "clk_125", "clk_scr",
+   clock-output-names = "genpll0", "clk_125m", "clk_scr",
 "clk_250", "clk_pcie_axi",
 "clk_paxc_axi_x2",
 "clk_paxc_axi";
};
 
+   genpll2: genpll2@1d1ac {
+   #clock-cells = <1>;
+   compatible = "brcm,sr-genpll2";
+   reg = <0x0001d1ac 0x32>,
+ <0x0001c854 0x4>;
+   clocks = <>;
+   clock-output-names = "genpll2", "clk_nic",
+"clk_ts_500_ref", "clk_125_nitro",
+"clk_chimp", "clk_nic_flash",
+"clk_fs";
+   };
+
genpll3: genpll3@1d1e0 {
#clock-cells = <1>;
compatible = "brcm,sr-genpll3";
@@ -75,8 +87,8 @@
  <0x0001c854 0x4>;
clocks = <>;
clock-output-names = "genpll4", "clk_ccn",
-"clk_tpiu_pll", "noc_clk",
-"pll_chclk_fs4",
+"clk_tpiu_pll", "clk_noc",
+"clk_chclk_fs4",
 "clk_bridge_fscpu";
};
 
@@ -86,8 +98,8 @@
reg = <0x0001d248 0x32>,
  <0x0001c870 0x4>;
clocks = <>;
-   clock-output-names = "genpll5", "fs4_hf_clk",
-"crypto_ae_clk", "raid_ae_clk";
+   clock-output-names = "genpll5", "clk_fs4_hf",
+"clk_crypto_ae", "clk_raid_ae";
};
 
lcpll0: lcpll0@1d0c4 {
@@ -107,9 +119,9 @@
reg = <0x0001d138 0x3c>,
  <0x0001c870 0x4>;
clocks = <>;
-   clock-output-names = "lcpll1", "clk_wanpn",
+   clock-output-names = "lcpll1", "clk_wan",
 "clk_usb_ref",
-"timesync_evt_clk";
+"clk_crmu_ts";
};
 
hsls_clk: hsls_clk {
-- 
2.1.4



[PATCH v3 2/3] clk: bcm: Update and add Stingray clock entries

2018-06-01 Thread Ray Jui
From: Pramod Kumar 

Update and add Stingray clock definitions and tables so they match the
binding document and the latest ASIC datasheet

Signed-off-by: Pramod Kumar 
Signed-off-by: Ray Jui 
---
 drivers/clk/bcm/clk-sr.c | 135 +--
 1 file changed, 120 insertions(+), 15 deletions(-)

diff --git a/drivers/clk/bcm/clk-sr.c b/drivers/clk/bcm/clk-sr.c
index adc74f4..7b9efc0 100644
--- a/drivers/clk/bcm/clk-sr.c
+++ b/drivers/clk/bcm/clk-sr.c
@@ -56,8 +56,8 @@ static const struct iproc_pll_ctrl sr_genpll0 = {
 };
 
 static const struct iproc_clk_ctrl sr_genpll0_clk[] = {
-   [BCM_SR_GENPLL0_SATA_CLK] = {
-   .channel = BCM_SR_GENPLL0_SATA_CLK,
+   [BCM_SR_GENPLL0_125M_CLK] = {
+   .channel = BCM_SR_GENPLL0_125M_CLK,
.flags = IPROC_CLK_AON,
.enable = ENABLE_VAL(0x4, 6, 0, 12),
.mdiv = REG_VAL(0x18, 0, 9),
@@ -102,6 +102,65 @@ static int sr_genpll0_clk_init(struct platform_device 
*pdev)
return 0;
 }
 
+static const struct iproc_pll_ctrl sr_genpll2 = {
+   .flags = IPROC_CLK_AON | IPROC_CLK_PLL_HAS_NDIV_FRAC |
+   IPROC_CLK_PLL_NEEDS_SW_CFG,
+   .aon = AON_VAL(0x0, 1, 13, 12),
+   .reset = RESET_VAL(0x0, 12, 11),
+   .dig_filter = DF_VAL(0x0, 4, 3, 0, 4, 7, 3),
+   .sw_ctrl = SW_CTRL_VAL(0x10, 31),
+   .ndiv_int = REG_VAL(0x10, 20, 10),
+   .ndiv_frac = REG_VAL(0x10, 0, 20),
+   .pdiv = REG_VAL(0x14, 0, 4),
+   .status = REG_VAL(0x30, 12, 1),
+};
+
+static const struct iproc_clk_ctrl sr_genpll2_clk[] = {
+   [BCM_SR_GENPLL2_NIC_CLK] = {
+   .channel = BCM_SR_GENPLL2_NIC_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 6, 0, 12),
+   .mdiv = REG_VAL(0x18, 0, 9),
+   },
+   [BCM_SR_GENPLL2_TS_500_CLK] = {
+   .channel = BCM_SR_GENPLL2_TS_500_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 7, 1, 13),
+   .mdiv = REG_VAL(0x18, 10, 9),
+   },
+   [BCM_SR_GENPLL2_125_NITRO_CLK] = {
+   .channel = BCM_SR_GENPLL2_125_NITRO_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 8, 2, 14),
+   .mdiv = REG_VAL(0x18, 20, 9),
+   },
+   [BCM_SR_GENPLL2_CHIMP_CLK] = {
+   .channel = BCM_SR_GENPLL2_CHIMP_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 9, 3, 15),
+   .mdiv = REG_VAL(0x1c, 0, 9),
+   },
+   [BCM_SR_GENPLL2_NIC_FLASH_CLK] = {
+   .channel = BCM_SR_GENPLL2_NIC_FLASH_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 10, 4, 16),
+   .mdiv = REG_VAL(0x1c, 10, 9),
+   },
+   [BCM_SR_GENPLL2_FS4_CLK] = {
+   .channel = BCM_SR_GENPLL2_FS4_CLK,
+   .enable = ENABLE_VAL(0x4, 11, 5, 17),
+   .mdiv = REG_VAL(0x1c, 20, 9),
+   },
+};
+
+static int sr_genpll2_clk_init(struct platform_device *pdev)
+{
+   iproc_pll_clk_setup(pdev->dev.of_node,
+   _genpll2, NULL, 0, sr_genpll2_clk,
+   ARRAY_SIZE(sr_genpll2_clk));
+   return 0;
+}
+
 static const struct iproc_pll_ctrl sr_genpll3 = {
.flags = IPROC_CLK_AON | IPROC_CLK_PLL_HAS_NDIV_FRAC |
IPROC_CLK_PLL_NEEDS_SW_CFG,
@@ -157,6 +216,30 @@ static const struct iproc_clk_ctrl sr_genpll4_clk[] = {
.enable = ENABLE_VAL(0x4, 6, 0, 12),
.mdiv = REG_VAL(0x18, 0, 9),
},
+   [BCM_SR_GENPLL4_TPIU_PLL_CLK] = {
+   .channel = BCM_SR_GENPLL4_TPIU_PLL_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 7, 1, 13),
+   .mdiv = REG_VAL(0x18, 10, 9),
+   },
+   [BCM_SR_GENPLL4_NOC_CLK] = {
+   .channel = BCM_SR_GENPLL4_NOC_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 8, 2, 14),
+   .mdiv = REG_VAL(0x18, 20, 9),
+   },
+   [BCM_SR_GENPLL4_CHCLK_FS4_CLK] = {
+   .channel = BCM_SR_GENPLL4_CHCLK_FS4_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 9, 3, 15),
+   .mdiv = REG_VAL(0x1c, 0, 9),
+   },
+   [BCM_SR_GENPLL4_BRIDGE_FSCPU_CLK] = {
+   .channel = BCM_SR_GENPLL4_BRIDGE_FSCPU_CLK,
+   .flags = IPROC_CLK_AON,
+   .enable = ENABLE_VAL(0x4, 10, 4, 16),
+   .mdiv = REG_VAL(0x1c, 10, 9),
+   },
 };
 
 static int sr_genpll4_clk_init(struct platform_device *pdev)
@@ -181,18 +264,21 @@ static const struct iproc_pll_ctrl sr_genpll5 = {
 };
 
 static const struct iproc_clk_ctrl sr_genpll5_clk[] = {
-   [BCM_SR_GENPLL5_FS_CLK] = {
-   .channel = BCM_SR_GENPLL5_FS_CLK,
-   .flags = IPROC_CLK_AON,
+   [BCM_SR_GENPLL5_FS4_HF_CLK] = {
+

Re: [PATCH 2/3] clk: bcm: Update and add tingray clock entries

2018-06-01 Thread Ray Jui




On 6/1/2018 12:02 PM, Rob Herring wrote:

On Fri, Jun 1, 2018 at 12:56 PM, Ray Jui  wrote:

Hi Rob,

On 5/31/2018 9:25 AM, Rob Herring wrote:


On Fri, May 25, 2018 at 09:45:16AM -0700, Ray Jui wrote:


Update and add Stingray clock definitions and tables so they match the
binding document and the latest ASIC datasheet

Signed-off-by: Pramod Kumar 
Signed-off-by: Ray Jui 
---
   drivers/clk/bcm/clk-sr.c   | 135
-
   include/dt-bindings/clock/bcm-sr.h |  24 +--



This goes in the 1st patch.



Please help to confirm. You want 1st patch and 2nd patch to be combined into
a single patch?


No. include/dt-bindings/* is part of the DT binding, so it goes with
patch 1. The driver in patch 2.

Rob



Okay got it. Thanks!


Re: [PATCH 2/3] clk: bcm: Update and add tingray clock entries

2018-06-01 Thread Ray Jui




On 6/1/2018 12:02 PM, Rob Herring wrote:

On Fri, Jun 1, 2018 at 12:56 PM, Ray Jui  wrote:

Hi Rob,

On 5/31/2018 9:25 AM, Rob Herring wrote:


On Fri, May 25, 2018 at 09:45:16AM -0700, Ray Jui wrote:


Update and add Stingray clock definitions and tables so they match the
binding document and the latest ASIC datasheet

Signed-off-by: Pramod Kumar 
Signed-off-by: Ray Jui 
---
   drivers/clk/bcm/clk-sr.c   | 135
-
   include/dt-bindings/clock/bcm-sr.h |  24 +--



This goes in the 1st patch.



Please help to confirm. You want 1st patch and 2nd patch to be combined into
a single patch?


No. include/dt-bindings/* is part of the DT binding, so it goes with
patch 1. The driver in patch 2.

Rob



Okay got it. Thanks!


Re: [PATCH] KVM: VMX: Optimize tscdeadline timer latency

2018-06-01 Thread Wanpeng Li
On Wed, 30 May 2018 at 01:08, Paolo Bonzini  wrote:
>
> On 29/05/2018 16:31, Radim Krčmář wrote:
> > 2018-05-29 16:23+0200, Radim Krčmář:
> >> 2018-05-29 14:53+0800, Wanpeng Li:
> >>> From: Wanpeng Li 
> >>>
> >>> 'Commit d0659d946be0 ("KVM: x86: add option to advance tscdeadline
> >>> hrtimer expiration")' advances the tscdeadline (the timer is emulated
> >>> by hrtimer) expiration in order that the latency which is incurred
> >>> by hypervisor (apic_timer_fn -> vmentry) can be avoided. This patch
> >>> adds the advance tscdeadline expiration support to which the tscdeadline
> >>> timer is emulated by VMX preemption timer to reduce the hypervisor
> >>> lantency (handle_preemption_timer -> vmentry). clockevents infrastruture
> >>> can program minimum delay if hrtimer feeds a expiration in the past,
> >>> we set delta_tsc to 1(which will be converted to 0 before vmentry)
> >>> which can lead to an immediately vmexit when delta_tsc is not bigger
> >>> than advance ns.
> >>>
> >>> This patch can reduce ~63% latency (~4450 cycles to ~1660 cycles on
> >>> a haswell desktop) for kvm-unit-tests/tscdeadline_latency when testing
> >>> busy waits.
> >>>
> >>> Cc: Paolo Bonzini 
> >>> Cc: Radim Krčmář 
> >>> Signed-off-by: Wanpeng Li 
> >>> ---
> >>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >>> @@ -12444,6 +12444,12 @@ static int vmx_set_hv_timer(struct kvm_vcpu 
> >>> *vcpu, u64 guest_deadline_tsc)
> >>> tscl = rdtsc();
> >>> guest_tscl = kvm_read_l1_tsc(vcpu, tscl);
> >>> delta_tsc = max(guest_deadline_tsc, guest_tscl) - guest_tscl;
> >>> +   lapic_timer_advance_cycles = nsec_to_cycles(vcpu, 
> >>> lapic_timer_advance_ns);
> >>> +   if (delta_tsc > lapic_timer_advance_cycles)
> >>> +   delta_tsc -= lapic_timer_advance_cycles;
> >>> +   else
> >>> +   delta_tsc = 1;
> >>
> >> Why don't we just "return 1" to say that the timer has expired?
> >
> > This case might be rare, so setting delta_tsc = 0 would be safer.
>
> Queued with this change.  Indeed this case matches vmx_arm_hv_timer so

The patch description should also be updated in kvm/queue I think.

Regards,
Wanpeng Li


Re: [PATCH] KVM: VMX: Optimize tscdeadline timer latency

2018-06-01 Thread Wanpeng Li
On Wed, 30 May 2018 at 01:08, Paolo Bonzini  wrote:
>
> On 29/05/2018 16:31, Radim Krčmář wrote:
> > 2018-05-29 16:23+0200, Radim Krčmář:
> >> 2018-05-29 14:53+0800, Wanpeng Li:
> >>> From: Wanpeng Li 
> >>>
> >>> 'Commit d0659d946be0 ("KVM: x86: add option to advance tscdeadline
> >>> hrtimer expiration")' advances the tscdeadline (the timer is emulated
> >>> by hrtimer) expiration in order that the latency which is incurred
> >>> by hypervisor (apic_timer_fn -> vmentry) can be avoided. This patch
> >>> adds the advance tscdeadline expiration support to which the tscdeadline
> >>> timer is emulated by VMX preemption timer to reduce the hypervisor
> >>> lantency (handle_preemption_timer -> vmentry). clockevents infrastruture
> >>> can program minimum delay if hrtimer feeds a expiration in the past,
> >>> we set delta_tsc to 1(which will be converted to 0 before vmentry)
> >>> which can lead to an immediately vmexit when delta_tsc is not bigger
> >>> than advance ns.
> >>>
> >>> This patch can reduce ~63% latency (~4450 cycles to ~1660 cycles on
> >>> a haswell desktop) for kvm-unit-tests/tscdeadline_latency when testing
> >>> busy waits.
> >>>
> >>> Cc: Paolo Bonzini 
> >>> Cc: Radim Krčmář 
> >>> Signed-off-by: Wanpeng Li 
> >>> ---
> >>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >>> @@ -12444,6 +12444,12 @@ static int vmx_set_hv_timer(struct kvm_vcpu 
> >>> *vcpu, u64 guest_deadline_tsc)
> >>> tscl = rdtsc();
> >>> guest_tscl = kvm_read_l1_tsc(vcpu, tscl);
> >>> delta_tsc = max(guest_deadline_tsc, guest_tscl) - guest_tscl;
> >>> +   lapic_timer_advance_cycles = nsec_to_cycles(vcpu, 
> >>> lapic_timer_advance_ns);
> >>> +   if (delta_tsc > lapic_timer_advance_cycles)
> >>> +   delta_tsc -= lapic_timer_advance_cycles;
> >>> +   else
> >>> +   delta_tsc = 1;
> >>
> >> Why don't we just "return 1" to say that the timer has expired?
> >
> > This case might be rare, so setting delta_tsc = 0 would be safer.
>
> Queued with this change.  Indeed this case matches vmx_arm_hv_timer so

The patch description should also be updated in kvm/queue I think.

Regards,
Wanpeng Li


[PATCH v2] of: platform: stop accessing invalid dev in of_platform_device_destroy

2018-06-01 Thread Srinivas Kandagatla
Immediately after the platform_device_unregister() the device will be cleaned 
up.
Accessing the freed pointer immediately after that will crash the system.

Found this bug when kernel is built with CONFIG_PAGE_POISONING and testing
loading/unloading audio drivers in a loop on Qcom platforms.

Fix this by removing accessing the dev pointer.
Below is the carsh trace:

Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c03
Mem abort info:
  ESR = 0x9621
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x0021
  CM = 0, WnR = 0
[006b6b6b6b6b6c03] address between user and kernel address ranges
Internal error: Oops: 9621 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 1784 Comm: sh Tainted: GW 
4.17.0-rc7-02230-ge3a63a7ef641-dirty #204
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
pstate: 8005 (Nzcv daif -PAN -UAO)
pc : clear_bit+0x18/0x2c
lr : of_platform_device_destroy+0x64/0xb8
sp : 0c9c3930
x29: 0c9c3930 x28: 80003d39b200
x27: 08bb1000 x26: 0040
x25: 0124 x24: 80003a9a3080
x23: 0060 x22: 0939f518
x21: 80003aa79e98 x20: 80003aa3dae0
x19: 80003aa3c890 x18: 89feb794
x17:  x16: 
x15: 89feb790 x14: 
x13: 80003a058778 x12: 80003a058728
x11: 80003a058750 x10: 
x9 : 0006 x8 : 80003a825988
x7 :  x6 : 0001
x5 :  x4 : 0001
x3 : 0008 x2 : 0001
x1 : 6b6b6b6b6b6b6c03 x0 : 
Process sh (pid: 1784, stack limit = 0x(ptrval))
Call trace:
 clear_bit+0x18/0x2c
 q6afe_remove+0x20/0x38
 apr_device_remove+0x30/0x70
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 device_unregister+0x1c/0x70
 apr_remove_device+0xc/0x18
 device_for_each_child+0x50/0x80
 apr_remove+0x18/0x20
 rpmsg_dev_remove+0x38/0x68
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 device_unregister+0x1c/0x70
 qcom_smd_remove_device+0xc/0x18
 device_for_each_child+0x50/0x80
 qcom_smd_unregister_edge+0x3c/0x70
 smd_subdev_remove+0x18/0x28
 rproc_stop+0x48/0xd8
 rproc_shutdown+0x60/0xe8
 state_store+0xbc/0xf8
 dev_attr_store+0x18/0x28
 sysfs_kf_write+0x3c/0x50
 kernfs_fop_write+0x118/0x1e0
 __vfs_write+0x18/0x110
 vfs_write+0xa4/0x1a8
 ksys_write+0x48/0xb0
 sys_write+0xc/0x18
 el0_svc_naked+0x30/0x34
Code: d2800022 8b400c21 f9800031 9ac32043 (c85f7c22)

Signed-off-by: Srinivas Kandagatla 
---
Changes since v1:
- fixed issue while reprobing.

 drivers/of/platform.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index c00d81dfac0b..84c5c899187b 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -529,10 +529,13 @@ arch_initcall_sync(of_platform_default_populate_init);
 
 int of_platform_device_destroy(struct device *dev, void *data)
 {
+   struct device_node *np;
+
/* Do not touch devices not populated from the device tree */
if (!dev->of_node || !of_node_check_flag(dev->of_node, OF_POPULATED))
return 0;
 
+   np = dev->of_node;
/* Recurse for any nodes that were treated as busses */
if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS))
device_for_each_child(dev, NULL, of_platform_device_destroy);
@@ -544,8 +547,8 @@ int of_platform_device_destroy(struct device *dev, void 
*data)
amba_device_unregister(to_amba_device(dev));
 #endif
 
-   of_node_clear_flag(dev->of_node, OF_POPULATED);
-   of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
+   of_node_clear_flag(np, OF_POPULATED);
+   of_node_clear_flag(np, OF_POPULATED_BUS);
return 0;
 }
 EXPORT_SYMBOL_GPL(of_platform_device_destroy);
-- 
2.16.2



[PATCH v2] of: platform: stop accessing invalid dev in of_platform_device_destroy

2018-06-01 Thread Srinivas Kandagatla
Immediately after the platform_device_unregister() the device will be cleaned 
up.
Accessing the freed pointer immediately after that will crash the system.

Found this bug when kernel is built with CONFIG_PAGE_POISONING and testing
loading/unloading audio drivers in a loop on Qcom platforms.

Fix this by removing accessing the dev pointer.
Below is the carsh trace:

Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c03
Mem abort info:
  ESR = 0x9621
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x0021
  CM = 0, WnR = 0
[006b6b6b6b6b6c03] address between user and kernel address ranges
Internal error: Oops: 9621 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 1784 Comm: sh Tainted: GW 
4.17.0-rc7-02230-ge3a63a7ef641-dirty #204
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
pstate: 8005 (Nzcv daif -PAN -UAO)
pc : clear_bit+0x18/0x2c
lr : of_platform_device_destroy+0x64/0xb8
sp : 0c9c3930
x29: 0c9c3930 x28: 80003d39b200
x27: 08bb1000 x26: 0040
x25: 0124 x24: 80003a9a3080
x23: 0060 x22: 0939f518
x21: 80003aa79e98 x20: 80003aa3dae0
x19: 80003aa3c890 x18: 89feb794
x17:  x16: 
x15: 89feb790 x14: 
x13: 80003a058778 x12: 80003a058728
x11: 80003a058750 x10: 
x9 : 0006 x8 : 80003a825988
x7 :  x6 : 0001
x5 :  x4 : 0001
x3 : 0008 x2 : 0001
x1 : 6b6b6b6b6b6b6c03 x0 : 
Process sh (pid: 1784, stack limit = 0x(ptrval))
Call trace:
 clear_bit+0x18/0x2c
 q6afe_remove+0x20/0x38
 apr_device_remove+0x30/0x70
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 device_unregister+0x1c/0x70
 apr_remove_device+0xc/0x18
 device_for_each_child+0x50/0x80
 apr_remove+0x18/0x20
 rpmsg_dev_remove+0x38/0x68
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 device_unregister+0x1c/0x70
 qcom_smd_remove_device+0xc/0x18
 device_for_each_child+0x50/0x80
 qcom_smd_unregister_edge+0x3c/0x70
 smd_subdev_remove+0x18/0x28
 rproc_stop+0x48/0xd8
 rproc_shutdown+0x60/0xe8
 state_store+0xbc/0xf8
 dev_attr_store+0x18/0x28
 sysfs_kf_write+0x3c/0x50
 kernfs_fop_write+0x118/0x1e0
 __vfs_write+0x18/0x110
 vfs_write+0xa4/0x1a8
 ksys_write+0x48/0xb0
 sys_write+0xc/0x18
 el0_svc_naked+0x30/0x34
Code: d2800022 8b400c21 f9800031 9ac32043 (c85f7c22)

Signed-off-by: Srinivas Kandagatla 
---
Changes since v1:
- fixed issue while reprobing.

 drivers/of/platform.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index c00d81dfac0b..84c5c899187b 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -529,10 +529,13 @@ arch_initcall_sync(of_platform_default_populate_init);
 
 int of_platform_device_destroy(struct device *dev, void *data)
 {
+   struct device_node *np;
+
/* Do not touch devices not populated from the device tree */
if (!dev->of_node || !of_node_check_flag(dev->of_node, OF_POPULATED))
return 0;
 
+   np = dev->of_node;
/* Recurse for any nodes that were treated as busses */
if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS))
device_for_each_child(dev, NULL, of_platform_device_destroy);
@@ -544,8 +547,8 @@ int of_platform_device_destroy(struct device *dev, void 
*data)
amba_device_unregister(to_amba_device(dev));
 #endif
 
-   of_node_clear_flag(dev->of_node, OF_POPULATED);
-   of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
+   of_node_clear_flag(np, OF_POPULATED);
+   of_node_clear_flag(np, OF_POPULATED_BUS);
return 0;
 }
 EXPORT_SYMBOL_GPL(of_platform_device_destroy);
-- 
2.16.2



Re: [PATCH -tip v4 24/27] bpf: error-inject: kprobes: Clear current_kprobe and enable preempt in kprobe

2018-06-01 Thread Masami Hiramatsu
On Thu, 31 May 2018 16:25:38 +0530
"Naveen N. Rao"  wrote:

> Masami Hiramatsu wrote:
> > Clear current_kprobe and enable preemption in kprobe
> > even if pre_handler returns !0.
> > 
> > This simplifies function override using kprobes.
> > 
> > Jprobe used to require to keep the preemption disabled and
> > keep current_kprobe until it returned to original function
> > entry. For this reason kprobe_int3_handler() and similar
> > arch dependent kprobe handers checks pre_handler result
> > and exit without enabling preemption if the result is !0.
> > 
> > After removing the jprobe, Kprobes does not need to
> > keep preempt disabled even if user handler returns !0
> > anymore.
> 
> I think the reason jprobes did it that way is to address architecture 
> specific requirements when changing a function. So, without that 
> infrastructure, I am not sure if we will be able to claim support for 
> over-riding functions with kprobes. I am not sure if we want to claim 
> that, but this is something we need to be clear on.

Really? as far as I can see, there seems no such architecture.
The keeping preempt disabled is corresponding to keeping current_kprobe
since the current_kprobe is per-cpu. This means if it is preempted
before hitting break_handler and changed cpu core, we missed to
handle current_kprobe and goes to panic. But if we don't need
such "break back" (removing break_handler), we don't need to
keep current_kprobe (because it is not handled afterwards).

Anyway, changing function execution path is a "one-way" change.
We don't have a chance to fixup that disabled preemption and current_kprobe
after returning to the new function. So current error-inject clears
current_kprobe and enable preemption before returning !0 from its
kprobe pre_handler.

This is just moving such needless operation from user-pre_handler to
kprobes itself. 

> For powerpc, the current function override in error-inject works fine 
> since the new function does nothing. But, if anyone wants to do more 
> work in the replacement function, it won't work with the current 
> approach.

If you are considering about TOC change etc. yes, it depends on
the archtecture. As far as I know IA64 and powerpc will not allow
to support changing execution path without special care.
Other "flat and simple" function call architectures like x86, arm
can change execution path without special care. 

Anyway that is not related to this change. This is just a
cleanup in total, something like re-balancing the operation.

> > But since the function override handler in error-inject
> > and bpf is also returns !0 if it overrides a function,
> > to balancing the preempt count, it enables preemption
> > and reset current kprobe by itself.
> > 
> > That is a bad design that is very buggy. This fixes
> > such unbalanced preempt-count and current_kprobes setting
> > in kprobes, bpf and error-inject.
> > 
> > Signed-off-by: Masami Hiramatsu 
> > ---
> >  arch/arc/kernel/kprobes.c  |5 +++--
> >  arch/arm/probes/kprobes/core.c |   10 +-
> >  arch/arm64/kernel/probes/kprobes.c |   10 +-
> >  arch/ia64/kernel/kprobes.c |   13 -
> >  arch/mips/kernel/kprobes.c |4 ++--
> >  arch/powerpc/kernel/kprobes.c  |7 +--
> 
> I think you should also update arch/powerpc/kernel/kprobes-ftrace.c

Ah, good catch!! I'll fix that.

Thank you!

-- 
Masami Hiramatsu 


Re: [PATCH -tip v4 24/27] bpf: error-inject: kprobes: Clear current_kprobe and enable preempt in kprobe

2018-06-01 Thread Masami Hiramatsu
On Thu, 31 May 2018 16:25:38 +0530
"Naveen N. Rao"  wrote:

> Masami Hiramatsu wrote:
> > Clear current_kprobe and enable preemption in kprobe
> > even if pre_handler returns !0.
> > 
> > This simplifies function override using kprobes.
> > 
> > Jprobe used to require to keep the preemption disabled and
> > keep current_kprobe until it returned to original function
> > entry. For this reason kprobe_int3_handler() and similar
> > arch dependent kprobe handers checks pre_handler result
> > and exit without enabling preemption if the result is !0.
> > 
> > After removing the jprobe, Kprobes does not need to
> > keep preempt disabled even if user handler returns !0
> > anymore.
> 
> I think the reason jprobes did it that way is to address architecture 
> specific requirements when changing a function. So, without that 
> infrastructure, I am not sure if we will be able to claim support for 
> over-riding functions with kprobes. I am not sure if we want to claim 
> that, but this is something we need to be clear on.

Really? as far as I can see, there seems no such architecture.
The keeping preempt disabled is corresponding to keeping current_kprobe
since the current_kprobe is per-cpu. This means if it is preempted
before hitting break_handler and changed cpu core, we missed to
handle current_kprobe and goes to panic. But if we don't need
such "break back" (removing break_handler), we don't need to
keep current_kprobe (because it is not handled afterwards).

Anyway, changing function execution path is a "one-way" change.
We don't have a chance to fixup that disabled preemption and current_kprobe
after returning to the new function. So current error-inject clears
current_kprobe and enable preemption before returning !0 from its
kprobe pre_handler.

This is just moving such needless operation from user-pre_handler to
kprobes itself. 

> For powerpc, the current function override in error-inject works fine 
> since the new function does nothing. But, if anyone wants to do more 
> work in the replacement function, it won't work with the current 
> approach.

If you are considering about TOC change etc. yes, it depends on
the archtecture. As far as I know IA64 and powerpc will not allow
to support changing execution path without special care.
Other "flat and simple" function call architectures like x86, arm
can change execution path without special care. 

Anyway that is not related to this change. This is just a
cleanup in total, something like re-balancing the operation.

> > But since the function override handler in error-inject
> > and bpf is also returns !0 if it overrides a function,
> > to balancing the preempt count, it enables preemption
> > and reset current kprobe by itself.
> > 
> > That is a bad design that is very buggy. This fixes
> > such unbalanced preempt-count and current_kprobes setting
> > in kprobes, bpf and error-inject.
> > 
> > Signed-off-by: Masami Hiramatsu 
> > ---
> >  arch/arc/kernel/kprobes.c  |5 +++--
> >  arch/arm/probes/kprobes/core.c |   10 +-
> >  arch/arm64/kernel/probes/kprobes.c |   10 +-
> >  arch/ia64/kernel/kprobes.c |   13 -
> >  arch/mips/kernel/kprobes.c |4 ++--
> >  arch/powerpc/kernel/kprobes.c  |7 +--
> 
> I think you should also update arch/powerpc/kernel/kprobes-ftrace.c

Ah, good catch!! I'll fix that.

Thank you!

-- 
Masami Hiramatsu 


Re: [PATCH] of: platform: stop accessing invalid dev in of_platform_device_destroy

2018-06-01 Thread Srinivas Kandagatla




On 01/06/18 23:58, Srinivas Kandagatla wrote:
  
-	of_node_clear_flag(dev->of_node, OF_POPULATED);

-   of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
This change seems to have a side effect during re-probing. I will dig in 
bit more to see how best it can be fixed.


thanks,
srini


[PATCH] rpmsg: smd: do not use mananged resources for endpoints and channels

2018-06-01 Thread Srinivas Kandagatla
All the managed resources would be freed by the time release function
is invoked. Handling such memory in qcom_smd_edge_release() would do
bad things.

Found this issue while testing Audio usecase where the dsp is started up
and shutdown in a loop.

This patch fixes this issue by using simple kzalloc for allocating
channel->name and channel which is then freed in qcom_smd_edge_release().

Without this patch restarting a remoteproc would crash the system.
Fixes: 53e2822e56c7 ("rpmsg: Introduce Qualcomm SMD backend")
Cc: 
Signed-off-by: Srinivas Kandagatla 
---
 drivers/rpmsg/qcom_smd.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c
index 5ce9bf7b897d..49d3838dfc3d 100644
--- a/drivers/rpmsg/qcom_smd.c
+++ b/drivers/rpmsg/qcom_smd.c
@@ -1100,12 +1100,12 @@ static struct qcom_smd_channel 
*qcom_smd_create_channel(struct qcom_smd_edge *ed
void *info;
int ret;
 
-   channel = devm_kzalloc(>dev, sizeof(*channel), GFP_KERNEL);
+   channel = kzalloc(sizeof(*channel), GFP_KERNEL);
if (!channel)
return ERR_PTR(-ENOMEM);
 
channel->edge = edge;
-   channel->name = devm_kstrdup(>dev, name, GFP_KERNEL);
+   channel->name = kstrdup(name, GFP_KERNEL);
if (!channel->name)
return ERR_PTR(-ENOMEM);
 
@@ -1156,8 +1156,8 @@ static struct qcom_smd_channel 
*qcom_smd_create_channel(struct qcom_smd_edge *ed
return channel;
 
 free_name_and_channel:
-   devm_kfree(>dev, channel->name);
-   devm_kfree(>dev, channel);
+   kfree(channel->name);
+   kfree(channel);
 
return ERR_PTR(ret);
 }
@@ -1380,11 +1380,13 @@ static void qcom_smd_edge_release(struct device *dev)
 {
struct qcom_smd_channel *channel;
struct qcom_smd_edge *edge = to_smd_edge(dev);
+   struct list_head *this, *tmp;
 
-   list_for_each_entry(channel, >channels, list) {
-   SET_RX_CHANNEL_INFO(channel, state, SMD_CHANNEL_CLOSED);
-   SET_RX_CHANNEL_INFO(channel, head, 0);
-   SET_RX_CHANNEL_INFO(channel, tail, 0);
+   list_for_each_safe(this, tmp, >channels) {
+   channel = list_entry(this, struct qcom_smd_channel, list);
+   list_del(>list);
+   kfree(channel->name);
+   kfree(channel);
}
 
kfree(edge);
-- 
2.16.2



Re: [PATCH] of: platform: stop accessing invalid dev in of_platform_device_destroy

2018-06-01 Thread Srinivas Kandagatla




On 01/06/18 23:58, Srinivas Kandagatla wrote:
  
-	of_node_clear_flag(dev->of_node, OF_POPULATED);

-   of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
This change seems to have a side effect during re-probing. I will dig in 
bit more to see how best it can be fixed.


thanks,
srini


[PATCH] rpmsg: smd: do not use mananged resources for endpoints and channels

2018-06-01 Thread Srinivas Kandagatla
All the managed resources would be freed by the time release function
is invoked. Handling such memory in qcom_smd_edge_release() would do
bad things.

Found this issue while testing Audio usecase where the dsp is started up
and shutdown in a loop.

This patch fixes this issue by using simple kzalloc for allocating
channel->name and channel which is then freed in qcom_smd_edge_release().

Without this patch restarting a remoteproc would crash the system.
Fixes: 53e2822e56c7 ("rpmsg: Introduce Qualcomm SMD backend")
Cc: 
Signed-off-by: Srinivas Kandagatla 
---
 drivers/rpmsg/qcom_smd.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c
index 5ce9bf7b897d..49d3838dfc3d 100644
--- a/drivers/rpmsg/qcom_smd.c
+++ b/drivers/rpmsg/qcom_smd.c
@@ -1100,12 +1100,12 @@ static struct qcom_smd_channel 
*qcom_smd_create_channel(struct qcom_smd_edge *ed
void *info;
int ret;
 
-   channel = devm_kzalloc(>dev, sizeof(*channel), GFP_KERNEL);
+   channel = kzalloc(sizeof(*channel), GFP_KERNEL);
if (!channel)
return ERR_PTR(-ENOMEM);
 
channel->edge = edge;
-   channel->name = devm_kstrdup(>dev, name, GFP_KERNEL);
+   channel->name = kstrdup(name, GFP_KERNEL);
if (!channel->name)
return ERR_PTR(-ENOMEM);
 
@@ -1156,8 +1156,8 @@ static struct qcom_smd_channel 
*qcom_smd_create_channel(struct qcom_smd_edge *ed
return channel;
 
 free_name_and_channel:
-   devm_kfree(>dev, channel->name);
-   devm_kfree(>dev, channel);
+   kfree(channel->name);
+   kfree(channel);
 
return ERR_PTR(ret);
 }
@@ -1380,11 +1380,13 @@ static void qcom_smd_edge_release(struct device *dev)
 {
struct qcom_smd_channel *channel;
struct qcom_smd_edge *edge = to_smd_edge(dev);
+   struct list_head *this, *tmp;
 
-   list_for_each_entry(channel, >channels, list) {
-   SET_RX_CHANNEL_INFO(channel, state, SMD_CHANNEL_CLOSED);
-   SET_RX_CHANNEL_INFO(channel, head, 0);
-   SET_RX_CHANNEL_INFO(channel, tail, 0);
+   list_for_each_safe(this, tmp, >channels) {
+   channel = list_entry(this, struct qcom_smd_channel, list);
+   list_del(>list);
+   kfree(channel->name);
+   kfree(channel);
}
 
kfree(edge);
-- 
2.16.2



Re: [PATCH] autofs: make autofs4 and autofs mutually exclusive

2018-06-01 Thread Andrew Morton
On Fri, 01 Jun 2018 16:42:36 +0800 Ian Kent  wrote:

> The resulting "autofs-create-autofs-kconfig-and-makefile.patch" should
> now look like

Got it, thanks ;)


Re: [PATCH] autofs: make autofs4 and autofs mutually exclusive

2018-06-01 Thread Andrew Morton
On Fri, 01 Jun 2018 16:42:36 +0800 Ian Kent  wrote:

> The resulting "autofs-create-autofs-kconfig-and-makefile.patch" should
> now look like

Got it, thanks ;)


Re: [PATCH v2]: perf record: enable arbitrary event names thru name= modifier

2018-06-01 Thread Andi Kleen
On Thu, May 31, 2018 at 05:08:12PM +0300, Alexey Budankov wrote:
> 
> Enable complex event names containing [.:=,] symbols to be encoded into Perf 
> trace using name= modifier e.g. like this:
> 
> perf record -e 
> cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',\
>   period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
> 
> Below is how it looks like in the report output. Please note explicit escaped 
> quoting at cmdline string in the header so that thestring can be directly 
> reused
> for another collection in shell:

Patch looks good, but new syntax also needs to be documented in some manpage
(e.g. perf list)

-Andi


[PATCH] of: platform: stop accessing invalid dev in of_platform_device_destroy

2018-06-01 Thread Srinivas Kandagatla
Immediately after the platform_device_unregister() the device will be cleaned 
up.
Accessing the freed pointer immediately after that will crash the system.

Found this bug when kernel is built with CONFIG_PAGE_POISONING and testing
loading/unloading audio drivers in a loop on Qcom platforms.

Fix this by removing accessing the dev pointer.
Below is the carsh trace:

Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c03
Mem abort info:
  ESR = 0x9621
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x0021
  CM = 0, WnR = 0
[006b6b6b6b6b6c03] address between user and kernel address ranges
Internal error: Oops: 9621 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 1784 Comm: sh Tainted: GW 
4.17.0-rc7-02230-ge3a63a7ef641-dirty #204
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
pstate: 8005 (Nzcv daif -PAN -UAO)
pc : clear_bit+0x18/0x2c
lr : of_platform_device_destroy+0x64/0xb8
sp : 0c9c3930
x29: 0c9c3930 x28: 80003d39b200
x27: 08bb1000 x26: 0040
x25: 0124 x24: 80003a9a3080
x23: 0060 x22: 0939f518
x21: 80003aa79e98 x20: 80003aa3dae0
x19: 80003aa3c890 x18: 89feb794
x17:  x16: 
x15: 89feb790 x14: 
x13: 80003a058778 x12: 80003a058728
x11: 80003a058750 x10: 
x9 : 0006 x8 : 80003a825988
x7 :  x6 : 0001
x5 :  x4 : 0001
x3 : 0008 x2 : 0001
x1 : 6b6b6b6b6b6b6c03 x0 : 
Process sh (pid: 1784, stack limit = 0x(ptrval))
Call trace:
 clear_bit+0x18/0x2c
 q6afe_remove+0x20/0x38
 apr_device_remove+0x30/0x70
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 device_unregister+0x1c/0x70
 apr_remove_device+0xc/0x18
 device_for_each_child+0x50/0x80
 apr_remove+0x18/0x20
 rpmsg_dev_remove+0x38/0x68
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 device_unregister+0x1c/0x70
 qcom_smd_remove_device+0xc/0x18
 device_for_each_child+0x50/0x80
 qcom_smd_unregister_edge+0x3c/0x70
 smd_subdev_remove+0x18/0x28
 rproc_stop+0x48/0xd8
 rproc_shutdown+0x60/0xe8
 state_store+0xbc/0xf8
 dev_attr_store+0x18/0x28
 sysfs_kf_write+0x3c/0x50
 kernfs_fop_write+0x118/0x1e0
 __vfs_write+0x18/0x110
 vfs_write+0xa4/0x1a8
 ksys_write+0x48/0xb0
 sys_write+0xc/0x18
 el0_svc_naked+0x30/0x34
Code: d2800022 8b400c21 f9800031 9ac32043 (c85f7c22)
---[ end trace 32020935775616a2 ]---

Signed-off-by: Srinivas Kandagatla 
---
 drivers/of/platform.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index c00d81dfac0b..78c32b93c0e7 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -544,8 +544,6 @@ int of_platform_device_destroy(struct device *dev, void 
*data)
amba_device_unregister(to_amba_device(dev));
 #endif
 
-   of_node_clear_flag(dev->of_node, OF_POPULATED);
-   of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
return 0;
 }
 EXPORT_SYMBOL_GPL(of_platform_device_destroy);
-- 
2.16.2



Re: [PATCH v2]: perf record: enable arbitrary event names thru name= modifier

2018-06-01 Thread Andi Kleen
On Thu, May 31, 2018 at 05:08:12PM +0300, Alexey Budankov wrote:
> 
> Enable complex event names containing [.:=,] symbols to be encoded into Perf 
> trace using name= modifier e.g. like this:
> 
> perf record -e 
> cpu/name=\'OFFCORE_RESPONSE:request=DEMAND_RFO:response=L3_HIT.SNOOP_HITM\',\
>   period=0x3567e0,event=0x3c,cmask=0x1/Duk ./futex
> 
> Below is how it looks like in the report output. Please note explicit escaped 
> quoting at cmdline string in the header so that thestring can be directly 
> reused
> for another collection in shell:

Patch looks good, but new syntax also needs to be documented in some manpage
(e.g. perf list)

-Andi


[PATCH] of: platform: stop accessing invalid dev in of_platform_device_destroy

2018-06-01 Thread Srinivas Kandagatla
Immediately after the platform_device_unregister() the device will be cleaned 
up.
Accessing the freed pointer immediately after that will crash the system.

Found this bug when kernel is built with CONFIG_PAGE_POISONING and testing
loading/unloading audio drivers in a loop on Qcom platforms.

Fix this by removing accessing the dev pointer.
Below is the carsh trace:

Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c03
Mem abort info:
  ESR = 0x9621
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x0021
  CM = 0, WnR = 0
[006b6b6b6b6b6c03] address between user and kernel address ranges
Internal error: Oops: 9621 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 1784 Comm: sh Tainted: GW 
4.17.0-rc7-02230-ge3a63a7ef641-dirty #204
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
pstate: 8005 (Nzcv daif -PAN -UAO)
pc : clear_bit+0x18/0x2c
lr : of_platform_device_destroy+0x64/0xb8
sp : 0c9c3930
x29: 0c9c3930 x28: 80003d39b200
x27: 08bb1000 x26: 0040
x25: 0124 x24: 80003a9a3080
x23: 0060 x22: 0939f518
x21: 80003aa79e98 x20: 80003aa3dae0
x19: 80003aa3c890 x18: 89feb794
x17:  x16: 
x15: 89feb790 x14: 
x13: 80003a058778 x12: 80003a058728
x11: 80003a058750 x10: 
x9 : 0006 x8 : 80003a825988
x7 :  x6 : 0001
x5 :  x4 : 0001
x3 : 0008 x2 : 0001
x1 : 6b6b6b6b6b6b6c03 x0 : 
Process sh (pid: 1784, stack limit = 0x(ptrval))
Call trace:
 clear_bit+0x18/0x2c
 q6afe_remove+0x20/0x38
 apr_device_remove+0x30/0x70
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 device_unregister+0x1c/0x70
 apr_remove_device+0xc/0x18
 device_for_each_child+0x50/0x80
 apr_remove+0x18/0x20
 rpmsg_dev_remove+0x38/0x68
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 device_unregister+0x1c/0x70
 qcom_smd_remove_device+0xc/0x18
 device_for_each_child+0x50/0x80
 qcom_smd_unregister_edge+0x3c/0x70
 smd_subdev_remove+0x18/0x28
 rproc_stop+0x48/0xd8
 rproc_shutdown+0x60/0xe8
 state_store+0xbc/0xf8
 dev_attr_store+0x18/0x28
 sysfs_kf_write+0x3c/0x50
 kernfs_fop_write+0x118/0x1e0
 __vfs_write+0x18/0x110
 vfs_write+0xa4/0x1a8
 ksys_write+0x48/0xb0
 sys_write+0xc/0x18
 el0_svc_naked+0x30/0x34
Code: d2800022 8b400c21 f9800031 9ac32043 (c85f7c22)
---[ end trace 32020935775616a2 ]---

Signed-off-by: Srinivas Kandagatla 
---
 drivers/of/platform.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index c00d81dfac0b..78c32b93c0e7 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -544,8 +544,6 @@ int of_platform_device_destroy(struct device *dev, void 
*data)
amba_device_unregister(to_amba_device(dev));
 #endif
 
-   of_node_clear_flag(dev->of_node, OF_POPULATED);
-   of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
return 0;
 }
 EXPORT_SYMBOL_GPL(of_platform_device_destroy);
-- 
2.16.2



[PATCH] ASoC: dapm: delete dapm_kcontrol_data paths entry before freeing

2018-06-01 Thread Srinivas Kandagatla
dapm_kcontrol_data is freed as part of dapm_kcontrol_free(), leaving the
paths list pointer dangling in the list.

This leads to system crash when we try to unload and reload sound card.
I hit this bug during ADSP crash/reboot test case on Dragon board DB410c.

Below is the kernel BUG with SLAB Poisoning

=
BUG kmalloc-128 (Tainted: GW): Poison overwritten
-

Disabling lock debugging due to kernel taint
INFO: 0x80003cf1c310-0x80003cf1c31f. First byte 0x10 instead of 0x6b
INFO: Allocated in dapm_kcontrol_data_alloc.isra.37+0x34/0x2a8 age=6929 cpu=0 
pid=50
 __slab_alloc.isra.24+0x24/0x38
 kmem_cache_alloc+0x190/0x1d8
 dapm_kcontrol_data_alloc.isra.37+0x34/0x2a8
 dapm_create_or_share_kcontrol+0x1d4/0x290
 snd_soc_dapm_new_widgets+0x410/0x568
 snd_soc_register_card+0xa58/0xcd0
 apq8016_sbc_bind+0x31c/0x458
 try_to_bring_up_master+0x204/0x2e8
 component_add+0x94/0x178
 q6pcm_routing_probe+0x38/0x48
 platform_drv_probe+0x58/0xb8
 driver_probe_device+0x324/0x478
 __device_attach_driver+0xa8/0x160
 bus_for_each_drv+0x48/0x98
 __device_attach+0xc0/0x158
 device_initial_probe+0x10/0x18
INFO: Freed in dapm_kcontrol_free+0x40/0x50 age=3135 cpu=1 pid=1792
 kfree+0x1bc/0x1d0
 dapm_kcontrol_free+0x40/0x50
 snd_ctl_free_one+0x20/0x38
 snd_ctl_remove+0xf0/0x108
 snd_ctl_dev_free+0x3c/0x70
 __snd_device_free+0x50/0x88
 snd_device_free_all+0x2c/0x50
 release_card_device+0x1c/0x78
 device_release+0x34/0x98
 kobject_put+0x90/0x1f0
 put_device+0x14/0x20
 snd_card_free+0x54/0x70
 snd_soc_unregister_card+0x84/0x138
 snd_soc_unregister_component+0xa4/0xd0
 q6routing_dai_unbind+0x44/0x78
 component_unbind.isra.4+0x28/0x50
INFO: Slab 0x7ef3c700 objects=25 used=0 fp=0x80003cf1fc80 
flags=0xfffc0008100
INFO: Object 0x(ptrval) @offset=768 fp=0x(ptrval)

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 10 c3 f1 3c 00 80 ff ff 10 c3 f1 3c 00 80 ff ff  
...<...<
Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  
kkk.
Redzone (ptrval): bb bb bb bb bb bb bb bb  

Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  

Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  

Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  

Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  

CPU: 1 PID: 1792 Comm: sh Tainted: GB   W 
4.17.0-rc7-02229-gb429ee402d16-dirty #202
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
 dump_backtrace+0x0/0x1b0
 show_stack+0x14/0x20
 dump_stack+0x9c/0xbc
 print_trailer+0x124/0x1d8
 check_bytes_and_report+0xe8/0x120
 check_object+0x24c/0x288
 __free_slab+0x9c/0x2f0
 discard_slab+0x60/0x88
 __slab_free+0x35c/0x3e8
 kfree+0x1bc/0x1d0
 snd_soc_dapm_free_widget+0xac/0xd0
 snd_soc_dapm_free+0x64/0xb8
 soc_remove_component+0x50/0x80
 soc_remove_dai_links+0x110/0x208
 snd_soc_unregister_card+0x9c/0x138
 snd_soc_unregister_component+0xa4/0xd0
 q6routing_dai_unbind+0x44/0x78
 component_unbind.isra.4+0x28/0x50
 component_unbind_all+0xc0/0xe8
 apq8016_sbc_unbind+0x50/0xa0
 take_down_master+0x24/0x48
 component_del+0x90/0x130
 q6afe_dai_dev_remove+0x40/0x68
 platform_drv_remove+0x24/0x50
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 platform_device_del.part.3+0x24/0x90
 platform_device_unregister+0x18/0x30
 

[PATCH] ASoC: dapm: delete dapm_kcontrol_data paths entry before freeing

2018-06-01 Thread Srinivas Kandagatla
dapm_kcontrol_data is freed as part of dapm_kcontrol_free(), leaving the
paths list pointer dangling in the list.

This leads to system crash when we try to unload and reload sound card.
I hit this bug during ADSP crash/reboot test case on Dragon board DB410c.

Below is the kernel BUG with SLAB Poisoning

=
BUG kmalloc-128 (Tainted: GW): Poison overwritten
-

Disabling lock debugging due to kernel taint
INFO: 0x80003cf1c310-0x80003cf1c31f. First byte 0x10 instead of 0x6b
INFO: Allocated in dapm_kcontrol_data_alloc.isra.37+0x34/0x2a8 age=6929 cpu=0 
pid=50
 __slab_alloc.isra.24+0x24/0x38
 kmem_cache_alloc+0x190/0x1d8
 dapm_kcontrol_data_alloc.isra.37+0x34/0x2a8
 dapm_create_or_share_kcontrol+0x1d4/0x290
 snd_soc_dapm_new_widgets+0x410/0x568
 snd_soc_register_card+0xa58/0xcd0
 apq8016_sbc_bind+0x31c/0x458
 try_to_bring_up_master+0x204/0x2e8
 component_add+0x94/0x178
 q6pcm_routing_probe+0x38/0x48
 platform_drv_probe+0x58/0xb8
 driver_probe_device+0x324/0x478
 __device_attach_driver+0xa8/0x160
 bus_for_each_drv+0x48/0x98
 __device_attach+0xc0/0x158
 device_initial_probe+0x10/0x18
INFO: Freed in dapm_kcontrol_free+0x40/0x50 age=3135 cpu=1 pid=1792
 kfree+0x1bc/0x1d0
 dapm_kcontrol_free+0x40/0x50
 snd_ctl_free_one+0x20/0x38
 snd_ctl_remove+0xf0/0x108
 snd_ctl_dev_free+0x3c/0x70
 __snd_device_free+0x50/0x88
 snd_device_free_all+0x2c/0x50
 release_card_device+0x1c/0x78
 device_release+0x34/0x98
 kobject_put+0x90/0x1f0
 put_device+0x14/0x20
 snd_card_free+0x54/0x70
 snd_soc_unregister_card+0x84/0x138
 snd_soc_unregister_component+0xa4/0xd0
 q6routing_dai_unbind+0x44/0x78
 component_unbind.isra.4+0x28/0x50
INFO: Slab 0x7ef3c700 objects=25 used=0 fp=0x80003cf1fc80 
flags=0xfffc0008100
INFO: Object 0x(ptrval) @offset=768 fp=0x(ptrval)

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Redzone (ptrval): bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 10 c3 f1 3c 00 80 ff ff 10 c3 f1 3c 00 80 ff ff  
...<...<
Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  

Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  
kkk.
Redzone (ptrval): bb bb bb bb bb bb bb bb  

Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  

Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  

Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  

Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  

CPU: 1 PID: 1792 Comm: sh Tainted: GB   W 
4.17.0-rc7-02229-gb429ee402d16-dirty #202
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
 dump_backtrace+0x0/0x1b0
 show_stack+0x14/0x20
 dump_stack+0x9c/0xbc
 print_trailer+0x124/0x1d8
 check_bytes_and_report+0xe8/0x120
 check_object+0x24c/0x288
 __free_slab+0x9c/0x2f0
 discard_slab+0x60/0x88
 __slab_free+0x35c/0x3e8
 kfree+0x1bc/0x1d0
 snd_soc_dapm_free_widget+0xac/0xd0
 snd_soc_dapm_free+0x64/0xb8
 soc_remove_component+0x50/0x80
 soc_remove_dai_links+0x110/0x208
 snd_soc_unregister_card+0x9c/0x138
 snd_soc_unregister_component+0xa4/0xd0
 q6routing_dai_unbind+0x44/0x78
 component_unbind.isra.4+0x28/0x50
 component_unbind_all+0xc0/0xe8
 apq8016_sbc_unbind+0x50/0xa0
 take_down_master+0x24/0x48
 component_del+0x90/0x130
 q6afe_dai_dev_remove+0x40/0x68
 platform_drv_remove+0x24/0x50
 device_release_driver_internal+0x170/0x208
 device_release_driver+0x14/0x20
 bus_remove_device+0xcc/0x150
 device_del+0x10c/0x310
 platform_device_del.part.3+0x24/0x90
 platform_device_unregister+0x18/0x30
 

Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Rik van Riel
On Fri, 1 Jun 2018 14:21:58 -0700
Andy Lutomirski  wrote:

> Hmm.  I wonder if there's a more clever data structure than a bitmap
> that we could be using here.  Each CPU only ever needs to be in one
> mm's cpumask, and each cpu only ever changes its own state in the
> bitmask.  And writes are much less common than reads for most
> workloads.

It would be easy enough to add an mm_struct pointer to the
per-cpu tlbstate struct, and iterate over those.

However, that would be an orthogonal change to optimizing
lazy TLB mode. 

Does the (untested) patch below make sense as a potential
improvement to the lazy TLB heuristic?

---8<---
Subject: x86,tlb: workload dependent per CPU lazy TLB switch

Lazy TLB mode is a tradeoff between flushing the TLB and touching
the mm_cpumask(_mm) at context switch time, versus potentially
incurring a remote TLB flush IPI while in lazy TLB mode.

Whether this pays off is likely to be workload dependent more than
anything else. However, the current heuristic keys off hardware type.

This patch changes the lazy TLB mode heuristic to a dynamic, per-CPU
decision, dependent on whether we recently received a remote TLB
shootdown while in lazy TLB mode.

This is a very simple heuristic. When a CPU receives a remote TLB
shootdown IPI while in lazy TLB mode, a counter in the same cache
line is set to 16. Every time we skip lazy TLB mode, the counter
is decremented.

While the counter is zero (no recent TLB flush IPIs), allow lazy TLB mode.

Signed-off-by: Rik van Riel 
---
 arch/x86/include/asm/tlbflush.h | 32 
 arch/x86/mm/tlb.c   |  4 
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6690cd3fc8b1..f06a934e317d 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -148,22 +148,6 @@ static inline unsigned long build_cr3_noflush(pgd_t *pgd, 
u16 asid)
 #define __flush_tlb_one_user(addr) __native_flush_tlb_one_user(addr)
 #endif
 
-static inline bool tlb_defer_switch_to_init_mm(void)
-{
-   /*
-* If we have PCID, then switching to init_mm is reasonably
-* fast.  If we don't have PCID, then switching to init_mm is
-* quite slow, so we try to defer it in the hopes that we can
-* avoid it entirely.  The latter approach runs the risk of
-* receiving otherwise unnecessary IPIs.
-*
-* This choice is just a heuristic.  The tlb code can handle this
-* function returning true or false regardless of whether we have
-* PCID.
-*/
-   return !static_cpu_has(X86_FEATURE_PCID);
-}
-
 struct tlb_context {
u64 ctx_id;
u64 tlb_gen;
@@ -179,6 +163,7 @@ struct tlb_state {
struct mm_struct *loaded_mm;
u16 loaded_mm_asid;
u16 next_asid;
+   u16 flushed_while_lazy;
/* last user mm's ctx id */
u64 last_ctx_id;
 
@@ -246,6 +231,21 @@ struct tlb_state {
 };
 DECLARE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate);
 
+static inline bool tlb_defer_switch_to_init_mm(void)
+{
+   /*
+* If the CPU recently received a TLB flush IPI while in lazy
+* TLB mode, do a straight switch to the idle task, and skip
+* lazy TLB mode for now.
+*/
+   if (this_cpu_read(cpu_tlbstate.flushed_while_lazy)) {
+   this_cpu_dec(cpu_tlbstate.flushed_while_lazy);
+   return false;
+   }
+
+   return true;
+}
+
 /* Initialize cr4 shadow for this CPU. */
 static inline void cr4_init_shadow(void)
 {
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index e055d1a06699..d8b0b7b236f3 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -454,7 +454,11 @@ static void flush_tlb_func_common(const struct 
flush_tlb_info *f,
 * paging-structure cache to avoid speculatively reading
 * garbage into our TLB.  Since switching to init_mm is barely
 * slower than a minimal flush, just switch to init_mm.
+*
+* Skip lazy TLB mode for the next 16 context switches,
+* in case more TLB flush IPIs are coming.
 */
+   this_cpu_write(cpu_tlbstate.flushed_while_lazy, 16);
switch_mm_irqs_off(NULL, _mm, NULL);
return;
}



Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Rik van Riel
On Fri, 1 Jun 2018 14:21:58 -0700
Andy Lutomirski  wrote:

> Hmm.  I wonder if there's a more clever data structure than a bitmap
> that we could be using here.  Each CPU only ever needs to be in one
> mm's cpumask, and each cpu only ever changes its own state in the
> bitmask.  And writes are much less common than reads for most
> workloads.

It would be easy enough to add an mm_struct pointer to the
per-cpu tlbstate struct, and iterate over those.

However, that would be an orthogonal change to optimizing
lazy TLB mode. 

Does the (untested) patch below make sense as a potential
improvement to the lazy TLB heuristic?

---8<---
Subject: x86,tlb: workload dependent per CPU lazy TLB switch

Lazy TLB mode is a tradeoff between flushing the TLB and touching
the mm_cpumask(_mm) at context switch time, versus potentially
incurring a remote TLB flush IPI while in lazy TLB mode.

Whether this pays off is likely to be workload dependent more than
anything else. However, the current heuristic keys off hardware type.

This patch changes the lazy TLB mode heuristic to a dynamic, per-CPU
decision, dependent on whether we recently received a remote TLB
shootdown while in lazy TLB mode.

This is a very simple heuristic. When a CPU receives a remote TLB
shootdown IPI while in lazy TLB mode, a counter in the same cache
line is set to 16. Every time we skip lazy TLB mode, the counter
is decremented.

While the counter is zero (no recent TLB flush IPIs), allow lazy TLB mode.

Signed-off-by: Rik van Riel 
---
 arch/x86/include/asm/tlbflush.h | 32 
 arch/x86/mm/tlb.c   |  4 
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6690cd3fc8b1..f06a934e317d 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -148,22 +148,6 @@ static inline unsigned long build_cr3_noflush(pgd_t *pgd, 
u16 asid)
 #define __flush_tlb_one_user(addr) __native_flush_tlb_one_user(addr)
 #endif
 
-static inline bool tlb_defer_switch_to_init_mm(void)
-{
-   /*
-* If we have PCID, then switching to init_mm is reasonably
-* fast.  If we don't have PCID, then switching to init_mm is
-* quite slow, so we try to defer it in the hopes that we can
-* avoid it entirely.  The latter approach runs the risk of
-* receiving otherwise unnecessary IPIs.
-*
-* This choice is just a heuristic.  The tlb code can handle this
-* function returning true or false regardless of whether we have
-* PCID.
-*/
-   return !static_cpu_has(X86_FEATURE_PCID);
-}
-
 struct tlb_context {
u64 ctx_id;
u64 tlb_gen;
@@ -179,6 +163,7 @@ struct tlb_state {
struct mm_struct *loaded_mm;
u16 loaded_mm_asid;
u16 next_asid;
+   u16 flushed_while_lazy;
/* last user mm's ctx id */
u64 last_ctx_id;
 
@@ -246,6 +231,21 @@ struct tlb_state {
 };
 DECLARE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate);
 
+static inline bool tlb_defer_switch_to_init_mm(void)
+{
+   /*
+* If the CPU recently received a TLB flush IPI while in lazy
+* TLB mode, do a straight switch to the idle task, and skip
+* lazy TLB mode for now.
+*/
+   if (this_cpu_read(cpu_tlbstate.flushed_while_lazy)) {
+   this_cpu_dec(cpu_tlbstate.flushed_while_lazy);
+   return false;
+   }
+
+   return true;
+}
+
 /* Initialize cr4 shadow for this CPU. */
 static inline void cr4_init_shadow(void)
 {
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index e055d1a06699..d8b0b7b236f3 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -454,7 +454,11 @@ static void flush_tlb_func_common(const struct 
flush_tlb_info *f,
 * paging-structure cache to avoid speculatively reading
 * garbage into our TLB.  Since switching to init_mm is barely
 * slower than a minimal flush, just switch to init_mm.
+*
+* Skip lazy TLB mode for the next 16 context switches,
+* in case more TLB flush IPIs are coming.
 */
+   this_cpu_write(cpu_tlbstate.flushed_while_lazy, 16);
switch_mm_irqs_off(NULL, _mm, NULL);
return;
}



Re: [PATCH v2 1/2] power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats

2018-06-01 Thread Rhyland Klein
On 6/1/2018 2:31 PM, Brian Norris wrote:
> Hi,
> 
> On Fri, Jun 01, 2018 at 10:34:34AM -0700, Guenter Roeck wrote:
>> On Fri, Jun 01, 2018 at 10:23:59AM -0700, Brian Norris wrote:
>>>  drivers/power/supply/sbs-battery.c | 54 +-
>>>  1 file changed, 46 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/power/supply/sbs-battery.c 
>>> b/drivers/power/supply/sbs-battery.c
>>> index 83d7b4115857..8dea63464a3f 100644
>>> --- a/drivers/power/supply/sbs-battery.c
>>> +++ b/drivers/power/supply/sbs-battery.c
> ...
> 
>>> @@ -315,6 +320,27 @@ static int sbs_status_correct(struct i2c_client 
>>> *client, int *intval)
>>>  static int sbs_get_battery_presence_and_health(
>>> struct i2c_client *client, enum power_supply_property psp,
>>> union power_supply_propval *val)
>>> +{
>>> +   int ret;
>>> +
>>> +   if (psp == POWER_SUPPLY_PROP_PRESENT) {
>>> +   /* Dummy command; if it succeeds, battery is present. */
>>> +   ret = sbs_read_word_data(client, sbs_data[REG_STATUS].addr);
>>> +   if (ret < 0)
>>> +   val->intval = 0; /* battery disconnected */
>>> +   else
>>> +   val->intval = 1; /* battery present */
>>> +   return 0;
>>> +   } else { /* POWER_SUPPLY_PROP_HEALTH */
>>
>> Static analyzers may complain that else after return is unnecessary.
> 
> I noticed (checkpatch complains) but decided I didn't care. It would be
> worse to promote the 'else' to top-level (things would look asymmetric).
> I suppose I could pull the 'return 0' out, but I'm not sure that would
> make the code any better.

I generally prefer to make checkpatch happy, so I would vote to move the
return outside of the if blocks as a minimal way of making it happy. In
general though,

Acked-by: Rhyland Klein 

> 
>> Other than that
>>
>> Reviewed-by: Guenter Roeck 
>>
>>> +   /* SBS spec doesn't have a general health command. */
>>> +   val->intval = POWER_SUPPLY_HEALTH_UNKNOWN;
>>> +   return 0;
>>> +   }
>>> +}
>>> +
> 
> Brian
> 


-- 
nvpublic


Re: [PATCH v2 1/2] power: supply: sbs-battery: don't assume MANUFACTURER_DATA formats

2018-06-01 Thread Rhyland Klein
On 6/1/2018 2:31 PM, Brian Norris wrote:
> Hi,
> 
> On Fri, Jun 01, 2018 at 10:34:34AM -0700, Guenter Roeck wrote:
>> On Fri, Jun 01, 2018 at 10:23:59AM -0700, Brian Norris wrote:
>>>  drivers/power/supply/sbs-battery.c | 54 +-
>>>  1 file changed, 46 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/power/supply/sbs-battery.c 
>>> b/drivers/power/supply/sbs-battery.c
>>> index 83d7b4115857..8dea63464a3f 100644
>>> --- a/drivers/power/supply/sbs-battery.c
>>> +++ b/drivers/power/supply/sbs-battery.c
> ...
> 
>>> @@ -315,6 +320,27 @@ static int sbs_status_correct(struct i2c_client 
>>> *client, int *intval)
>>>  static int sbs_get_battery_presence_and_health(
>>> struct i2c_client *client, enum power_supply_property psp,
>>> union power_supply_propval *val)
>>> +{
>>> +   int ret;
>>> +
>>> +   if (psp == POWER_SUPPLY_PROP_PRESENT) {
>>> +   /* Dummy command; if it succeeds, battery is present. */
>>> +   ret = sbs_read_word_data(client, sbs_data[REG_STATUS].addr);
>>> +   if (ret < 0)
>>> +   val->intval = 0; /* battery disconnected */
>>> +   else
>>> +   val->intval = 1; /* battery present */
>>> +   return 0;
>>> +   } else { /* POWER_SUPPLY_PROP_HEALTH */
>>
>> Static analyzers may complain that else after return is unnecessary.
> 
> I noticed (checkpatch complains) but decided I didn't care. It would be
> worse to promote the 'else' to top-level (things would look asymmetric).
> I suppose I could pull the 'return 0' out, but I'm not sure that would
> make the code any better.

I generally prefer to make checkpatch happy, so I would vote to move the
return outside of the if blocks as a minimal way of making it happy. In
general though,

Acked-by: Rhyland Klein 

> 
>> Other than that
>>
>> Reviewed-by: Guenter Roeck 
>>
>>> +   /* SBS spec doesn't have a general health command. */
>>> +   val->intval = POWER_SUPPLY_HEALTH_UNKNOWN;
>>> +   return 0;
>>> +   }
>>> +}
>>> +
> 
> Brian
> 


-- 
nvpublic


Re: [PATCH v3 0/8] gnss: add new GNSS subsystem

2018-06-01 Thread Pavel Machek
On Fri 2018-06-01 18:37:36, H. Nikolaus Schaller wrote:
> Hi Pavel,
> 
> > Am 01.06.2018 um 18:26 schrieb Pavel Machek :
> > 
> > NMEA would not be my first choice, really. I'd propose something very
> > similar to existing /dev/input/eventX, but with wider data types.
> 
> Since even Rome wasn't built in a day, my first choice is NMEA, but I am
> open to anything on higher level to come second.
> 
> What about iio?
> 
> It is quite flexible and extensible and GNSS coordinates are a not very
> different from accelerometer, gyroscope or compass data as all of them
> describe the position and orientation, maybe speed of the device these
> things are built into (at least for mobile devices).

As I said, I do not want to have that discussion at the
moment. Transfering position and orientation is easy, transfering some
other data (such as sattelites used for fix) might be
trickier. Anyway, I'm pretty sure reasonable solution exists; and yes
we could use NMEA.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [PATCH v3 0/8] gnss: add new GNSS subsystem

2018-06-01 Thread Pavel Machek
On Fri 2018-06-01 18:37:36, H. Nikolaus Schaller wrote:
> Hi Pavel,
> 
> > Am 01.06.2018 um 18:26 schrieb Pavel Machek :
> > 
> > NMEA would not be my first choice, really. I'd propose something very
> > similar to existing /dev/input/eventX, but with wider data types.
> 
> Since even Rome wasn't built in a day, my first choice is NMEA, but I am
> open to anything on higher level to come second.
> 
> What about iio?
> 
> It is quite flexible and extensible and GNSS coordinates are a not very
> different from accelerometer, gyroscope or compass data as all of them
> describe the position and orientation, maybe speed of the device these
> things are built into (at least for mobile devices).

As I said, I do not want to have that discussion at the
moment. Transfering position and orientation is easy, transfering some
other data (such as sattelites used for fix) might be
trickier. Anyway, I'm pretty sure reasonable solution exists; and yes
we could use NMEA.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [lkp-robot] [tracing/x86] 1c758a2202: aim9.disk_rr.ops_per_sec -12.0% regression

2018-06-01 Thread Steven Rostedt
On Mon, 28 May 2018 19:34:19 +0800
kernel test robot  wrote:

> Greeting,
> 
> FYI, we noticed a -12.0% regression of aim9.disk_rr.ops_per_sec due to commit:
> 
> 
> commit: 1c758a2202a6b4624d0703013a2c6cfa6e7455aa ("tracing/x86: Update 
> syscall trace events to handle new prefixed syscall func names")

How can this commit cause a run time regression. It changes code
in an __init call (that gets removed after boot)??

-- Steve
 

> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: aim9
> on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 
> 512G memory
> with following parameters:
> 
>   testtime: 300s
>   test: disk_rr
>   cpufreq_governor: performance
> 
> test-description: Suite IX is the "AIM Independent Resource Benchmark:" the 
> famous synthetic benchmark.
> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
> 
> 
> Details are as below:
> -->
>   
> =
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
>   
> gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsx04/disk_rr/aim9/300s
> 
> commit: 
>   1cdae042fc ("tracing: Add missing forward declaration")
>   1c758a2202 ("tracing/x86: Update syscall trace events to handle new 
> prefixed syscall func names")
> 
> 1cdae042fc63dd98 1c758a2202a6b4624d0703013a 
>  -- 
>  %stddev %change %stddev
>  \  |\  
> 633310   -12.0% 557268aim9.disk_rr.ops_per_sec
> 244.24+2.2% 249.57aim9.time.system_time
>  55.76-9.6%  50.43aim9.time.user_time
>   8135 ± 39% +73.2%  14091 ± 33%  numa-meminfo.node0.Shmem
>   1328   -11.9%   1171pmeter.performance_per_watt
> 450606 ±  3%  -9.5% 407878 ±  5%  meminfo.Committed_AS
> 406.75 ±173%+300.1%   1627meminfo.Mlocked
>  20124 ±  4%  +8.4%  21819 ±  6%  softirqs.NET_RX
>8237636 ±  6% -15.4%6965294 ±  2%  softirqs.RCU
>   2033 ± 39% +73.0%   3518 ± 33%  numa-vmstat.node0.nr_shmem
>  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node2.nr_mlock
>  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node3.nr_mlock
>  9.408e+08 ±  6% +53.3%  1.442e+09 ± 20%  perf-stat.dTLB-load-misses
>  47.39 ± 17% -10.4   36.99 ±  8%  perf-stat.iTLB-load-miss-rate%
>   1279 ± 27% +63.4%   2089 ± 21%  
> perf-stat.instructions-per-iTLB-miss
>  46.73 ±  5%  -5.4   41.33 ±  5%  perf-stat.node-store-miss-rate%
>  19240+1.2%  19474
> proc-vmstat.nr_indirectly_reclaimable
>  18868+4.0%  19628proc-vmstat.nr_slab_reclaimable
>   48395423   -11.8%   42700849proc-vmstat.numa_hit
>   48314997   -11.8%   42620296proc-vmstat.numa_local
>3153408   -12.0%2775642proc-vmstat.pgactivate
>   48365477   -11.8%   42678780proc-vmstat.pgfree
>   3060   +38.9%   4250
> slabinfo.ftrace_event_field.active_objs
>   3060   +38.9%   4250
> slabinfo.ftrace_event_field.num_objs
>   2748 ±  3%  -8.9%   2502 ±  3%  
> slabinfo.sighand_cache.active_objs
>   2763 ±  3%  -8.9%   2517 ±  3%  slabinfo.sighand_cache.num_objs
>   4125 ±  4% -10.3%   3700 ±  2%  
> slabinfo.signal_cache.active_objs
>   4125 ±  4% -10.3%   3700 ±  2%  slabinfo.signal_cache.num_objs
>   1104   +57.3%   1736
> slabinfo.trace_event_file.active_objs
>   1104   +57.3%   1736
> slabinfo.trace_event_file.num_objs
>   0.78 ±  4%  -0.10.67 ±  5%  
> perf-profile.calltrace.cycles-pp.rcu_process_callbacks.__softirqentry_text_start.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
>   2.10 ±  2%  -0.12.00
> perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
>   1.89-0.11.79
> perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
>   1.71 ±  2%  -0.11.60
> perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
>   0.53 ± 12%  -0.10.44 ±  4%  
> perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>   0.17 ±  7%  -0.00.15 ±  4%  
> perf-profile.children.cycles-pp.leave_mm
>   0.09 ±  8%  -0.00.08 ± 11%  
> 

Re: [lkp-robot] [tracing/x86] 1c758a2202: aim9.disk_rr.ops_per_sec -12.0% regression

2018-06-01 Thread Steven Rostedt
On Mon, 28 May 2018 19:34:19 +0800
kernel test robot  wrote:

> Greeting,
> 
> FYI, we noticed a -12.0% regression of aim9.disk_rr.ops_per_sec due to commit:
> 
> 
> commit: 1c758a2202a6b4624d0703013a2c6cfa6e7455aa ("tracing/x86: Update 
> syscall trace events to handle new prefixed syscall func names")

How can this commit cause a run time regression. It changes code
in an __init call (that gets removed after boot)??

-- Steve
 

> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: aim9
> on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 
> 512G memory
> with following parameters:
> 
>   testtime: 300s
>   test: disk_rr
>   cpufreq_governor: performance
> 
> test-description: Suite IX is the "AIM Independent Resource Benchmark:" the 
> famous synthetic benchmark.
> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
> 
> 
> Details are as below:
> -->
>   
> =
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
>   
> gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsx04/disk_rr/aim9/300s
> 
> commit: 
>   1cdae042fc ("tracing: Add missing forward declaration")
>   1c758a2202 ("tracing/x86: Update syscall trace events to handle new 
> prefixed syscall func names")
> 
> 1cdae042fc63dd98 1c758a2202a6b4624d0703013a 
>  -- 
>  %stddev %change %stddev
>  \  |\  
> 633310   -12.0% 557268aim9.disk_rr.ops_per_sec
> 244.24+2.2% 249.57aim9.time.system_time
>  55.76-9.6%  50.43aim9.time.user_time
>   8135 ± 39% +73.2%  14091 ± 33%  numa-meminfo.node0.Shmem
>   1328   -11.9%   1171pmeter.performance_per_watt
> 450606 ±  3%  -9.5% 407878 ±  5%  meminfo.Committed_AS
> 406.75 ±173%+300.1%   1627meminfo.Mlocked
>  20124 ±  4%  +8.4%  21819 ±  6%  softirqs.NET_RX
>8237636 ±  6% -15.4%6965294 ±  2%  softirqs.RCU
>   2033 ± 39% +73.0%   3518 ± 33%  numa-vmstat.node0.nr_shmem
>  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node2.nr_mlock
>  21.25 ±173%+378.8% 101.75 ± 27%  numa-vmstat.node3.nr_mlock
>  9.408e+08 ±  6% +53.3%  1.442e+09 ± 20%  perf-stat.dTLB-load-misses
>  47.39 ± 17% -10.4   36.99 ±  8%  perf-stat.iTLB-load-miss-rate%
>   1279 ± 27% +63.4%   2089 ± 21%  
> perf-stat.instructions-per-iTLB-miss
>  46.73 ±  5%  -5.4   41.33 ±  5%  perf-stat.node-store-miss-rate%
>  19240+1.2%  19474
> proc-vmstat.nr_indirectly_reclaimable
>  18868+4.0%  19628proc-vmstat.nr_slab_reclaimable
>   48395423   -11.8%   42700849proc-vmstat.numa_hit
>   48314997   -11.8%   42620296proc-vmstat.numa_local
>3153408   -12.0%2775642proc-vmstat.pgactivate
>   48365477   -11.8%   42678780proc-vmstat.pgfree
>   3060   +38.9%   4250
> slabinfo.ftrace_event_field.active_objs
>   3060   +38.9%   4250
> slabinfo.ftrace_event_field.num_objs
>   2748 ±  3%  -8.9%   2502 ±  3%  
> slabinfo.sighand_cache.active_objs
>   2763 ±  3%  -8.9%   2517 ±  3%  slabinfo.sighand_cache.num_objs
>   4125 ±  4% -10.3%   3700 ±  2%  
> slabinfo.signal_cache.active_objs
>   4125 ±  4% -10.3%   3700 ±  2%  slabinfo.signal_cache.num_objs
>   1104   +57.3%   1736
> slabinfo.trace_event_file.active_objs
>   1104   +57.3%   1736
> slabinfo.trace_event_file.num_objs
>   0.78 ±  4%  -0.10.67 ±  5%  
> perf-profile.calltrace.cycles-pp.rcu_process_callbacks.__softirqentry_text_start.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
>   2.10 ±  2%  -0.12.00
> perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
>   1.89-0.11.79
> perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
>   1.71 ±  2%  -0.11.60
> perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
>   0.53 ± 12%  -0.10.44 ±  4%  
> perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>   0.17 ±  7%  -0.00.15 ±  4%  
> perf-profile.children.cycles-pp.leave_mm
>   0.09 ±  8%  -0.00.08 ± 11%  
> 

Re: [PATCH v4 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings

2018-06-01 Thread David Collins
Hello Mark,

On 05/31/2018 04:48 AM, Mark Brown wrote:
> On Wed, May 30, 2018 at 04:39:10PM -0700, David Collins wrote:
>> The DRMS modes to use and max allowed current per mode need to be
>> specified at the board level in device tree instead of hard-coded per
>> regulator type in the driver.  There are at least two use cases driving
>> this need: LDOs shared between RPMh client processors and SMPSes requiring
>> PWM mode in certain circumstances.
> 
> Is there really no way to improve the RPM firmware?

This aggregation takes place in a discrete hardware block, not in a
general purpose processor.  Theoretically, future chips could have
redesigned VRM hardware; however, there is no plan to make such a change.


>> Consider the case of a regulator with physical 10 mA LPM max current. Say
>> that modem and application processors each have a load on the regulator
>> that draws 9 mA. If they each respect the 10 mA limit, then they'd each
>> vote for LPM. The VRM block in RPMh hardware will aggregate these requests
> 
> This is, of course, why the regulator API aggregates this stuff based on
> the current not based on having every individual user select a mode.
> 
>> together using a max function which will result in the regulator being set
>> to LPM, even though the total load is 18 mA (which would require high
>> power mode (HPM)). To get around this corner case, a LPM max current of 1
>> uA can be used for all LDO supplies that have non-application processor
>> consumers. Thus, any non-zero regulator_set_load() current request will
>> result in setting the regulator to HPM (which is always safe).
> 
> That's obviously just never going to work well, the best you can do here
> is just pretend that the other components are always operating at full
> power (which I assume all the other components are doing too...) or not
> try to use load based mode switching in the first place.

I will remove the DT-based mode and max load current mapping support.  In
its place, I'll implement hard coded LPM current limits for LDOs of 10 mA
or 30 mA (depending upon subtype) like is done in other regulator drivers.

If we actually encounter an issue caused by the APPS processor and another
RPMh client both voting for LPM when HPM is needed for the combination,
then we can disable load-based mode control for the affected regulator in
DT and configure its initial mode as HPM.


>> The second situation that needs board-level DRMS mode and current limit
>> specification is SMPS regulator AUTO mode to PWM (HPM) mode switching.
>> SMPS regulators should theoretically always be able to operate in AUTO
>> mode as it switches automatically between PWM mode (which can provide the
>> maximum current) and PFM mode (which supports lower current but has higher
>> efficiency). However, there may be board/system issues that require
>> switching to PWM mode for certain use cases as it has better load
>> regulation (i.e. no PFM ripple for lower loads) and supports more
>> aggressive load current steps (i.e. greater A/ns).
> 
>> If a Linux consumer requires the ability to force a given SMPS regulator
>> from AUTO mode into PWM mode and that SMPS is shared by other Linux
>> consumers (which may be the case, but at least must be guaranteed to work
>> architecturally), then regulator_set_load() is the only option since it
>> provides aggregation, where as regulator_set_mode() does not.
> 
> That's obviously broken though, what you're describing is just clearly
> nothing to do with load so trying to configure it using load is just
> going to lead to problems later on.  Honestly it sounds like you just
> want to force the regulator into forced PWM mode all the time, otherwise
> you need to look into implementing support for describing the thing
> you're actually trying to do and add a mechanism for per consumer mode
> configuration.
> 
> This has been a frequent pattern with these RPM drivers, trying to find
> some way to shoehorn something that happens to work right now into the
> code.  That's going to make things fragile and hard to maintain, we need
> code that does the thing it's saying it does so that it's easier to
> understand and work with - getting things running isn't enough, they
> need to be clear.

I agree that using regulator_set_load() to handle SMPS AUTO mode to PWM
mode switching is hacky.  Since there is no natural way to pick SMPS modes
based on load current, I will remove the functionality completely.  In its
place, we can configure the SMPSes to have an initial mode of AUTO.  If a
use case requiring PWM mode arises, then the consumer driver responsible
for it can call regulator_set_mode() to switch between AUTO and PWM modes
explicitly.

Since regulator_set_mode() does not support aggregation currently, this
technique would work only if there is exactly one consumer per regulator
that needs explicit mode control.  Should we hit a situation that needs
multiple consumers to have such mode control, then we could simply default
the 

Re: [PATCH v4 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings

2018-06-01 Thread David Collins
Hello Mark,

On 05/31/2018 04:48 AM, Mark Brown wrote:
> On Wed, May 30, 2018 at 04:39:10PM -0700, David Collins wrote:
>> The DRMS modes to use and max allowed current per mode need to be
>> specified at the board level in device tree instead of hard-coded per
>> regulator type in the driver.  There are at least two use cases driving
>> this need: LDOs shared between RPMh client processors and SMPSes requiring
>> PWM mode in certain circumstances.
> 
> Is there really no way to improve the RPM firmware?

This aggregation takes place in a discrete hardware block, not in a
general purpose processor.  Theoretically, future chips could have
redesigned VRM hardware; however, there is no plan to make such a change.


>> Consider the case of a regulator with physical 10 mA LPM max current. Say
>> that modem and application processors each have a load on the regulator
>> that draws 9 mA. If they each respect the 10 mA limit, then they'd each
>> vote for LPM. The VRM block in RPMh hardware will aggregate these requests
> 
> This is, of course, why the regulator API aggregates this stuff based on
> the current not based on having every individual user select a mode.
> 
>> together using a max function which will result in the regulator being set
>> to LPM, even though the total load is 18 mA (which would require high
>> power mode (HPM)). To get around this corner case, a LPM max current of 1
>> uA can be used for all LDO supplies that have non-application processor
>> consumers. Thus, any non-zero regulator_set_load() current request will
>> result in setting the regulator to HPM (which is always safe).
> 
> That's obviously just never going to work well, the best you can do here
> is just pretend that the other components are always operating at full
> power (which I assume all the other components are doing too...) or not
> try to use load based mode switching in the first place.

I will remove the DT-based mode and max load current mapping support.  In
its place, I'll implement hard coded LPM current limits for LDOs of 10 mA
or 30 mA (depending upon subtype) like is done in other regulator drivers.

If we actually encounter an issue caused by the APPS processor and another
RPMh client both voting for LPM when HPM is needed for the combination,
then we can disable load-based mode control for the affected regulator in
DT and configure its initial mode as HPM.


>> The second situation that needs board-level DRMS mode and current limit
>> specification is SMPS regulator AUTO mode to PWM (HPM) mode switching.
>> SMPS regulators should theoretically always be able to operate in AUTO
>> mode as it switches automatically between PWM mode (which can provide the
>> maximum current) and PFM mode (which supports lower current but has higher
>> efficiency). However, there may be board/system issues that require
>> switching to PWM mode for certain use cases as it has better load
>> regulation (i.e. no PFM ripple for lower loads) and supports more
>> aggressive load current steps (i.e. greater A/ns).
> 
>> If a Linux consumer requires the ability to force a given SMPS regulator
>> from AUTO mode into PWM mode and that SMPS is shared by other Linux
>> consumers (which may be the case, but at least must be guaranteed to work
>> architecturally), then regulator_set_load() is the only option since it
>> provides aggregation, where as regulator_set_mode() does not.
> 
> That's obviously broken though, what you're describing is just clearly
> nothing to do with load so trying to configure it using load is just
> going to lead to problems later on.  Honestly it sounds like you just
> want to force the regulator into forced PWM mode all the time, otherwise
> you need to look into implementing support for describing the thing
> you're actually trying to do and add a mechanism for per consumer mode
> configuration.
> 
> This has been a frequent pattern with these RPM drivers, trying to find
> some way to shoehorn something that happens to work right now into the
> code.  That's going to make things fragile and hard to maintain, we need
> code that does the thing it's saying it does so that it's easier to
> understand and work with - getting things running isn't enough, they
> need to be clear.

I agree that using regulator_set_load() to handle SMPS AUTO mode to PWM
mode switching is hacky.  Since there is no natural way to pick SMPS modes
based on load current, I will remove the functionality completely.  In its
place, we can configure the SMPSes to have an initial mode of AUTO.  If a
use case requiring PWM mode arises, then the consumer driver responsible
for it can call regulator_set_mode() to switch between AUTO and PWM modes
explicitly.

Since regulator_set_mode() does not support aggregation currently, this
technique would work only if there is exactly one consumer per regulator
that needs explicit mode control.  Should we hit a situation that needs
multiple consumers to have such mode control, then we could simply default
the 

Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Andy Lutomirski
On Fri, Jun 1, 2018 at 1:35 PM Rik van Riel  wrote:
>
> On Fri, 2018-06-01 at 13:03 -0700, Andy Lutomirski wrote:
> > Mike, you never did say: do you have PCID on your CPU?  Also, what is
> > your workload doing to cause so many switches back and forth between
> > init_mm and a task.
> >
> > The point of the optimization is that switching to init_mm() should
> > be
> > fairly fast on a PCID system, whereas an IPI to do the deferred flush
> > is very expensive regardless of PCID.
>
> While I am sure that bit is true, Song and I
> observed about 4x as much CPU use in the atomic
> operations in cpumask_clear_cpu and cpumask_set_cpu
> (inside switch_mm_irqs_off) as we saw CPU used
> in the %cr3 reload itself.
>
> Given how expensive those cpumask updates are,
> lazy TLB mode might always be worth it, especially
> on larger systems.
>

Hmm.  I wonder if there's a more clever data structure than a bitmap
that we could be using here.  Each CPU only ever needs to be in one
mm's cpumask, and each cpu only ever changes its own state in the
bitmask.  And writes are much less common than reads for most
workloads.


Re: [PATCH] x86,switch_mm: skip atomic operations for init_mm

2018-06-01 Thread Andy Lutomirski
On Fri, Jun 1, 2018 at 1:35 PM Rik van Riel  wrote:
>
> On Fri, 2018-06-01 at 13:03 -0700, Andy Lutomirski wrote:
> > Mike, you never did say: do you have PCID on your CPU?  Also, what is
> > your workload doing to cause so many switches back and forth between
> > init_mm and a task.
> >
> > The point of the optimization is that switching to init_mm() should
> > be
> > fairly fast on a PCID system, whereas an IPI to do the deferred flush
> > is very expensive regardless of PCID.
>
> While I am sure that bit is true, Song and I
> observed about 4x as much CPU use in the atomic
> operations in cpumask_clear_cpu and cpumask_set_cpu
> (inside switch_mm_irqs_off) as we saw CPU used
> in the %cr3 reload itself.
>
> Given how expensive those cpumask updates are,
> lazy TLB mode might always be worth it, especially
> on larger systems.
>

Hmm.  I wonder if there's a more clever data structure than a bitmap
that we could be using here.  Each CPU only ever needs to be in one
mm's cpumask, and each cpu only ever changes its own state in the
bitmask.  And writes are much less common than reads for most
workloads.


  1   2   3   4   5   6   7   8   9   10   >