Re: [PATCH v4 5/6] bus: fsl-mc: supoprt dma configure for devices on fsl-mc bus
Hi Nipun, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v4.17-rc3 next-20180430] [cannot apply to iommu/next glikely/devicetree/next] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Nipun-Gupta/Support-for-fsl-mc-bus-and-its-devices-in-SMMU/20180501-125745 config: x86_64-allmodconfig (attached as .config) compiler: gcc-7 (Debian 7.3.0-16) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All errors (new ones prefixed by >>): drivers/bus/fsl-mc/fsl-mc-bus.c: In function 'fsl_mc_dma_configure': >> drivers/bus/fsl-mc/fsl-mc-bus.c:137:9: error: too many arguments to function >> 'of_dma_configure' return of_dma_configure(dev, dma_dev->of_node, 0); ^~~~ In file included from drivers/bus/fsl-mc/fsl-mc-bus.c:13:0: include/linux/of_device.h:58:5: note: declared here int of_dma_configure(struct device *dev, struct device_node *np); ^~~~ drivers/bus/fsl-mc/fsl-mc-bus.c: At top level: >> drivers/bus/fsl-mc/fsl-mc-bus.c:161:3: error: 'struct bus_type' has no >> member named 'dma_configure' .dma_configure = fsl_mc_dma_configure, ^ vim +/of_dma_configure +137 drivers/bus/fsl-mc/fsl-mc-bus.c 129 130 static int fsl_mc_dma_configure(struct device *dev) 131 { 132 struct device *dma_dev = dev; 133 134 while (dev_is_fsl_mc(dma_dev)) 135 dma_dev = dma_dev->parent; 136 > 137 return of_dma_configure(dev, dma_dev->of_node, 0); 138 } 139 140 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr, 141 char *buf) 142 { 143 struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev); 144 145 return sprintf(buf, "fsl-mc:v%08Xd%s\n", mc_dev->obj_desc.vendor, 146 mc_dev->obj_desc.type); 147 } 148 static DEVICE_ATTR_RO(modalias); 149 150 static struct attribute *fsl_mc_dev_attrs[] = { 151 &dev_attr_modalias.attr, 152 NULL, 153 }; 154 155 ATTRIBUTE_GROUPS(fsl_mc_dev); 156 157 struct 
bus_type fsl_mc_bus_type = { 158 .name = "fsl-mc", 159 .match = fsl_mc_bus_match, 160 .uevent = fsl_mc_bus_uevent, > 161 .dma_configure = fsl_mc_dma_configure, 162 .dev_groups = fsl_mc_dev_groups, 163 }; 164 EXPORT_SYMBOL_GPL(fsl_mc_bus_type); 165 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
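The parent-walk logic quoted in the build report can be exercised outside the kernel. The sketch below uses hypothetical minimal stand-ins for `struct device` and `dev_is_fsl_mc()` (not the kernel definitions) to show what the loop in `fsl_mc_dma_configure()` computes: the first ancestor that is not on the fsl-mc bus, whose `of_node` is then used for DMA configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct device {
	struct device *parent;
	bool is_fsl_mc;		/* stand-in for dev_is_fsl_mc() */
};

static bool dev_is_fsl_mc(const struct device *dev)
{
	return dev->is_fsl_mc;
}

/*
 * Mirror of the loop in fsl_mc_dma_configure(): walk up the device
 * tree until we leave the fsl-mc bus.
 */
static struct device *find_dma_parent(struct device *dev)
{
	struct device *dma_dev = dev;

	while (dev_is_fsl_mc(dma_dev))
		dma_dev = dma_dev->parent;

	return dma_dev;
}
```

The returned device is the one whose firmware node carries the DMA/IOMMU properties, which is why the patch passes `dma_dev->of_node` rather than `dev->of_node`.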
linux-next: Tree for May 1
Hi all, Changes since 20180430: The rdma tree gained a conflict against the rdma-fixes tree. Non-merge commits (relative to Linus' tree): 3348 3188 files changed, 126791 insertions(+), 59113 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And finally, a simple boot test of the powerpc pseries_le_defconfig kernel in qemu (with and without kvm enabled). Below is a summary of the state of the merge. I am currently merging 258 trees (counting Linus' and 44 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. 
-- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (8188fc8bef8c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc) Merging fixes/master (147a89bc71e7 Merge tag 'kconfig-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild) Merging kbuild-current/fixes (6d08b06e67cd Linux 4.17-rc2) Merging arc-current/for-curr (661e50bc8532 Linux 4.16-rc4) Merging arm-current/fixes (30cfae461581 ARM: replace unnecessary perl with sed and the shell $(( )) operator) Merging arm64-fixes/for-next/fixes (3789c122d0a0 arm64: avoid instrumenting atomic_ll_sc.o) Merging m68k-current/for-linus (ecd685580c8f m68k/mac: Remove bogus "FIXME" comment) Merging powerpc-fixes/fixes (b2d7ecbe3556 powerpc/kvm/booke: Fix altivec related build break) Merging sparc/master (00ad691ab140 sparc: vio: use put_device() instead of kfree()) Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2) Merging net/master (f944ad1b2b66 net: ethernet: ucc: fix spelling mistake: "tx-late-collsion" -> "tx-late-collision") Merging bpf/master (815425567dea bpf: fix uninitialized variable in bpf tools) Merging ipsec/master (b4331a681822 vti6: Change minimum MTU to IPV4_MIN_MTU, vti6 can carry IPv4 too) Merging netfilter/master (2f99aa31cd7a netfilter: nf_tables: skip synchronize_rcu if transaction log is empty) Merging ipvs/master (765cca91b895 netfilter: conntrack: include kmemleak.h for kmemleak_not_leak()) Merging wireless-drivers/master (af8a41cccf8f rtlwifi: cleanup 8723be ant_sel definition) Merging mac80211/master (2f0605a697f4 nl80211: Free connkeys on external authentication failure) Merging rdma-fixes/for-rc (db82476f3741 IB/core: Make ib_mad_client_id atomic) Merging sound-current/for-linus (76b3421b39bd ALSA: aloop: Add missing cable lock to ctl API callbacks) Merging pci-current/for-linus (0cf22d6b317c PCI: Add "PCIe" to pcie_print_link_status() messages) Merging driver-core.current/driver-core-linus (6da6c0db5316 
Linux v4.17-rc3) Merging tty.current/tty-linus (6da6c0db5316 Linux v4.17-rc3) Merging usb.current/usb-linus (9aea9b6cc78d usb: musb: trace: fix NULL pointer dereference in musb_g_tx()) Merging usb-gadget-fixes/fixes (ed769520727e usb: gadget: composite Allow for larger configuration descriptors) Merging usb-serial-fixes/usb-linus (470b5d6f0cf4 USB: serial: ftdi_sio: use jtag quirk for Arrow USB Blaster) Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: fix ulpi-node lookup) Merging phy/fixes (60cc43fc8884 Linux 4.17-rc1) Merging staging.current/staging-linus (6da6c0db5316 Linux v4.17-rc3) Merging char-misc.current/char-misc-linus (6da6c0db5316 Linux v4.17-rc3) Merging input-current/for-linus (596ea7aad431 MAINTAINERS: Rakesh Iyer can't be reached anymore) Merging crypto-current/master (eea0d3ea7546 crypto: drbg - set freed buffers to NULL) Merging ide/master (8e44e6600caa Merge branch 'KASAN-read_word_at_a_time') Mer
RE: [PATCH 2/2] Use bit-wise majority to recover the contents of ONFI parameter
Hi Miquèl, Thank you for your response and feedback. I've modified the fix based on your comments. Please see the updated patch file at the end of this message (also in attachment). My answers to your comments/questions are inline in the previous message. The new patch is rebased on top of v4.17-rc1. Best regards, Jane Updated patch: From e14ed7dc08296a52f81d14781dee2f455dd90bbd Mon Sep 17 00:00:00 2001 From: Jane Wan Date: Mon, 30 Apr 2018 14:05:40 -0700 Subject: [PATCH 2/2] mtd: rawnand: fsl_ifc: use bit-wise majority to recover the contents of ONFI parameter Per ONFI specification (Rev. 4.0), if all parameter pages have invalid CRC values, the bit-wise majority may be used to recover the contents of the parameter pages from the parameter page copies present. Signed-off-by: Jane Wan --- drivers/mtd/nand/raw/nand_base.c | 36 ++-- 1 file changed, 30 insertions(+), 6 deletions(-) diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c index 72f3a89..464c4fb 100644 --- a/drivers/mtd/nand/raw/nand_base.c +++ b/drivers/mtd/nand/raw/nand_base.c @@ -5086,6 +5086,8 @@ static int nand_flash_detect_ext_param_page(struct nand_chip *chip, return ret; } +#define GET_BIT(bit, val) (((val) >> (bit)) & 0x01) + /* * Check if the NAND chip is ONFI compliant, returns 1 if it is, 0 otherwise. 
*/ @@ -5094,7 +5096,8 @@ static int nand_flash_detect_onfi(struct nand_chip *chip) struct mtd_info *mtd = nand_to_mtd(chip); struct nand_onfi_params *p; char id[4]; - int i, ret, val; + int i, ret, val, pagesize; + u8 *buf; /* Try ONFI for unknown chip or LP */ ret = nand_readid_op(chip, 0x20, id, sizeof(id)); @@ -5102,8 +5105,9 @@ static int nand_flash_detect_onfi(struct nand_chip *chip) return 0; /* ONFI chip: allocate a buffer to hold its parameter page */ - p = kzalloc(sizeof(*p), GFP_KERNEL); - if (!p) + pagesize = sizeof(*p); + buf = kzalloc((pagesize * 3), GFP_KERNEL); + if (!buf) return -ENOMEM; ret = nand_read_param_page_op(chip, 0, NULL, 0); @@ -5113,7 +5117,8 @@ static int nand_flash_detect_onfi(struct nand_chip *chip) } for (i = 0; i < 3; i++) { - ret = nand_read_data_op(chip, p, sizeof(*p), true); + p = (struct nand_onfi_params *)&buf[i*pagesize]; + ret = nand_read_data_op(chip, p, pagesize, true); if (ret) { ret = 0; goto free_onfi_param_page; @@ -5126,8 +5131,27 @@ static int nand_flash_detect_onfi(struct nand_chip *chip) } if (i == 3) { - pr_err("Could not find valid ONFI parameter page; aborting\n"); - goto free_onfi_param_page; + int j, k, l; + u8 v, m; + + pr_err("Could not find valid ONFI parameter page\n"); + pr_info("Recover ONFI params with bit-wise majority\n"); + for (j = 0; j < pagesize; j++) { + v = 0; + for (k = 0; k < 8; k++) { + m = 0; + for (l = 0; l < 3; l++) + m += GET_BIT(k, buf[l*pagesize + j]); + if (m > 1) + v |= BIT(k); + } + ((u8 *)p)[j] = v; + } + if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) != + le16_to_cpu(p->crc)) { + pr_err("ONFI parameter recovery failed, aborting\n"); + goto free_onfi_param_page; + } } /* Check version */ -- 1.7.9.5 -Original Message- From: Miquel Raynal [mailto:miquel.ray...@bootlin.com] Sent: Saturday, April 28, 2018 5:07 AM To: Wan, Jane (Nokia - US/Sunnyvale) Cc: dw...@infradead.org; computersforpe...@gmail.com; Bos, Ties (Nokia - US/Sunnyvale) ; linux-...@lists.infradead.org; 
linux-kernel@vger.kernel.org; Boris Brezillon Subject: Re: [PATCH 2/2] Use bit-wise majority to recover the contents of ONFI parameter Hi Jane, Same comments as before, please: get the right maintainers, add a commit log, rebase and fix the title prefix. [Jane] Added. Thanks. Have you ever needed/tried this algorithm before? [Jane] Yes, we got a batch of particularly bad NAND chips recently and we needed these changes to make them work reliably over temperature. The patch was verified using these bad chips. On Thu, 26 Apr 2018 17:19:56 -0700, Jane Wan wrote: > Signed-off-by: Jane Wan > --- > drivers/mtd/nand/nand_base.c | 35 +++ > 1 file changed, 31 insertions(+), 4 deletions(-) > > diff --git a/drivers/mtd/nand/nand_base.c > b/drivers/mtd/nand/nand_base.c index c2e1232..161b523
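The recovery loop in the patch can be exercised as ordinary user-space C. The sketch below reproduces the per-bit majority vote over three parameter-page copies, using the same `GET_BIT` macro and the same back-to-back buffer layout (`copies[l*pagesize + j]`) as the patch; the function name `bitwise_majority` and the standalone framing are mine, not the kernel code.

```c
#include <stdint.h>
#include <stddef.h>

#define GET_BIT(bit, val) (((val) >> (bit)) & 0x01)

/*
 * Recover one parameter page of `pagesize` bytes from 3 copies by
 * per-bit majority vote, as ONFI 4.0 permits when every copy fails
 * its CRC check. `copies` holds the 3 copies back to back.
 */
static void bitwise_majority(const uint8_t *copies, size_t pagesize,
			     uint8_t *out)
{
	for (size_t j = 0; j < pagesize; j++) {
		uint8_t v = 0;

		for (int k = 0; k < 8; k++) {
			int m = 0;

			/* Count how many of the 3 copies have bit k set. */
			for (int l = 0; l < 3; l++)
				m += GET_BIT(k, copies[l * pagesize + j]);
			if (m > 1)	/* at least 2 of 3 agree: bit is 1 */
				v |= (uint8_t)(1U << k);
		}
		out[j] = v;
	}
}
```

Because the vote is taken bit by bit, each copy may carry errors in different positions and the page is still recovered, as long as no single bit is wrong in two copies at once; the recovered page is then re-checked against its CRC, exactly as the patch does.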
[PATCH v3 2/2] nvmem: Add RAVE SP EEPROM driver
Add driver providing access to EEPROMs connected to RAVE SP devices Cc: Srinivas Kandagatla Cc: linux-kernel@vger.kernel.org Cc: Chris Healy Cc: Lucas Stach Cc: Aleksander Morgado Signed-off-by: Andrey Smirnov --- drivers/nvmem/Kconfig | 6 + drivers/nvmem/Makefile | 3 + drivers/nvmem/rave-sp-eeprom.c | 357 + 3 files changed, 366 insertions(+) create mode 100644 drivers/nvmem/rave-sp-eeprom.c diff --git a/drivers/nvmem/Kconfig b/drivers/nvmem/Kconfig index 1090924efdb1..54a3c298247b 100644 --- a/drivers/nvmem/Kconfig +++ b/drivers/nvmem/Kconfig @@ -175,4 +175,10 @@ config NVMEM_SNVS_LPGPR This driver can also be built as a module. If so, the module will be called nvmem-snvs-lpgpr. +config RAVE_SP_EEPROM + tristate "Rave SP EEPROM Support" + depends on RAVE_SP_CORE + help + Say y here to enable Rave SP EEPROM support. + endif diff --git a/drivers/nvmem/Makefile b/drivers/nvmem/Makefile index e54dcfa6565a..27e96a8efd1c 100644 --- a/drivers/nvmem/Makefile +++ b/drivers/nvmem/Makefile @@ -37,3 +37,6 @@ obj-$(CONFIG_MESON_MX_EFUSE) += nvmem_meson_mx_efuse.o nvmem_meson_mx_efuse-y := meson-mx-efuse.o obj-$(CONFIG_NVMEM_SNVS_LPGPR) += nvmem_snvs_lpgpr.o nvmem_snvs_lpgpr-y := snvs_lpgpr.o +obj-$(CONFIG_RAVE_SP_EEPROM) += nvmem-rave-sp-eeprom.o +nvmem-rave-sp-eeprom-y := rave-sp-eeprom.o + diff --git a/drivers/nvmem/rave-sp-eeprom.c b/drivers/nvmem/rave-sp-eeprom.c new file mode 100644 index ..50aeea6ec6cc --- /dev/null +++ b/drivers/nvmem/rave-sp-eeprom.c @@ -0,0 +1,357 @@ +// SPDX-License-Identifier: GPL-2.0+ + +/* + * EEPROM driver for RAVE SP + * + * Copyright (C) 2018 Zodiac Inflight Innovations + * + */ +#include +#include +#include +#include +#include +#include +#include + +/** + * enum rave_sp_eeprom_access_type - Supported types of EEPROM access + * + * @RAVE_SP_EEPROM_WRITE: EEPROM write + * @RAVE_SP_EEPROM_READ: EEPROM read + */ +enum rave_sp_eeprom_access_type { + RAVE_SP_EEPROM_WRITE = 0, + RAVE_SP_EEPROM_READ = 1, +}; + +/** + * enum rave_sp_eeprom_header_size 
- EEPROM command header sizes + * + * @RAVE_SP_EEPROM_HEADER_SMALL: EEPROM header size for "small" devices (< 8K) + * @RAVE_SP_EEPROM_HEADER_BIG: EEPROM header size for "big" devices (> 8K) + */ +enum rave_sp_eeprom_header_size { + RAVE_SP_EEPROM_HEADER_SMALL = 4U, + RAVE_SP_EEPROM_HEADER_BIG = 5U, +}; + +#define RAVE_SP_EEPROM_PAGE_SIZE 32U + +/** + * struct rave_sp_eeprom_page - RAVE SP EEPROM page + * + * @type: Access type (see enum rave_sp_eeprom_access_type) + * @success: Success flag (Success = 1, Failure = 0) + * @data: Read data + * + * Note this structure corresponds to RSP_*_EEPROM payload from RAVE + * SP ICD + */ +struct rave_sp_eeprom_page { + u8 type; + u8 success; + u8 data[RAVE_SP_EEPROM_PAGE_SIZE]; +} __packed; + +/** + * struct rave_sp_eeprom - RAVE SP EEPROM device + * + * @sp: Pointer to parent RAVE SP device + * @mutex: Lock protecting access to EEPROM + * @address: EEPROM device address + * @header_size: Size of EEPROM command header for this device + * @dev: Pointer to corresponding struct device used for logging + */ +struct rave_sp_eeprom { + struct rave_sp *sp; + struct mutex mutex; + u8 address; + unsigned int header_size; + struct device *dev; +}; + +/** + * rave_sp_eeprom_io - Low-level part of EEPROM page access + * + * @eeprom: EEPROM device to write to + * @type: EEPROM access type (read or write) + * @idx: number of the EEPROM page + * @page: Data to write or buffer to store result (via page->data) + * + * This function does all of the low-level work required to perform a + * EEPROM access. This includes formatting correct command payload, + * sending it and checking received results. + * + * Returns zero in case of success or negative error code in + * case of failure. + */ +static int rave_sp_eeprom_io(struct rave_sp_eeprom *eeprom, +enum rave_sp_eeprom_access_type type, +u16 idx, +struct rave_sp_eeprom_page *page) +{ + const bool is_write = type == RAVE_SP_EEPROM_WRITE; + const unsigned int data_size = is_write ? 
sizeof(page->data) : 0; + const unsigned int cmd_size = eeprom->header_size + data_size; + const unsigned int rsp_size = + is_write ? sizeof(*page) - sizeof(page->data) : sizeof(*page); + unsigned int offset = 0; + u8 cmd[cmd_size]; + int ret; + + cmd[offset++] = eeprom->address; + cmd[offset++] = 0; + cmd[offset++] = type; + cmd[offset++] = idx; + + /* +* If there's still room in this command's header it means we +* are talking to EEPROM that uses
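The size arithmetic in rave_sp_eeprom_io() can be checked in isolation. The sketch below re-derives cmd_size and rsp_size from the constants in the patch (page size 32, header sizes 4 and 5); the struct mirrors rave_sp_eeprom_page, but this is an illustrative user-space model, not the kernel code, and `__attribute__((packed))` stands in for the kernel's `__packed`.

```c
#include <stdint.h>
#include <stddef.h>

#define RAVE_SP_EEPROM_PAGE_SIZE 32U

/* Header sizes from the patch: 4 bytes for "small" EEPROMs, 5 for "big". */
enum { HEADER_SMALL = 4U, HEADER_BIG = 5U };

struct rave_sp_eeprom_page {
	uint8_t type;
	uint8_t success;
	uint8_t data[RAVE_SP_EEPROM_PAGE_SIZE];
} __attribute__((packed));

/* Command size: header, plus a page of payload only on writes. */
static size_t cmd_size(size_t header_size, int is_write)
{
	size_t data_size = is_write ? RAVE_SP_EEPROM_PAGE_SIZE : 0;

	return header_size + data_size;
}

/* Response size: full page on reads; on writes only type + success. */
static size_t rsp_size(int is_write)
{
	struct rave_sp_eeprom_page page;

	return is_write ? sizeof(page) - sizeof(page.data) : sizeof(page);
}
```

The asymmetry matches the protocol: a write carries the page data in the command and gets back only a status, while a read sends a bare header and gets the page data back in the response.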
[PATCH v3 0/2] RAVE SP EEPROM driver
Srinivas: This series is a third iteration of the patchset adding NVMEM support for EEPROMs connected to RAVE SP MFD device (support for which landed in 4.15). Changes since [v2]: - Added verbiage about data cells, fixed capital case hex number as well as lack of address in node name in zii,rave-sp-eeprom.txt - Added optional DT property "zii,eeprom-name", to allow giving more descriptive names to NVMEM devices created by the driver Changes since [v1]: - Patchset rebased on latest master from Linus which contains all necessary dependencies - Added sizes.h to include in patch 1/2 to avoid build breaks reported by build-bot - Added missing #size-cells, #address-cells as well as example cell to DT bindings documentation (pointed out by Rob) Feedback is welcome! Thanks, Andrey Smirnov [v2] lkml.kernel.org/r/20180411015948.19562-1-andrew.smir...@gmail.com [v1] lkml.kernel.org/r/20180321134710.29757-1-andrew.smir...@gmail.com Andrey Smirnov (2): dt-bindings: nvmem: Add binding for RAVE SP EEPROM driver nvmem: Add RAVE SP EEPROM driver .../bindings/nvmem/zii,rave-sp-eeprom.txt | 40 +++ drivers/nvmem/Kconfig | 6 + drivers/nvmem/Makefile | 3 + drivers/nvmem/rave-sp-eeprom.c | 357 + 4 files changed, 406 insertions(+) create mode 100644 Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt create mode 100644 drivers/nvmem/rave-sp-eeprom.c -- 2.14.3
[PATCH v3 1/2] dt-bindings: nvmem: Add binding for RAVE SP EEPROM driver
Add Device Tree bindings for RAVE SP EEPROM driver - an MFD cell of parent RAVE SP driver (documented in Documentation/devicetree/bindings/mfd/zii,rave-sp.txt). Cc: Srinivas KandagatlaCc: linux-kernel@vger.kernel.org Cc: Chris Healy Cc: Lucas Stach Cc: Aleksander Morgado Cc: Rob Herring Cc: Mark Rutland Cc: devicet...@vger.kernel.org Signed-off-by: Andrey Smirnov --- .../bindings/nvmem/zii,rave-sp-eeprom.txt | 40 ++ 1 file changed, 40 insertions(+) create mode 100644 Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt diff --git a/Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt b/Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt new file mode 100644 index ..d5e22fc67d66 --- /dev/null +++ b/Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt @@ -0,0 +1,40 @@ +Zodiac Inflight Innovations RAVE EEPROM Bindings + +RAVE SP EEPROM device is a "MFD cell" device exposing physical EEPROM +attached to RAVE Supervisory Processor. It is expected that its Device +Tree node is specified as a child of the node corresponding to the +parent RAVE SP device (as documented in +Documentation/devicetree/bindings/mfd/zii,rave-sp.txt) + +Required properties: + +- compatible: Should be "zii,rave-sp-eeprom" + +Optional properties: + +- zii,eeprom-name: Unique EEPROM identifier describing its function in the + system. Will be used as created NVMEM deivce's name. + +Data cells: + +Data cells are child nodes of eerpom node, bindings for which are +documented in Documentation/bindings/nvmem/nvmem.txt + +Example: + + rave-sp { + compatible = "zii,rave-sp-rdu1"; + current-speed = <38400>; + + eeprom@a4 { + compatible = "zii,rave-sp-eeprom"; + reg = <0xa4 0x4000>; + #address-cells = <1>; + #size-cells = <1>; + zii,eeprom-name = "main-eeprom"; + + wdt_timeout: wdt-timeout@81 { + reg = <0x81 2>; + }; + }; + } -- 2.14.3
[PATCH v3 0/2] RAVE SP EEPROM driver
Srinivas: This series is a third iteration of the patchset adding NVMEM support for EEPROMs connected to RAVE SP MFD device (support for which landed in 4.15). Changes since [v2]: - Added verbiage about data cells, fixed capital-case hex number as well as lack of address in node name in zii,rave-sp-eeprom.txt - Added optional DT property "zii,eeprom-name", to allow giving more descriptive names to NVMEM devices created by the driver Changes since [v1]: - Patchset rebased on latest master from Linus which contains all necessary dependencies - Added sizes.h to include in patch 1/2 to avoid build breaks reported by build-bot - Added missing #size-cells, #address-cells as well as example cell to DT bindings documentation (pointed out by Rob) Feedback is welcome! Thanks, Andrey Smirnov [v2] lkml.kernel.org/r/20180411015948.19562-1-andrew.smir...@gmail.com [v1] lkml.kernel.org/r/20180321134710.29757-1-andrew.smir...@gmail.com Andrey Smirnov (2): dt-bindings: nvmem: Add binding for RAVE SP EEPROM driver nvmem: Add RAVE SP EEPROM driver .../bindings/nvmem/zii,rave-sp-eeprom.txt | 40 +++ drivers/nvmem/Kconfig | 6 + drivers/nvmem/Makefile | 3 + drivers/nvmem/rave-sp-eeprom.c | 357 + 4 files changed, 406 insertions(+) create mode 100644 Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt create mode 100644 drivers/nvmem/rave-sp-eeprom.c -- 2.14.3
[PATCH v3 1/2] dt-bindings: nvmem: Add binding for RAVE SP EEPROM driver
Add Device Tree bindings for RAVE SP EEPROM driver - an MFD cell of parent RAVE SP driver (documented in Documentation/devicetree/bindings/mfd/zii,rave-sp.txt). Cc: Srinivas Kandagatla Cc: linux-kernel@vger.kernel.org Cc: Chris Healy Cc: Lucas Stach Cc: Aleksander Morgado Cc: Rob Herring Cc: Mark Rutland Cc: devicet...@vger.kernel.org Signed-off-by: Andrey Smirnov --- .../bindings/nvmem/zii,rave-sp-eeprom.txt | 40 ++ 1 file changed, 40 insertions(+) create mode 100644 Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt diff --git a/Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt b/Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt new file mode 100644 index ..d5e22fc67d66 --- /dev/null +++ b/Documentation/devicetree/bindings/nvmem/zii,rave-sp-eeprom.txt @@ -0,0 +1,40 @@ +Zodiac Inflight Innovations RAVE EEPROM Bindings + +RAVE SP EEPROM device is a "MFD cell" device exposing physical EEPROM +attached to RAVE Supervisory Processor. It is expected that its Device +Tree node is specified as a child of the node corresponding to the +parent RAVE SP device (as documented in +Documentation/devicetree/bindings/mfd/zii,rave-sp.txt) + +Required properties: + +- compatible: Should be "zii,rave-sp-eeprom" + +Optional properties: + +- zii,eeprom-name: Unique EEPROM identifier describing its function in the + system. Will be used as created NVMEM device's name. + +Data cells: + +Data cells are child nodes of eeprom node, bindings for which are +documented in Documentation/bindings/nvmem/nvmem.txt + +Example: + + rave-sp { + compatible = "zii,rave-sp-rdu1"; + current-speed = <38400>; + + eeprom@a4 { + compatible = "zii,rave-sp-eeprom"; + reg = <0xa4 0x4000>; + #address-cells = <1>; + #size-cells = <1>; + zii,eeprom-name = "main-eeprom"; + + wdt_timeout: wdt-timeout@81 { + reg = <0x81 2>; + }; + }; + } -- 2.14.3
Re: [PATCH v2 2/2] mm: vmalloc: Pass proper vm_start into debugobjects
On 5/1/2018 4:34 AM, Andrew Morton wrote: should check for it and do a WARN_ONCE so it gets fixed. Yes, that idea came up in discussion, but it was suggested to me that it could be intentional. But since you are raising this, I will try to dig once again and share a patch with WARN_ONCE if passing the intermediate 'addr' is absolutely not the right thing to do. Chintan -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
[PATCH v1] clk: qcom: gdsc: Add support to poll CFG register to check GDSC state
From: Amit Nischal The default behavior of the GDSC enable/disable sequence is to poll the status bits of either the actual GDSCR or the corresponding HW_CTRL registers. On targets which have support for a CFG_GDSCR register, the status bits might not show the correct state of the GDSC, especially in the disable sequence, where the status bit will be cleared even before the core is completely power collapsed. On targets with this issue, poll the power on/off bits in the CFG_GDSCR register instead to correctly determine the GDSC state. Signed-off-by: Amit Nischal Signed-off-by: Taniya Das --- drivers/clk/qcom/gdsc.c | 42 ++ drivers/clk/qcom/gdsc.h | 1 + 2 files changed, 27 insertions(+), 16 deletions(-) diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c index cb61c15..2a6b0ff 100644 --- a/drivers/clk/qcom/gdsc.c +++ b/drivers/clk/qcom/gdsc.c @@ -33,6 +33,11 @@ #define GMEM_CLAMP_IO_MASK BIT(0) #define GMEM_RESET_MASKBIT(4) +/* CFG_GDSCR */ +#define GDSC_POWER_UP_COMPLETE BIT(16) +#define GDSC_POWER_DOWN_COMPLETE BIT(15) +#define CFG_GDSCR_OFFSET 0x4 + /* Wait 2^n CXO cycles between all states. Here, n=2 (4 cycles). */ #define EN_REST_WAIT_VAL (0x2 << 20) #define EN_FEW_WAIT_VAL(0x8 << 16) @@ -45,15 +50,28 @@ #define domain_to_gdsc(domain) container_of(domain, struct gdsc, pd) -static int gdsc_is_enabled(struct gdsc *sc, unsigned int reg) +static int gdsc_is_enabled(struct gdsc *sc, bool en) { + unsigned int reg; u32 val; int ret; + if (sc->flags & POLL_CFG_GDSCR) + reg = sc->gdscr + CFG_GDSCR_OFFSET; + else + reg = sc->gds_hw_ctrl ? 
sc->gds_hw_ctrl : sc->gdscr; + ret = regmap_read(sc->regmap, reg, &val); if (ret) return ret; + if (sc->flags & POLL_CFG_GDSCR) { + if (en) + return !!(val & GDSC_POWER_UP_COMPLETE); + else + return !(val & GDSC_POWER_DOWN_COMPLETE); + } + return !!(val & PWR_ON_MASK); } @@ -64,17 +82,17 @@ static int gdsc_hwctrl(struct gdsc *sc, bool en) return regmap_update_bits(sc->regmap, sc->gdscr, HW_CONTROL_MASK, val); } -static int gdsc_poll_status(struct gdsc *sc, unsigned int reg, bool en) +static int gdsc_poll_status(struct gdsc *sc, bool en) { ktime_t start; start = ktime_get(); do { - if (gdsc_is_enabled(sc, reg) == en) + if (gdsc_is_enabled(sc, en) == en) return 0; } while (ktime_us_delta(ktime_get(), start) < TIMEOUT_US); - if (gdsc_is_enabled(sc, reg) == en) + if (gdsc_is_enabled(sc, en) == en) return 0; return -ETIMEDOUT; @@ -84,7 +102,6 @@ static int gdsc_toggle_logic(struct gdsc *sc, bool en) { int ret; u32 val = en ? 0 : SW_COLLAPSE_MASK; - unsigned int status_reg = sc->gdscr; ret = regmap_update_bits(sc->regmap, sc->gdscr, SW_COLLAPSE_MASK, val); if (ret) @@ -101,8 +118,7 @@ static int gdsc_toggle_logic(struct gdsc *sc, bool en) return 0; } - if (sc->gds_hw_ctrl) { - status_reg = sc->gds_hw_ctrl; + if (sc->gds_hw_ctrl) /* * The gds hw controller asserts/de-asserts the status bit soon * after it receives a power on/off request from a master. @@ -114,9 +130,8 @@ static int gdsc_toggle_logic(struct gdsc *sc, bool en) * and polling the status bit. */ udelay(1); - } - return gdsc_poll_status(sc, status_reg, en); + return gdsc_poll_status(sc, en); } static inline int gdsc_deassert_reset(struct gdsc *sc) @@ -240,8 +255,6 @@ static int gdsc_disable(struct generic_pm_domain *domain) /* Turn off HW trigger mode if supported */ if (sc->flags & HW_CTRL) { - unsigned int reg; - ret = gdsc_hwctrl(sc, false); if (ret < 0) return ret; @@ -253,8 +266,7 @@ static int gdsc_disable(struct generic_pm_domain *domain) */ udelay(1); - reg = sc->gds_hw_ctrl ?
sc->gds_hw_ctrl : sc->gdscr; - ret = gdsc_poll_status(sc, reg, true); + ret = gdsc_poll_status(sc, true); if (ret) return ret; } @@ -276,7 +288,6 @@ static int gdsc_init(struct gdsc *sc) { u32 mask, val; int on, ret; - unsigned int reg; /* * Disable HW trigger: collapse/restore occur based on registers writes. @@ -297,8 +308,7 @@ static int gdsc_init(struct gdsc *sc) return ret; } - reg = sc->gds_hw_ctrl ? sc->gds_hw_ctrl : sc->gdscr; - on = gdsc_is_enabled(sc, reg); + on =
Re: [PATCH] clk: qcom: gdsc: Add support to poll CFG register to check GDSC state
Hello Doug, Thanks for the comments, I have based my latest patch on top of the earlier patches (clk-qcom-sdm845 branch of clk-next). On 5/1/2018 12:12 AM, Doug Anderson wrote: Hi, On Fri, Apr 27, 2018 at 1:19 AM, Taniya Das wrote: -static int gdsc_is_enabled(struct gdsc *sc, unsigned int reg) +static int gdsc_is_enabled(struct gdsc *sc, bool en) { + unsigned int reg; u32 val; int ret; + if (sc->flags & POLL_CFG_GDSCR) + reg = sc->gdscr + CFG_GDSCR_OFFSET; + else + reg = sc->gds_hw_ctrl ? sc->gds_hw_ctrl : sc->gdscr; nit: why two spaces after the "?" in this new patch? Should be just one. diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h index 3964834..ac5f844 100644 --- a/drivers/clk/qcom/gdsc.h +++ b/drivers/clk/qcom/gdsc.h @@ -53,6 +53,7 @@ struct gdsc { #define VOTABLEBIT(0) #define CLAMP_IO BIT(1) #define HW_CTRLBIT(2) +#define POLL_CFG_GDSCR BIT(5) This doesn't apply cleanly to clk-next because clk-next already has the old patch #1 and patch #2 from your series. You should have applied your patch to clk-next before sending out. Also a nit here is that you have two spaces before "BIT(5)" but all other entries in this list have a tab before them. You should be consistent and use a tab. In general I'd tend to assume that Stephen could handle this small merge conflict and fixing the whitespace issues when applying, but if he tells you to spin then you certainly should. I'll also say that I'm nowhere near an expert on gdsc but it looks like Stephen's previous comments were addressed and the patch seems sane in general. Stephen: feel free to add my Reviewed-by: if you wish when applying (or Taniya, if you end up spinning). -Doug -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation. --
RE: [PATCH 1/2] Fix FSL NAND driver to read all ONFI parameter pages
Hi Miquèl and Boris, Thank you for your response and feedback. I've modified the fix based on your comments. Please see the updated patch file at the end of this message (also in attachment). My answers to your comments/questions are inline in the previous message. Here is the answer to Boris question in another email thread: > What if some NANDs have 4 or more copies of the param page? [Jane] The ONFI spec defines that the parameter page and its two redundant copies are mandatory. The additional redundant pages are optional. Currently, the FSL NAND driver only reads the first parameter page. This patch is to fix the driver to meet the mandatory requirement in the spec. We got a batch of particularly bad NAND chips recently and we needed these changes to make them work reliably over temperature. The patch was verified using these bad chips. Best regards, Jane Updated patch: From 701de4146aa6355c951e97a77476e12d2da56d42 Mon Sep 17 00:00:00 2001 From: Jane Wan Date: Mon, 30 Apr 2018 13:30:46 -0700 Subject: [PATCH 1/2] mtd: rawnand: fsl_ifc: fix FSL NAND driver to read all ONFI parameter pages Per ONFI specification (Rev. 4.0), if the CRC of the first parameter page read is not valid, the host should read redundant parameter page copies. Fix FSL NAND driver to read the two redundant copies which are mandatory in the specification. Signed-off-by: Jane Wan --- drivers/mtd/nand/raw/fsl_ifc_nand.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/mtd/nand/raw/fsl_ifc_nand.c b/drivers/mtd/nand/raw/fsl_ifc_nand.c index 61aae02..98aac1f 100644 --- a/drivers/mtd/nand/raw/fsl_ifc_nand.c +++ b/drivers/mtd/nand/raw/fsl_ifc_nand.c @@ -342,9 +342,16 @@ static void fsl_ifc_cmdfunc(struct mtd_info *mtd, unsigned int command, case NAND_CMD_READID: case NAND_CMD_PARAM: { + /* +* For READID, read 8 bytes that are currently used. +* For PARAM, read all 3 copies of 256-bytes pages. 
+*/ + int len = 8; int timing = IFC_FIR_OP_RB; - if (command == NAND_CMD_PARAM) + if (command == NAND_CMD_PARAM) { timing = IFC_FIR_OP_RBCD; + len = 256 * 3; + } ifc_out32((IFC_FIR_OP_CW0 << IFC_NAND_FIR0_OP0_SHIFT) | (IFC_FIR_OP_UA << IFC_NAND_FIR0_OP1_SHIFT) | @@ -354,12 +361,8 @@ static void fsl_ifc_cmdfunc(struct mtd_info *mtd, unsigned int command, >ifc_nand.nand_fcr0); ifc_out32(column, >ifc_nand.row3); - /* -* although currently it's 8 bytes for READID, we always read -* the maximum 256 bytes(for PARAM) -*/ - ifc_out32(256, >ifc_nand.nand_fbcr); - ifc_nand_ctrl->read_bytes = 256; + ifc_out32(len, >ifc_nand.nand_fbcr); + ifc_nand_ctrl->read_bytes = len; set_addr(mtd, 0, 0, 0); fsl_ifc_run_command(mtd); -- 1.7.9.5 -Original Message- From: Miquel Raynal [mailto:miquel.ray...@bootlin.com] Sent: Saturday, April 28, 2018 4:42 AM To: Wan, Jane (Nokia - US/Sunnyvale) Cc: dw...@infradead.org; computersforpe...@gmail.com; Bos, Ties (Nokia - US/Sunnyvale) ; linux-...@lists.infradead.org; linux-kernel@vger.kernel.org; Boris Brezillon Subject: Re: [PATCH 1/2] Fix FSL NAND driver to read all ONFI parameter pages Hi Jane, You forgot to Cc the right maintainers, please use ./scripts/get_maintainer.pl for that. [Jane] Added through 4.17-rc1 get_maintainer.pl. I was running the script from an older kernel version we're using. > Signed-off-by: Jane Wan Please add a description of what your are doing in the commit message. The description in the cover letter is good, you can copy the relevant section here. [Jane] Added. > --- > drivers/mtd/nand/fsl_ifc_nand.c | 10 ++ Also, just for you to know, files have moved in a raw/ subdirectory, so please rebase on top of 4.17-rc1 and prefix the commit title with "mtd: rawnand: fsl_ifc:". [Jane] Thank you for the info. I've rebased the change on top of 4.17-rc1 and modified the commit title as you suggested. 
> 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/drivers/mtd/nand/fsl_ifc_nand.c > b/drivers/mtd/nand/fsl_ifc_nand.c index ca36b35..a3cf6ca 100644 > --- a/drivers/mtd/nand/fsl_ifc_nand.c > +++ b/drivers/mtd/nand/fsl_ifc_nand.c > @@ -413,6 +413,7 @@ static void fsl_ifc_cmdfunc(struct mtd_info *mtd, > unsigned int command, > struct fsl_ifc_mtd *priv = chip->priv; > struct fsl_ifc_ctrl *ctrl = priv->ctrl; > struct fsl_ifc_runtime __iomem *ifc = ctrl->rregs; > + int len; > > /* clear the read buffer */ > ifc_nand_ctrl->read_bytes = 0; > @@ -462,11 +463,12 @@ static void fsl_ifc_cmdfunc(struct mtd_info *mtd, >
general protection fault in n_tty_set_termios
Hello, syzbot found the following crash on: HEAD commit:8188fc8bef8c Merge git://git.kernel.org/pub/scm/linux/kerne... git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?id=5093449355231232 kernel config: https://syzkaller.appspot.com/x/.config?id=6493557782959164711 dashboard link: https://syzkaller.appspot.com/bug?extid=ed02be0ad5f26ef4e31b compiler: gcc (GCC) 8.0.1 20180413 (experimental) syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?id=6543533393575936 C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5754063643738112 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+ed02be0ad5f26ef4e...@syzkaller.appspotmail.com kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 4509 Comm: syz-executor654 Not tainted 4.17.0-rc3+ #26 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:n_tty_set_termios+0x2d9/0xe80 drivers/tty/n_tty.c:1782 RSP: 0018:8801b42df698 EFLAGS: 00010203 RAX: 0001 RBX: RCX: 000b RDX: dc00 RSI: RDI: 0005 RBP: 8801b42df6d0 R08: 8801d97aa000 R09: 0002 R10: 8801d97aa888 R11: 8801d97aa000 R12: 8801d9bea500 R13: 8801d9bea8b4 R14: 005d R15: 8801b42df730 FS: 7f082d3d7700() GS:8801daf0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f082d3b5e78 CR3: 0001ac8aa000 CR4: 001406e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: tty_set_termios+0x7a0/0xac0 drivers/tty/tty_ioctl.c:341 set_termios+0x41e/0x7d0 drivers/tty/tty_ioctl.c:414 tty_mode_ioctl+0x855/0xb50 drivers/tty/tty_ioctl.c:749 n_tty_ioctl_helper+0x54/0x3b0 drivers/tty/tty_ioctl.c:940 n_tty_ioctl+0x54/0x320 drivers/tty/n_tty.c:2441 tty_ioctl+0x5e1/0x1870 drivers/tty/tty_io.c:2655 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x16a0 fs/ioctl.c:684 
ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x445d19 RSP: 002b:7f082d3d6da8 EFLAGS: 0297 ORIG_RAX: 0010 RAX: ffda RBX: 006dac3c RCX: 00445d19 RDX: 2040 RSI: 5402 RDI: 0033 RBP: R08: R09: R10: R11: 0297 R12: 006dac38 R13: 6d74702f7665642f R14: 7f082d3d79c0 R15: 0007 Code: 8b 45 d0 31 ff 83 e0 02 89 c6 89 45 d0 e8 50 4a e1 fd 8b 45 d0 4c 89 f1 48 ba 00 00 00 00 00 fc ff df 85 c0 0f 95 c0 48 c1 e9 03 <0f> b6 14 11 4c 89 f1 83 e1 07 38 ca 7f 08 84 d2 0f 85 96 09 00 RIP: n_tty_set_termios+0x2d9/0xe80 drivers/tty/n_tty.c:1782 RSP: 8801b42df698 ---[ end trace b89be7398398fc5c ]--- --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this bug report. If you forgot to add the Reported-by tag, once the fix for this bug is merged into any tree, please reply to this email with: #syz fix: exact-commit-title If you want to test a patch for this bug, please reply with: #syz test: git://repo/address.git branch and provide the patch inline or as an attachment. To mark this as a duplicate of another syzbot report, please reply with: #syz dup: exact-subject-of-another-report If it's a one-off invalid bug report, please reply with: #syz invalid Note: if the crash happens again, it will cause creation of a new bug report. Note: all commands must start from beginning of the line in the email body.
Re: [PATCH] IB/core: Make ib_mad_client_id atomic
On Mon, 30 Apr 2018 13:10:49 -0400 Doug Ledford wrote: Looks good! -Jack > On Mon, 2018-04-30 at 08:49 -0600, Jason Gunthorpe wrote: > > On Mon, Apr 23, 2018 at 10:16:18PM +0300, jackm wrote: > > > > > > > TIDs need to be globally unique on the entire machine. > > > Jason, that is not exactly correct. > > > > The expecation for /dev/umad users is that they all receive locally > > unique TID prefixes. The kernel may be OK to keep things > > port-specific but it is slightly breaking the API we are presenting > > to userspace to allow them to alias.. > > > > Jason > > Would people be happier with this commit message then: > > IB/core: Make ib_mad_client_id atomic > > Currently, the kernel protects access to the agent ID allocator on a > per port basis using a spinlock, so it is impossible for two > apps/threads on the same port to get the same TID, but it is entirely > possible for two threads on different ports to end up with the same > TID. > > As this can be confusing (regardless of it being legal according to > the IB Spec 1.3, C13-18.1.1, in section 13.4.6.4 - TransactionID > usage), and as the rdma-core user space API for /dev/umad devices > implies unique TIDs even across ports, make the TID an atomic type so > that no two allocations, regardless of port number, will be the same. > > Signed-off-by: Håkon Bugge > Reviewed-by: Jack Morgenstein > Reviewed-by: Ira Weiny > Reviewed-by: Zhu Yanjun > Signed-off-by: Doug Ledford > >
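The switch described in the commit message — from a per-port, spinlock-protected allocator to a single atomic counter — can be sketched in a few lines. This is a minimal userspace model, not the kernel code; the names `client_id` and `alloc_tid_prefix` are invented for illustration.

```c
#include <stdatomic.h>

/* Process-wide atomic counter: two allocations can never collide,
 * regardless of which "port" asks, mirroring the patch's change from
 * per-port spinlocked state to one atomic_t. */
static atomic_uint client_id = 1;

/* Atomic fetch-and-add hands out a unique value to every caller,
 * even when callers race from different threads (or ports). */
static unsigned int alloc_tid_prefix(void)
{
	return atomic_fetch_add(&client_id, 1);
}
```

Because the counter is shared rather than per-port, the uniqueness guarantee that rdma-core's /dev/umad API implies holds across ports, not just within one.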
[PATCH] tipc: fix a potential missing-check bug
In tipc_link_xmit(), the member field "len" of l->backlog[imp] must be less than the member field "limit" of l->backlog[imp] when imp is equal to TIPC_SYSTEM_IMPORTANCE. Otherwise, an error code, i.e., -ENOBUFS, is returned. This is enforced by the security check. However, at the end of tipc_link_xmit(), the length of "list" is added to l->backlog[imp].len without any further check. This can potentially cause unexpected values for l->backlog[imp].len. If imp is equal to TIPC_SYSTEM_IMPORTANCE and the original value of l->backlog[imp].len is less than l->backlog[imp].limit, after this addition, l->backlog[imp] could be larger than l->backlog[imp].limit. That means the security check can potentially be bypassed, especially when an adversary can control the length of "list". This patch performs such a check after the modification to l->backlog[imp].len (if imp is TIPC_SYSTEM_IMPORTANCE) to avoid such security issues. An error code will be returned if an unexpected value of l->backlog[imp].len is generated. Signed-off-by: Wenwen Wang --- net/tipc/link.c | 5 + 1 file changed, 5 insertions(+) diff --git a/net/tipc/link.c b/net/tipc/link.c index 695acb7..62972fa 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -948,6 +948,11 @@ int tipc_link_xmit(struct tipc_link *l, struct sk_buff_head *list, continue; } l->backlog[imp].len += skb_queue_len(list); + if (imp == TIPC_SYSTEM_IMPORTANCE && + l->backlog[imp].len >= l->backlog[imp].limit) { + pr_warn("%s<%s>, link overflow", link_rst_msg, l->name); + return -ENOBUFS; + } skb_queue_splice_tail_init(list, backlogq); } l->snd_nxt = seqno; -- 2.7.4
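The pattern the patch adds — re-validating the backlog length against its limit after the update — can be modeled in isolation. This is an illustrative userspace sketch with invented names (`struct backlog`, `backlog_add`); note that the rollback on failure is an addition of this sketch, while the patch as posted returns -ENOBUFS with the length already updated.

```c
#include <errno.h>

/* A bounded backlog whose limit is re-checked *after* the length is
 * updated, as the patch does for TIPC_SYSTEM_IMPORTANCE messages. */
struct backlog {
	int len;
	int limit;
};

static int backlog_add(struct backlog *b, int n)
{
	b->len += n;			/* update first, as tipc_link_xmit() does */
	if (b->len >= b->limit) {	/* the added post-update check */
		b->len -= n;		/* sketch-only rollback, not in the patch */
		return -ENOBUFS;
	}
	return 0;
}
```

The post-update check closes the window in which an adversary-controlled "list" length could push the backlog past its limit without the original pre-check ever seeing it.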
Re: [PATCH][next] pinctrl: actions: Fix Kconfig dependency and help text
On 04/30/2018 08:00 PM, Manivannan Sadhasivam wrote: > 1. Fix Kconfig dependency for Actions Semi S900 pinctrl driver which > generates below warning in x86: > > WARNING: unmet direct dependencies detected for PINCTRL_OWL > Depends on [n]: PINCTRL [=y] && (ARCH_ACTIONS || COMPILE_TEST [=n]) && OF > [=n] > Selected by [y]: > - PINCTRL_S900 [=y] && PINCTRL [=y] > > 2. Add help text for OWL pinctrl driver > > Signed-off-by: Manivannan Sadhasivam Reported-by: Randy Dunlap Tested-by: Randy Dunlap Acked-by: Randy Dunlap Thanks. > --- > drivers/pinctrl/actions/Kconfig | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/pinctrl/actions/Kconfig b/drivers/pinctrl/actions/Kconfig > index 1c7309c90f0d..ede97cdbbc12 100644 > --- a/drivers/pinctrl/actions/Kconfig > +++ b/drivers/pinctrl/actions/Kconfig > @@ -1,12 +1,14 @@ > config PINCTRL_OWL > - bool > + bool "Actions Semi OWL pinctrl driver" > depends on (ARCH_ACTIONS || COMPILE_TEST) && OF > select PINMUX > select PINCONF > select GENERIC_PINCONF > + help > + Say Y here to enable Actions Semi OWL pinctrl driver > > config PINCTRL_S900 > bool "Actions Semi S900 pinctrl driver" > - select PINCTRL_OWL > + depends on PINCTRL_OWL > help > Say Y here to enable Actions Semi S900 pinctrl driver > -- ~Randy
Re: [PATCH 03/10] staging: lustre: lu_object: discard extra lru count.
On Apr 30, 2018, at 21:52, NeilBrown wrote: > > lu_object maintains 2 lru counts. > One is a per-bucket lsb_lru_len. > The other is the per-cpu ls_lru_len_counter. > > The only times the per-bucket counters are use are: > - a debug message when an object is added > - in lu_site_stats_get when all the counters are combined. > > The debug message is not essential, and the per-cpu counter > can be used to get the combined total. > > So discard the per-bucket lsb_lru_len. > > Signed-off-by: NeilBrown Looks reasonable, though it would also be possible to fix the percpu functions rather than adding a workaround in this code. Reviewed-by: Andreas Dilger > --- > drivers/staging/lustre/lustre/obdclass/lu_object.c | 24 > 1 file changed, 9 insertions(+), 15 deletions(-) > > diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c > b/drivers/staging/lustre/lustre/obdclass/lu_object.c > index 2a8a25d6edb5..2bf089817157 100644 > --- a/drivers/staging/lustre/lustre/obdclass/lu_object.c > +++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c > @@ -57,10 +57,6 @@ > #include > > struct lu_site_bkt_data { > - /** > - * number of object in this bucket on the lsb_lru list. > - */ > - longlsb_lru_len; > /** >* LRU list, updated on each access to object. Protected by >* bucket lock of lu_site::ls_obj_hash. > @@ -187,10 +183,9 @@ void lu_object_put(const struct lu_env *env, struct > lu_object *o) > if (!lu_object_is_dying(top)) { > LASSERT(list_empty(>loh_lru)); > list_add_tail(>loh_lru, >lsb_lru); > - bkt->lsb_lru_len++; > percpu_counter_inc(>ls_lru_len_counter); > - CDEBUG(D_INODE, "Add %p to site lru. hash: %p, bkt: %p, > lru_len: %ld\n", > -o, site->ls_obj_hash, bkt, bkt->lsb_lru_len); > + CDEBUG(D_INODE, "Add %p to site lru. 
hash: %p, bkt: %p\n", > +o, site->ls_obj_hash, bkt); > cfs_hash_bd_unlock(site->ls_obj_hash, , 1); > return; > } > @@ -238,7 +233,6 @@ void lu_object_unhash(const struct lu_env *env, struct > lu_object *o) > > list_del_init(>loh_lru); > bkt = cfs_hash_bd_extra_get(obj_hash, ); > - bkt->lsb_lru_len--; > percpu_counter_dec(>ls_lru_len_counter); > } > cfs_hash_bd_del_locked(obj_hash, , >loh_hash); > @@ -422,7 +416,6 @@ int lu_site_purge_objects(const struct lu_env *env, > struct lu_site *s, > cfs_hash_bd_del_locked(s->ls_obj_hash, > , >loh_hash); > list_move(>loh_lru, ); > - bkt->lsb_lru_len--; > percpu_counter_dec(>ls_lru_len_counter); > if (did_sth == 0) > did_sth = 1; > @@ -621,7 +614,6 @@ static struct lu_object *htable_lookup(struct lu_site *s, > lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_HIT); > if (!list_empty(>loh_lru)) { > list_del_init(>loh_lru); > - bkt->lsb_lru_len--; > percpu_counter_dec(>ls_lru_len_counter); > } > return lu_object_top(h); > @@ -1834,19 +1826,21 @@ struct lu_site_stats { > unsigned intlss_busy; > }; > > -static void lu_site_stats_get(struct cfs_hash *hs, > +static void lu_site_stats_get(const struct lu_site *s, > struct lu_site_stats *stats, int populated) > { > + struct cfs_hash *hs = s->ls_obj_hash; > struct cfs_hash_bd bd; > unsigned int i; > + /* percpu_counter_read_positive() won't accept a const pointer */ > + struct lu_site *s2 = (struct lu_site *)s; It would seem worthwhile to change the percpu_counter_read_positive() and percpu_counter_read() arguments to be "const struct percpu_counter *fbc", rather than doing this cast here. I can't see any reason that would be bad, since both implementations just access fbc->count, and do not modify anything. 
> + stats->lss_busy += cfs_hash_size_get(hs) - > + percpu_counter_read_positive(>ls_lru_len_counter); > cfs_hash_for_each_bucket(hs, , i) { > - struct lu_site_bkt_data *bkt = cfs_hash_bd_extra_get(hs, ); > struct hlist_head *hhead; > > cfs_hash_bd_lock(hs, , 1); > - stats->lss_busy += > - cfs_hash_bd_count_get() - bkt->lsb_lru_len; > stats->lss_total += cfs_hash_bd_count_get(); > stats->lss_max_search = max((int)stats->lss_max_search, > cfs_hash_bd_depmax_get()); > @@ -2039,7 +2033,7 @@ int lu_site_stats_print(const struct lu_site *s, struct > seq_file *m) > struct lu_site_stats stats; > > memset(, 0,
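The API change Andreas suggests — const-qualifying the read-side helper so callers holding a const object need no cast — is easy to see in miniature. `struct counter` and `counter_read_positive` below are illustrative stand-ins, not the kernel's percpu_counter interface.

```c
/* A read-only accessor can take a const pointer, because it only loads
 * the count and never writes; callers with a const object then compile
 * without the (struct lu_site *) cast used as a workaround above. */
struct counter {
	long count;
};

static long counter_read_positive(const struct counter *c)
{
	/* clamp negative transient values to zero, as
	 * percpu_counter_read_positive() does */
	return c->count > 0 ? c->count : 0;
}
```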
Re: [lustre-devel] [PATCH 02/10] staging: lustre: make struct lu_site_bkt_data private
On Apr 30, 2018, at 21:52, NeilBrown wrote: > > This data structure only needs to be public so that > various modules can access a wait queue to wait for object > destruction. > If we provide a function to get the wait queue, rather than the > whole bucket, the structure can be made private. > > Signed-off-by: NeilBrown Nice cleanup. Reviewed-by: Andreas Dilger > --- > drivers/staging/lustre/lustre/include/lu_object.h | 36 +- > drivers/staging/lustre/lustre/llite/lcommon_cl.c |8 ++- > drivers/staging/lustre/lustre/lov/lov_object.c |8 ++- > drivers/staging/lustre/lustre/obdclass/lu_object.c | 50 +--- > 4 files changed, 54 insertions(+), 48 deletions(-) > > diff --git a/drivers/staging/lustre/lustre/include/lu_object.h > b/drivers/staging/lustre/lustre/include/lu_object.h > index c3b0ed518819..f29bbca5af65 100644 > --- a/drivers/staging/lustre/lustre/include/lu_object.h > +++ b/drivers/staging/lustre/lustre/include/lu_object.h > @@ -549,31 +549,7 @@ struct lu_object_header { > }; > > struct fld; > - > -struct lu_site_bkt_data { > - /** > - * number of object in this bucket on the lsb_lru list. > - */ > - longlsb_lru_len; > - /** > - * LRU list, updated on each access to object. Protected by > - * bucket lock of lu_site::ls_obj_hash. > - * > - * "Cold" end of LRU is lu_site::ls_lru.next. Accessed object are > - * moved to the lu_site::ls_lru.prev (this is due to the non-existence > - * of list_for_each_entry_safe_reverse()). > - */ > - struct list_headlsb_lru; > - /** > - * Wait-queue signaled when an object in this site is ultimately > - * destroyed (lu_object_free()). It is used by lu_object_find() to > - * wait before re-trying when object in the process of destruction is > - * found in the hash table. > - * > - * \see htable_lookup(). 
> - */ > - wait_queue_head_t lsb_marche_funebre; > -}; > +struct lu_site_bkt_data; > > enum { > LU_SS_CREATED= 0, > @@ -642,14 +618,8 @@ struct lu_site { > struct percpu_counterls_lru_len_counter; > }; > > -static inline struct lu_site_bkt_data * > -lu_site_bkt_from_fid(struct lu_site *site, struct lu_fid *fid) > -{ > - struct cfs_hash_bd bd; > - > - cfs_hash_bd_get(site->ls_obj_hash, fid, ); > - return cfs_hash_bd_extra_get(site->ls_obj_hash, ); > -} > +wait_queue_head_t * > +lu_site_wq_from_fid(struct lu_site *site, struct lu_fid *fid); > > static inline struct seq_server_site *lu_site2seq(const struct lu_site *s) > { > diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c > b/drivers/staging/lustre/lustre/llite/lcommon_cl.c > index df5c0c0ae703..d5b42fb1d601 100644 > --- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c > +++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c > @@ -211,12 +211,12 @@ static void cl_object_put_last(struct lu_env *env, > struct cl_object *obj) > > if (unlikely(atomic_read(>loh_ref) != 1)) { > struct lu_site *site = obj->co_lu.lo_dev->ld_site; > - struct lu_site_bkt_data *bkt; > + wait_queue_head_t *wq; > > - bkt = lu_site_bkt_from_fid(site, >loh_fid); > + wq = lu_site_wq_from_fid(site, >loh_fid); > > init_waitqueue_entry(, current); > - add_wait_queue(>lsb_marche_funebre, ); > + add_wait_queue(wq, ); > > while (1) { > set_current_state(TASK_UNINTERRUPTIBLE); > @@ -226,7 +226,7 @@ static void cl_object_put_last(struct lu_env *env, struct > cl_object *obj) > } > > set_current_state(TASK_RUNNING); > - remove_wait_queue(>lsb_marche_funebre, ); > + remove_wait_queue(wq, ); > } > > cl_object_put(env, obj); > diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c > b/drivers/staging/lustre/lustre/lov/lov_object.c > index f7c69680cb7d..adc90f310fd7 100644 > --- a/drivers/staging/lustre/lustre/lov/lov_object.c > +++ b/drivers/staging/lustre/lustre/lov/lov_object.c > @@ -370,7 +370,7 @@ static void lov_subobject_kill(const 
struct lu_env *env, > struct lov_object *lov, > struct cl_object*sub; > struct lov_layout_raid0 *r0; > struct lu_site*site; > - struct lu_site_bkt_data *bkt; > + wait_queue_head_t *wq; > wait_queue_entry_t*waiter; > > r0 = >u.raid0; > @@ -378,7 +378,7 @@ static void lov_subobject_kill(const struct lu_env *env, > struct lov_object *lov, > > sub = lovsub2cl(los); > site = sub->co_lu.lo_dev->ld_site; > - bkt = lu_site_bkt_from_fid(site, >co_lu.lo_header->loh_fid); > + wq = lu_site_wq_from_fid(site, >co_lu.lo_header->loh_fid); > > cl_object_kill(env, sub); > /* release a reference to the sub-object and ... */ > @@ -391,7 +391,7 @@ static void lov_subobject_kill(const
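The information-hiding pattern this patch applies — a forward declaration plus a narrow accessor, with the full struct definition private to one translation unit — looks like this in miniature. All names here (`struct bucket`, `bucket_waitq`) are invented; in the patch the accessor is lu_site_wq_from_fid() and the hidden member is a wait_queue_head_t.

```c
/* What the public header would carry: callers see only an opaque type
 * and an accessor for the one member they need. */
struct bucket;				/* opaque to callers */
static long *bucket_waitq(struct bucket *b);

/* Private definition, visible only in the implementing .c file. */
struct bucket {
	long lru_len;
	long waitq;			/* stand-in for wait_queue_head_t */
};

static long *bucket_waitq(struct bucket *b)
{
	return &b->waitq;
}
```

Since no caller can name the other members, the layout can later change (for example, to rhashtable-friendly buckets) without touching any user of the accessor.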
Re: [PATCH v2] staging: lustre: llite: fix potential missing-check bug when copying lumv
On Apr 30, 2018, at 16:56, Wenwen Wang wrote: > > In ll_dir_ioctl(), the object lumv3 is firstly copied from the user space > using Its address, i.e., lumv1 = If the lmm_magic field of lumv3 is > LOV_USER_MAGIC_V3, lumv3 will be modified by the second copy from the user > space. The second copy is necessary, because the two versions (i.e., > lov_user_md_v1 and lov_user_md_v3) have different data formats and lengths. > However, given that the user data resides in the user space, a malicious > user-space process can race to change the data between the two copies. By > doing so, the attacker can provide a data with an inconsistent version, > e.g., v1 version + v3 data. This can lead to logical errors in the > following execution in ll_dir_setstripe(), which performs different actions > according to the version specified by the field lmm_magic. > > This patch rechecks the version field lmm_magic in the second copy. If the > version is not as expected, i.e., LOV_USER_MAGIC_V3, an error code will be > returned: -EINVAL. > > Signed-off-by: Wenwen Wang Thanks for the updated patch. Reviewed-by: Andreas Dilger > --- > drivers/staging/lustre/lustre/llite/dir.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/staging/lustre/lustre/llite/dir.c > b/drivers/staging/lustre/lustre/llite/dir.c > index d10d272..80d44ca 100644 > --- a/drivers/staging/lustre/lustre/llite/dir.c > +++ b/drivers/staging/lustre/lustre/llite/dir.c > @@ -1185,6 +1185,8 @@ static long ll_dir_ioctl(struct file *file, unsigned > int cmd, unsigned long arg) > if (lumv1->lmm_magic == LOV_USER_MAGIC_V3) { > if (copy_from_user(, lumv3p, sizeof(lumv3))) > return -EFAULT; > + if (lumv3.lmm_magic != LOV_USER_MAGIC_V3) > + return -EINVAL; > } > > if (is_root_inode(inode)) > -- > 2.7.4 > Cheers, Andreas -- Andreas Dilger Lustre Principal Architect Intel Corporation
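The double-fetch hardening above can be modeled in userspace. The two pointers below simulate what a racing writer could make the two copy_from_user() calls observe; the second fetch is re-validated so a v1 header cannot be smuggled in after the v3 branch was taken. Names and structure are a simplified sketch, not the lustre code.

```c
#include <errno.h>
#include <string.h>

#define MAGIC_V1 0x0bd10bd0u
#define MAGIC_V3 0x0bd30bd0u

struct lum { unsigned int lmm_magic; };

/* fetch1/fetch2 stand in for the two reads of user memory; memcpy
 * stands in for copy_from_user(). */
static int setstripe(const struct lum *fetch1, const struct lum *fetch2)
{
	struct lum lum;

	memcpy(&lum, fetch1, sizeof(lum));		/* first copy */
	if (lum.lmm_magic == MAGIC_V3) {
		memcpy(&lum, fetch2, sizeof(lum));	/* second copy */
		if (lum.lmm_magic != MAGIC_V3)		/* the added re-check */
			return -EINVAL;
	}
	return 0;
}
```

Without the re-check, a caller that flips the magic between the two copies gets the v3 handling applied to v1-shaped data, which is exactly the inconsistency the patch rejects with -EINVAL.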
Re: [PATCH 01/10] staging: lustre: ldlm: store name directly in namespace.
On Apr 30, 2018, at 21:52, NeilBrown wrote: > > Rather than storing the name of a namespace in the > hash table, store it directly in the namespace. > This will allow the hashtable to be changed to use > rhashtable. > > Signed-off-by: NeilBrown Reviewed-by: Andreas Dilger > --- > drivers/staging/lustre/lustre/include/lustre_dlm.h |5 - > drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |5 + > 2 files changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/staging/lustre/lustre/include/lustre_dlm.h > b/drivers/staging/lustre/lustre/include/lustre_dlm.h > index d668d86423a4..b3532adac31c 100644 > --- a/drivers/staging/lustre/lustre/include/lustre_dlm.h > +++ b/drivers/staging/lustre/lustre/include/lustre_dlm.h > @@ -362,6 +362,9 @@ struct ldlm_namespace { > /** Flag indicating if namespace is on client instead of server */ > enum ldlm_side ns_client; > > + /** name of this namespace */ > + char*ns_name; > + > /** Resource hash table for namespace. */ > struct cfs_hash *ns_rs_hash; > > @@ -878,7 +881,7 @@ static inline bool ldlm_has_layout(struct ldlm_lock *lock) > static inline char * > ldlm_ns_name(struct ldlm_namespace *ns) > { > - return ns->ns_rs_hash->hs_name; > + return ns->ns_name; > } > > static inline struct ldlm_namespace * > diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c > b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c > index 6c615b6e9bdc..43bbc5fd94cc 100644 > --- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c > +++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c > @@ -688,6 +688,9 @@ struct ldlm_namespace *ldlm_namespace_new(struct > obd_device *obd, char *name, > ns->ns_obd = obd; > ns->ns_appetite = apt; > ns->ns_client = client; > + ns->ns_name = kstrdup(name, GFP_KERNEL); > + if (!ns->ns_name) > + goto out_hash; > > INIT_LIST_HEAD(>ns_list_chain); > INIT_LIST_HEAD(>ns_unused_list); > @@ -730,6 +733,7 @@ struct ldlm_namespace *ldlm_namespace_new(struct > obd_device *obd, char *name, >
ldlm_namespace_sysfs_unregister(ns); > ldlm_namespace_cleanup(ns, 0); > out_hash: > + kfree(ns->ns_name); > cfs_hash_putref(ns->ns_rs_hash); > out_ns: > kfree(ns); > @@ -993,6 +997,7 @@ void ldlm_namespace_free_post(struct ldlm_namespace *ns) > ldlm_namespace_debugfs_unregister(ns); > ldlm_namespace_sysfs_unregister(ns); > cfs_hash_putref(ns->ns_rs_hash); > + kfree(ns->ns_name); > /* Namespace \a ns should be not on list at this time, otherwise >* this will cause issues related to using freed \a ns in poold >* thread. > > Cheers, Andreas -- Andreas Dilger Lustre Principal Architect Intel Corporation
[PATCH 10/10] staging: lustre: fix error deref in ll_splice_alias().
d_splice_alias() can return an ERR_PTR(). If it does while debugging is
enabled, the following CDEBUG() will dereference that error and crash.
So add appropriate checking, and provide a separate debug message for
the error case.

Reported-by: James Simmons
Fixes: e9d4f0b9f559 ("staging: lustre: llite: use d_splice_alias for directories.")
Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/llite/namei.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 6c9ec462eb41..24a6873d86a2 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -442,11 +442,15 @@ struct dentry *ll_splice_alias(struct inode *inode, struct dentry *de)
 	} else {
 		struct dentry *new = d_splice_alias(inode, de);
 
+		if (IS_ERR(new))
+			CDEBUG(D_DENTRY, "splice inode %p as %pd gives error %lu\n",
+			       inode, de, PTR_ERR(new));
 		if (new)
 			de = new;
 	}
-	CDEBUG(D_DENTRY, "Add dentry %p inode %p refc %d flags %#x\n",
-	       de, d_inode(de), d_count(de), de->d_flags);
+	if (!IS_ERR(de))
+		CDEBUG(D_DENTRY, "Add dentry %p inode %p refc %d flags %#x\n",
+		       de, d_inode(de), d_count(de), de->d_flags);
 	return de;
 }
[PATCH 09/10] staging: lustre: move remaining code from linux-module.c to module.c
There is no longer any need to keep this code separate, and now we can
remove linux-module.c.

Signed-off-by: NeilBrown
---
 .../staging/lustre/include/linux/libcfs/libcfs.h |   4 -
 drivers/staging/lustre/lnet/libcfs/Makefile      |   1 -
 .../lustre/lnet/libcfs/linux/linux-module.c      | 168 ----------------
 drivers/staging/lustre/lnet/libcfs/module.c      | 131 ++++++++++++++
 4 files changed, 131 insertions(+), 173 deletions(-)
 delete mode 100644 drivers/staging/lustre/lnet/libcfs/linux/linux-module.c

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
index 9263e151451b..d420449b620e 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
@@ -138,10 +138,6 @@ struct libcfs_ioctl_handler {
 int libcfs_register_ioctl(struct libcfs_ioctl_handler *hand);
 int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand);
 
-int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
-			 const struct libcfs_ioctl_hdr __user *uparam);
-int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data);
-
 #define _LIBCFS_H
 
 /**
diff --git a/drivers/staging/lustre/lnet/libcfs/Makefile b/drivers/staging/lustre/lnet/libcfs/Makefile
index e6fda27fdabd..e73515789a11 100644
--- a/drivers/staging/lustre/lnet/libcfs/Makefile
+++ b/drivers/staging/lustre/lnet/libcfs/Makefile
@@ -5,7 +5,6 @@ subdir-ccflags-y += -I$(srctree)/drivers/staging/lustre/lustre/include
 obj-$(CONFIG_LNET) += libcfs.o
 
 libcfs-linux-objs := linux-tracefile.o linux-debug.o
-libcfs-linux-objs += linux-module.o
 libcfs-linux-objs += linux-crypto.o
 libcfs-linux-objs += linux-crypto-adler.o
 
diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
deleted file mode 100644
index 954b681f9db7..000000000000
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
+++ /dev/null
@@ -1,168 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.gnu.org/licenses/gpl-2.0.html
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2008, 2010, Oracle and/or its affiliates. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (c) 2012, Intel Corporation.
- */
-/*
- * This file is part of Lustre, http://www.lustre.org/
- * Lustre is a trademark of Sun Microsystems, Inc.
- */
-
-#define DEBUG_SUBSYSTEM S_LNET
-
-#include
-#include
-
-static inline size_t libcfs_ioctl_packlen(struct libcfs_ioctl_data *data)
-{
-	size_t len = sizeof(*data);
-
-	len += cfs_size_round(data->ioc_inllen1);
-	len += cfs_size_round(data->ioc_inllen2);
-	return len;
-}
-
-static inline bool libcfs_ioctl_is_invalid(struct libcfs_ioctl_data *data)
-{
-	if (data->ioc_hdr.ioc_len > BIT(30)) {
-		CERROR("LIBCFS ioctl: ioc_len larger than 1<<30\n");
-		return true;
-	}
-	if (data->ioc_inllen1 > BIT(30)) {
-		CERROR("LIBCFS ioctl: ioc_inllen1 larger than 1<<30\n");
-		return true;
-	}
-	if (data->ioc_inllen2 > BIT(30)) {
-		CERROR("LIBCFS ioctl: ioc_inllen2 larger than 1<<30\n");
-		return true;
-	}
-	if (data->ioc_inlbuf1 && !data->ioc_inllen1) {
-		CERROR("LIBCFS ioctl: inlbuf1 pointer but 0 length\n");
-		return true;
-	}
-	if (data->ioc_inlbuf2 && !data->ioc_inllen2) {
-		CERROR("LIBCFS ioctl: inlbuf2 pointer but 0 length\n");
-		return true;
-	}
-	if (data->ioc_pbuf1 && !data->ioc_plen1) {
-		CERROR("LIBCFS ioctl: pbuf1 pointer but 0 length\n");
-		return true;
-	}
-	if (data->ioc_pbuf2 && !data->ioc_plen2) {
-		CERROR("LIBCFS ioctl: pbuf2 pointer but 0 length\n");
-		return true;
-	}
-	if (data->ioc_plen1 && !data->ioc_pbuf1) {
-		CERROR("LIBCFS ioctl: plen1 nonzero but no pbuf1 pointer\n");
-		return true;
-	}
-	if (data->ioc_plen2 && !data->ioc_pbuf2) {
-		CERROR("LIBCFS ioctl: plen2 nonzero but no pbuf2
[PATCH 07/10] staging: lustre: llite: remove redundant lookup in dump_pgcache
Both the 'next' and the 'show' functions for the dump_page_cache seqfile
perform a lookup based on the current file index. This is needless
duplication.

The reason appears to be that the state that needs to be communicated
from "next" to "show" is two pointers, but seq_file only provides for a
single pointer to be returned from next and passed to show.

So make use of the new 'seq_private' structure to store the extra
pointer. When 'next' (or 'start') finds something, it returns the page
and stores the clob in the private area. 'show' accepts the page as an
argument, and finds the clob where it was stored.

Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/llite/vvp_dev.c | 97 +++++++++----------
 1 file changed, 41 insertions(+), 56 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/vvp_dev.c b/drivers/staging/lustre/lustre/llite/vvp_dev.c
index a2619dc04a7f..39a85e967368 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_dev.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_dev.c
@@ -394,6 +394,7 @@ struct seq_private {
 	struct ll_sb_info	*sbi;
 	struct lu_env		*env;
 	u16			refcheck;
+	struct cl_object	*clob;
 };
 
 static void vvp_pgcache_id_unpack(loff_t pos, struct vvp_pgcache_id *id)
@@ -458,19 +459,20 @@ static struct cl_object *vvp_pgcache_obj(const struct lu_env *env,
 	return NULL;
 }
 
-static loff_t vvp_pgcache_find(const struct lu_env *env,
-			       struct lu_device *dev, loff_t pos)
+static struct page *vvp_pgcache_find(const struct lu_env *env,
+				     struct lu_device *dev,
+				     struct cl_object **clobp, loff_t *pos)
 {
 	struct cl_object *clob;
 	struct lu_site *site;
 	struct vvp_pgcache_id id;
 
 	site = dev->ld_site;
-	vvp_pgcache_id_unpack(pos, &id);
+	vvp_pgcache_id_unpack(*pos, &id);
 
 	while (1) {
 		if (id.vpi_bucket >= CFS_HASH_NHLIST(site->ls_obj_hash))
-			return ~0ULL;
+			return NULL;
 		clob = vvp_pgcache_obj(env, dev, &id);
 		if (clob) {
 			struct inode *inode = vvp_object_inode(clob);
@@ -482,20 +484,22 @@ static loff_t vvp_pgcache_find(const struct lu_env *env,
 			if (nr > 0) {
 				id.vpi_index = vmpage->index;
 				/* Cant support over 16T file */
-				nr = !(vmpage->index > 0xffffffff);
+				if (vmpage->index <= 0xffffffff) {
+					*clobp = clob;
+					*pos = vvp_pgcache_id_pack(&id);
+					return vmpage;
+				}
 				put_page(vmpage);
 			}
 			lu_object_ref_del(&clob->co_lu, "dump", current);
 			cl_object_put(env, clob);
-			if (nr > 0)
-				return vvp_pgcache_id_pack(&id);
 		}
 		/* to the next object. */
 		++id.vpi_depth;
 		id.vpi_depth &= 0xf;
 		if (id.vpi_depth == 0 && ++id.vpi_bucket == 0)
-			return ~0ULL;
+			return NULL;
 		id.vpi_index = 0;
 	}
 }
@@ -538,71 +542,52 @@ static void vvp_pgcache_page_show(const struct lu_env *env,
 static int vvp_pgcache_show(struct seq_file *f, void *v)
 {
 	struct seq_private *priv = f->private;
-	loff_t pos;
-	struct cl_object *clob;
-	struct vvp_pgcache_id id;
-
-	pos = *(loff_t *)v;
-	vvp_pgcache_id_unpack(pos, &id);
-	clob = vvp_pgcache_obj(priv->env, &priv->sbi->ll_cl->cd_lu_dev, &id);
-	if (clob) {
-		struct inode *inode = vvp_object_inode(clob);
-		struct cl_page *page = NULL;
-		struct page *vmpage;
-		int result;
-
-		result = find_get_pages_contig(inode->i_mapping,
-					       id.vpi_index, 1, &vmpage);
-		if (result > 0) {
-			lock_page(vmpage);
-			page = cl_vmpage_page(vmpage, clob);
-			unlock_page(vmpage);
-			put_page(vmpage);
-		}
-
-		seq_printf(f, "%8x@" DFID ": ", id.vpi_index,
-			   PFID(lu_object_fid(&clob->co_lu)));
-		if (page) {
-			vvp_pgcache_page_show(priv->env, f, page);
-			cl_page_put(priv->env, page);
-		} else {
-			seq_puts(f, "missing\n");
-		}
-		lu_object_ref_del(&clob->co_lu, "dump", current);
-
[PATCH 08/10] staging: lustre: move misc-device registration closer to related code.
The ioctl handler for the misc device is in lnet/libcfs/module.c, but it
is registered in lnet/libcfs/linux/linux-module.c. Keeping related code
together makes maintenance easier, so move the code.

Signed-off-by: NeilBrown
---
 .../staging/lustre/include/linux/libcfs/libcfs.h |  2 -
 .../lustre/lnet/libcfs/linux/linux-module.c      | 28 ----------
 drivers/staging/lustre/lnet/libcfs/module.c      | 31 +++++++++-
 3 files changed, 30 insertions(+), 31 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
index 6e7754b2f296..9263e151451b 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs.h
@@ -141,11 +141,9 @@ int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand);
 int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
			 const struct libcfs_ioctl_hdr __user *uparam);
 int libcfs_ioctl_data_adjust(struct libcfs_ioctl_data *data);
-int libcfs_ioctl(unsigned long cmd, void __user *arg);
 
 #define _LIBCFS_H
 
-extern struct miscdevice libcfs_dev;
 /**
  * The path of debug log dump upcall script.
  */
diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
index c8908e816c4c..954b681f9db7 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
@@ -166,31 +166,3 @@ int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
 	kvfree(*hdr_pp);
 	return err;
 }
-
-static long
-libcfs_psdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
-{
-	if (!capable(CAP_SYS_ADMIN))
-		return -EACCES;
-
-	if (_IOC_TYPE(cmd) != IOC_LIBCFS_TYPE ||
-	    _IOC_NR(cmd) < IOC_LIBCFS_MIN_NR ||
-	    _IOC_NR(cmd) > IOC_LIBCFS_MAX_NR) {
-		CDEBUG(D_IOCTL, "invalid ioctl ( type %d, nr %d, size %d )\n",
-		       _IOC_TYPE(cmd), _IOC_NR(cmd), _IOC_SIZE(cmd));
-		return -EINVAL;
-	}
-
-	return libcfs_ioctl(cmd, (void __user *)arg);
-}
-
-static const struct file_operations libcfs_fops = {
-	.owner		= THIS_MODULE,
-	.unlocked_ioctl	= libcfs_psdev_ioctl,
-};
-
-struct miscdevice libcfs_dev = {
-	.minor = MISC_DYNAMIC_MINOR,
-	.name = "lnet",
-	.fops = &libcfs_fops,
-};
diff --git a/drivers/staging/lustre/lnet/libcfs/module.c b/drivers/staging/lustre/lnet/libcfs/module.c
index 4b9acd7bc5cf..3fb150a57f49 100644
--- a/drivers/staging/lustre/lnet/libcfs/module.c
+++ b/drivers/staging/lustre/lnet/libcfs/module.c
@@ -95,7 +95,7 @@ int libcfs_deregister_ioctl(struct libcfs_ioctl_handler *hand)
 }
 EXPORT_SYMBOL(libcfs_deregister_ioctl);
 
-int libcfs_ioctl(unsigned long cmd, void __user *uparam)
+static int libcfs_ioctl(unsigned long cmd, void __user *uparam)
 {
 	struct libcfs_ioctl_data *data = NULL;
 	struct libcfs_ioctl_hdr *hdr;
@@ -161,6 +161,35 @@ int libcfs_ioctl(unsigned long cmd, void __user *uparam)
 	return err;
 }
 
+static long
+libcfs_psdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	if (_IOC_TYPE(cmd) != IOC_LIBCFS_TYPE ||
+	    _IOC_NR(cmd) < IOC_LIBCFS_MIN_NR ||
+	    _IOC_NR(cmd) > IOC_LIBCFS_MAX_NR) {
+		CDEBUG(D_IOCTL, "invalid ioctl ( type %d, nr %d, size %d )\n",
+		       _IOC_TYPE(cmd), _IOC_NR(cmd), _IOC_SIZE(cmd));
+		return -EINVAL;
+	}
+
+	return libcfs_ioctl(cmd, (void __user *)arg);
+}
+
+static const struct file_operations libcfs_fops = {
+	.owner		= THIS_MODULE,
+	.unlocked_ioctl	= libcfs_psdev_ioctl,
+};
+
+struct miscdevice libcfs_dev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "lnet",
+	.fops = &libcfs_fops,
+};
+
 int lprocfs_call_handler(void *data, int write, loff_t *ppos,
			 void __user *buffer, size_t *lenp,
			 int (*handler)(void *data, int write, loff_t pos,
[PATCH 05/10] staging: lustre: fold lu_object_new() into lu_object_find_at()
lu_object_new() duplicates a lot of code that is in lu_object_find_at().
There is no real need for a separate function; it is simpler just to
skip the bits of lu_object_find_at() that we don't want in the
LOC_F_NEW case.

Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 44 ++++------------
 1 file changed, 12 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 93daa52e2535..9721b3af8ea8 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -678,29 +678,6 @@ static void lu_object_limit(const struct lu_env *env, struct lu_device *dev)
			      false);
 }
 
-static struct lu_object *lu_object_new(const struct lu_env *env,
-				       struct lu_device *dev,
-				       const struct lu_fid *f,
-				       const struct lu_object_conf *conf)
-{
-	struct lu_object *o;
-	struct cfs_hash *hs;
-	struct cfs_hash_bd bd;
-
-	o = lu_object_alloc(env, dev, f, conf);
-	if (IS_ERR(o))
-		return o;
-
-	hs = dev->ld_site->ls_obj_hash;
-	cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 1);
-	cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash);
-	cfs_hash_bd_unlock(hs, &bd, 1);
-
-	lu_object_limit(env, dev);
-
-	return o;
-}
-
 /**
  * Much like lu_object_find(), but top level device of object is specifically
  * \a dev rather than top level device of the site. This interface allows
@@ -736,18 +713,18 @@ struct lu_object *lu_object_find_at(const struct lu_env *env,
	 * just alloc and insert directly.
	 */
-	if (conf && conf->loc_flags & LOC_F_NEW)
-		return lu_object_new(env, dev, f, conf);
-
 	s = dev->ld_site;
 	hs = s->ls_obj_hash;
-	cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 0);
-	o = htable_lookup(s, &bd, f, &version);
-	cfs_hash_bd_unlock(hs, &bd, 0);
-	if (!IS_ERR(o) || PTR_ERR(o) != -ENOENT)
-		return o;
+	cfs_hash_bd_get(hs, f, &bd);
+	if (!(conf && conf->loc_flags & LOC_F_NEW)) {
+		cfs_hash_bd_lock(hs, &bd, 0);
+		o = htable_lookup(s, &bd, f, &version);
+		cfs_hash_bd_unlock(hs, &bd, 0);
+		if (!IS_ERR(o) || PTR_ERR(o) != -ENOENT)
+			return o;
+	}
 	/*
	 * Allocate new object. This may result in rather complicated
	 * operations, including fld queries, inode loading, etc.
@@ -760,7 +737,10 @@ struct lu_object *lu_object_find_at(const struct lu_env *env,
 
 	cfs_hash_bd_lock(hs, &bd, 1);
 
-	shadow = htable_lookup(s, &bd, f, &version);
+	if (conf && conf->loc_flags & LOC_F_NEW)
+		shadow = ERR_PTR(-ENOENT);
+	else
+		shadow = htable_lookup(s, &bd, f, &version);
 	if (likely(PTR_ERR(shadow) == -ENOENT)) {
 		cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash);
 		cfs_hash_bd_unlock(hs, &bd, 1);
[PATCH 05/10] staging: lustre: fold lu_object_new() into lu_object_find_at()
lu_object_new() duplicates a lot of code that is in
lu_object_find_at().  There is no real need for a separate function,
it is simpler just to skip the bits of lu_object_find_at() that we
don't want in the LOC_F_NEW case.

Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 44 ++++++------------------
 1 file changed, 12 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 93daa52e2535..9721b3af8ea8 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -678,29 +678,6 @@ static void lu_object_limit(const struct lu_env *env, struct lu_device *dev)
 			      false);
 }
 
-static struct lu_object *lu_object_new(const struct lu_env *env,
-				       struct lu_device *dev,
-				       const struct lu_fid *f,
-				       const struct lu_object_conf *conf)
-{
-	struct lu_object *o;
-	struct cfs_hash *hs;
-	struct cfs_hash_bd bd;
-
-	o = lu_object_alloc(env, dev, f, conf);
-	if (IS_ERR(o))
-		return o;
-
-	hs = dev->ld_site->ls_obj_hash;
-	cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 1);
-	cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash);
-	cfs_hash_bd_unlock(hs, &bd, 1);
-
-	lu_object_limit(env, dev);
-
-	return o;
-}
-
 /**
  * Much like lu_object_find(), but top level device of object is specifically
  * \a dev rather than top level device of the site. This interface allows
@@ -736,18 +713,18 @@ struct lu_object *lu_object_find_at(const struct lu_env *env,
 	 * just alloc and insert directly.
 	 */
-	if (conf && conf->loc_flags & LOC_F_NEW)
-		return lu_object_new(env, dev, f, conf);
-
 	s  = dev->ld_site;
 	hs = s->ls_obj_hash;
-	cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 0);
-	o = htable_lookup(s, &bd, f, &version);
-	cfs_hash_bd_unlock(hs, &bd, 0);
-	if (!IS_ERR(o) || PTR_ERR(o) != -ENOENT)
-		return o;
+	cfs_hash_bd_get(hs, f, &bd);
+	if (!(conf && conf->loc_flags & LOC_F_NEW)) {
+		cfs_hash_bd_lock(hs, &bd, 0);
+		o = htable_lookup(s, &bd, f, &version);
+		cfs_hash_bd_unlock(hs, &bd, 0);
+		if (!IS_ERR(o) || PTR_ERR(o) != -ENOENT)
+			return o;
+	}
 	/*
 	 * Allocate new object. This may result in rather complicated
 	 * operations, including fld queries, inode loading, etc.
@@ -760,7 +737,10 @@ struct lu_object *lu_object_find_at(const struct lu_env *env,
 	cfs_hash_bd_lock(hs, &bd, 1);
 
-	shadow = htable_lookup(s, &bd, f, &version);
+	if (conf && conf->loc_flags & LOC_F_NEW)
+		shadow = ERR_PTR(-ENOENT);
+	else
+		shadow = htable_lookup(s, &bd, f, &version);
 	if (likely(PTR_ERR(shadow) == -ENOENT)) {
 		cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash);
 		cfs_hash_bd_unlock(hs, &bd, 1);
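The pattern this patch folds into lu_object_find_at() — skip the initial
lookup when the caller guarantees the object is new, but still share the
guarded insert path — can be sketched in plain userspace C. This is a
minimal, single-threaded illustration; the hash table, `obj_find`, and
`LOC_F_NEW` stand-ins here are invented for the sketch, not the kernel API:

```c
/* Sketch of find-or-create where a "new" flag skips the initial lookup
 * but the insert is still guarded by a second lookup; in the NEW case a
 * NULL ("-ENOENT") sentinel substitutes for that lookup so the insert
 * path stays shared.  All names are illustrative.
 */
#include <assert.h>
#include <stdlib.h>

#define NBUCKET 8
#define LOC_F_NEW 1

struct obj { int fid; struct obj *next; };
static struct obj *table[NBUCKET];

static struct obj *bucket_lookup(int fid)
{
	struct obj *o;

	for (o = table[fid % NBUCKET]; o; o = o->next)
		if (o->fid == fid)
			return o;
	return NULL;                    /* stands in for ERR_PTR(-ENOENT) */
}

/* find-or-create; *created reports whether a new object was inserted */
static struct obj *obj_find(int fid, int flags, int *created)
{
	struct obj *o, *shadow;

	*created = 0;
	if (!(flags & LOC_F_NEW)) {     /* caller knows it's new: skip lookup */
		o = bucket_lookup(fid);
		if (o)
			return o;
	}

	o = malloc(sizeof(*o));
	o->fid = fid;

	/* re-check before insert; NEW callers assert absence instead */
	shadow = (flags & LOC_F_NEW) ? NULL : bucket_lookup(fid);
	if (!shadow) {
		o->next = table[fid % NBUCKET];
		table[fid % NBUCKET] = o;
		*created = 1;
		return o;
	}
	free(o);                        /* lost the race: keep existing */
	return shadow;
}
```

The point of the fold is visible here: both the NEW and the normal path
end at the same insert-under-check, so no second function is needed.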
[PATCH 02/10] staging: lustre: make struct lu_site_bkt_data private
This data structure only needs to be public so that various modules can
access a wait queue to wait for object destruction.  If we provide a
function to get the wait queue, rather than the whole bucket, the
structure can be made private.

Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/include/lu_object.h  | 36 ++-----------
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |  8 ++-
 drivers/staging/lustre/lustre/lov/lov_object.c     |  8 ++-
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 50 +++++++++++++---
 4 files changed, 54 insertions(+), 48 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
index c3b0ed518819..f29bbca5af65 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -549,31 +549,7 @@ struct lu_object_header {
 };
 
 struct fld;
-
-struct lu_site_bkt_data {
-	/**
-	 * number of object in this bucket on the lsb_lru list.
-	 */
-	long			lsb_lru_len;
-	/**
-	 * LRU list, updated on each access to object. Protected by
-	 * bucket lock of lu_site::ls_obj_hash.
-	 *
-	 * "Cold" end of LRU is lu_site::ls_lru.next. Accessed object are
-	 * moved to the lu_site::ls_lru.prev (this is due to the non-existence
-	 * of list_for_each_entry_safe_reverse()).
-	 */
-	struct list_head	lsb_lru;
-	/**
-	 * Wait-queue signaled when an object in this site is ultimately
-	 * destroyed (lu_object_free()). It is used by lu_object_find() to
-	 * wait before re-trying when object in the process of destruction is
-	 * found in the hash table.
-	 *
-	 * \see htable_lookup().
-	 */
-	wait_queue_head_t	lsb_marche_funebre;
-};
+struct lu_site_bkt_data;
 
 enum {
 	LU_SS_CREATED	= 0,
@@ -642,14 +618,8 @@ struct lu_site {
 	struct percpu_counter	ls_lru_len_counter;
 };
 
-static inline struct lu_site_bkt_data *
-lu_site_bkt_from_fid(struct lu_site *site, struct lu_fid *fid)
-{
-	struct cfs_hash_bd bd;
-
-	cfs_hash_bd_get(site->ls_obj_hash, fid, &bd);
-	return cfs_hash_bd_extra_get(site->ls_obj_hash, &bd);
-}
+wait_queue_head_t *
+lu_site_wq_from_fid(struct lu_site *site, struct lu_fid *fid);
 
 static inline struct seq_server_site *lu_site2seq(const struct lu_site *s)
 {
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index df5c0c0ae703..d5b42fb1d601 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -211,12 +211,12 @@ static void cl_object_put_last(struct lu_env *env, struct cl_object *obj)
 
 	if (unlikely(atomic_read(&header->loh_ref) != 1)) {
 		struct lu_site *site = obj->co_lu.lo_dev->ld_site;
-		struct lu_site_bkt_data *bkt;
+		wait_queue_head_t *wq;
 
-		bkt = lu_site_bkt_from_fid(site, &header->loh_fid);
+		wq = lu_site_wq_from_fid(site, &header->loh_fid);
 
 		init_waitqueue_entry(&waiter, current);
-		add_wait_queue(&bkt->lsb_marche_funebre, &waiter);
+		add_wait_queue(wq, &waiter);
 
 		while (1) {
 			set_current_state(TASK_UNINTERRUPTIBLE);
@@ -226,7 +226,7 @@ static void cl_object_put_last(struct lu_env *env, struct cl_object *obj)
 		}
 
 		set_current_state(TASK_RUNNING);
-		remove_wait_queue(&bkt->lsb_marche_funebre, &waiter);
+		remove_wait_queue(wq, &waiter);
 	}
 
 	cl_object_put(env, obj);
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index f7c69680cb7d..adc90f310fd7 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -370,7 +370,7 @@ static void lov_subobject_kill(const struct lu_env *env, struct lov_object *lov,
 	struct cl_object	*sub;
 	struct lov_layout_raid0	*r0;
 	struct lu_site		*site;
-	struct lu_site_bkt_data	*bkt;
+	wait_queue_head_t	*wq;
 	wait_queue_entry_t	*waiter;
 
 	r0 = &lov->u.raid0;
@@ -378,7 +378,7 @@ static void lov_subobject_kill(const struct lu_env *env, struct lov_object *lov,
 
 	sub  = lovsub2cl(los);
 	site = sub->co_lu.lo_dev->ld_site;
-	bkt  = lu_site_bkt_from_fid(site, &sub->co_lu.lo_header->loh_fid);
+	wq   = lu_site_wq_from_fid(site, &sub->co_lu.lo_header->loh_fid);
 
 	cl_object_kill(env, sub);
 	/* release a reference to the sub-object and ... */
@@ -391,7 +391,7 @@ static void lov_subobject_kill(const struct lu_env *env, struct lov_object *lov,
 	if (r0->lo_sub[idx] == los) {
 		waiter = &lov_env_info(env)->lti_waiter;
 		init_waitqueue_entry(waiter, current);
-
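The information-hiding move in this patch is a standard C idiom: the
header keeps only a forward declaration plus an accessor, so the structure
layout becomes private to one compilation unit. A minimal userspace sketch
(the `bkt_data`/`site_counter_from_id` names are invented for illustration,
not the kernel API):

```c
/* Opaque-struct pattern: callers get the one member they need through an
 * accessor; the layout is private to the implementation file.
 */
#include <assert.h>

/* what would remain in the public header: */
struct bkt_data;                         /* only a forward declaration */
int *site_counter_from_id(int id);       /* accessor replacing direct access */

/* private to the implementation file: */
struct bkt_data {
	int counter;                     /* the one member callers needed */
	int internals[4];                /* details no longer exposed */
};

static struct bkt_data buckets[4];

int *site_counter_from_id(int id)
{
	return &buckets[id % 4].counter; /* hand out only the needed member */
}
```

Callers can no longer reach `internals`, so the structure can change
freely without touching any other module, which is exactly why
lu_site_wq_from_fid() lets struct lu_site_bkt_data move out of the header.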
[PATCH 06/10] staging: lustre: llite: use more private data in dump_pgcache
The dump_page_cache debugfs file allocates and frees an 'env' in each
call to vvp_pgcache_start,next,show.  This is likely to be fast, but
does introduce the need to check for errors.

It is reasonable to allocate a single 'env' when the file is opened,
and use that throughout.

So create a 'seq_private' structure which stores the sbi, env, and
refcheck, and attach this to the seqfile.  Then use it throughout
instead of allocating 'env' repeatedly.

Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/llite/vvp_dev.c | 150 ++++++++++++-------------
 1 file changed, 72 insertions(+), 78 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/vvp_dev.c b/drivers/staging/lustre/lustre/llite/vvp_dev.c
index 987c03b058e6..a2619dc04a7f 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_dev.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_dev.c
@@ -390,6 +390,12 @@ struct vvp_pgcache_id {
 	struct lu_object_header *vpi_obj;
 };
 
+struct seq_private {
+	struct ll_sb_info	*sbi;
+	struct lu_env		*env;
+	u16			refcheck;
+};
+
 static void vvp_pgcache_id_unpack(loff_t pos, struct vvp_pgcache_id *id)
 {
 	BUILD_BUG_ON(sizeof(pos) != sizeof(__u64));
@@ -531,95 +537,71 @@ static void vvp_pgcache_page_show(const struct lu_env *env,
 
 static int vvp_pgcache_show(struct seq_file *f, void *v)
 {
+	struct seq_private *priv = f->private;
 	loff_t pos;
-	struct ll_sb_info *sbi;
 	struct cl_object *clob;
-	struct lu_env *env;
 	struct vvp_pgcache_id id;
-	u16 refcheck;
-	int result;
 
-	env = cl_env_get(&refcheck);
-	if (!IS_ERR(env)) {
-		pos = *(loff_t *)v;
-		vvp_pgcache_id_unpack(pos, &id);
-		sbi = f->private;
-		clob = vvp_pgcache_obj(env, &sbi->ll_cl->cd_lu_dev, &id);
-		if (clob) {
-			struct inode *inode = vvp_object_inode(clob);
-			struct cl_page *page = NULL;
-			struct page *vmpage;
-
-			result = find_get_pages_contig(inode->i_mapping,
-						       id.vpi_index, 1,
-						       &vmpage);
-			if (result > 0) {
-				lock_page(vmpage);
-				page = cl_vmpage_page(vmpage, clob);
-				unlock_page(vmpage);
-				put_page(vmpage);
-			}
+	pos = *(loff_t *)v;
+	vvp_pgcache_id_unpack(pos, &id);
+	clob = vvp_pgcache_obj(priv->env, &priv->sbi->ll_cl->cd_lu_dev, &id);
+	if (clob) {
+		struct inode *inode = vvp_object_inode(clob);
+		struct cl_page *page = NULL;
+		struct page *vmpage;
+		int result;
+
+		result = find_get_pages_contig(inode->i_mapping,
+					       id.vpi_index, 1, &vmpage);
+		if (result > 0) {
+			lock_page(vmpage);
+			page = cl_vmpage_page(vmpage, clob);
+			unlock_page(vmpage);
+			put_page(vmpage);
+		}
 
-			seq_printf(f, "%8x@" DFID ": ", id.vpi_index,
-				   PFID(lu_object_fid(&clob->co_lu)));
-			if (page) {
-				vvp_pgcache_page_show(env, f, page);
-				cl_page_put(env, page);
-			} else {
-				seq_puts(f, "missing\n");
-			}
-			lu_object_ref_del(&clob->co_lu, "dump", current);
-			cl_object_put(env, clob);
+		seq_printf(f, "%8x@" DFID ": ", id.vpi_index,
+			   PFID(lu_object_fid(&clob->co_lu)));
+		if (page) {
+			vvp_pgcache_page_show(priv->env, f, page);
+			cl_page_put(priv->env, page);
 		} else {
-			seq_printf(f, "%llx missing\n", pos);
+			seq_puts(f, "missing\n");
 		}
-		cl_env_put(env, &refcheck);
-		result = 0;
+		lu_object_ref_del(&clob->co_lu, "dump", current);
+		cl_object_put(priv->env, clob);
 	} else {
-		result = PTR_ERR(env);
+		seq_printf(f, "%llx missing\n", pos);
 	}
-	return result;
+	return 0;
 }
 
 static void *vvp_pgcache_start(struct seq_file *f, loff_t *pos)
 {
-	struct ll_sb_info *sbi;
-	struct lu_env *env;
-	u16 refcheck;
-
-	sbi = f->private;
+	struct seq_private *priv = f->private;
 
-	env = cl_env_get(&refcheck);
-	if (!IS_ERR(env)) {
-		sbi = f->private;
-
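The allocate-once pattern the patch applies can be sketched outside the
kernel: acquire the expensive context at open time, stash it in the
iterator's private data, and release it once at close, so the per-record
show step has no allocation and no error path. A hypothetical userspace
analog (`env_get`/`env_put` stand in for `cl_env_get`/`cl_env_put`; all
names are invented for the sketch):

```c
/* Userspace analog of the seq_file change: one context per open, reused
 * by every show call, released once.
 */
#include <assert.h>
#include <stdlib.h>

struct env { int handle; };

struct seq_private {
	struct env *env;     /* acquired once at open */
	int shows;           /* how many records were emitted */
};

static struct env *env_get(void)
{
	struct env *e = malloc(sizeof(*e));

	e->handle = 1;
	return e;
}

static void env_put(struct env *e) { free(e); }

static struct seq_private *dump_open(void)
{
	struct seq_private *p = malloc(sizeof(*p));

	p->env = env_get();   /* the only place that can now fail */
	p->shows = 0;
	return p;
}

static int dump_show(struct seq_private *p, int record)
{
	/* uses p->env; no per-call env_get()/error handling needed */
	p->shows += p->env->handle;
	(void)record;
	return 0;
}

static void dump_release(struct seq_private *p)
{
	env_put(p->env);      /* matching single release */
	free(p);
}
```

Moving acquisition to open also means show can unconditionally return 0,
which is exactly the simplification visible in the diff above.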
[PATCH 04/10] staging: lustre: lu_object: move retry logic inside htable_lookup
The current retry logic, to wait when a 'dying' object is found, spans
multiple functions.  The process is attached to a waitqueue and set
TASK_UNINTERRUPTIBLE in htable_lookup, and this status is passed back
through lu_object_find_try() to lu_object_find_at() where schedule()
is called and the process is removed from the queue.

This can be simplified by moving all the logic (including hashtable
locking) inside htable_lookup(), which now never returns EAGAIN.

Note that htable_lookup() is called with the hash bucket lock held,
and will drop and retake it if it needs to schedule.

I made this a 'goto' loop rather than a 'while(1)' loop as the diff
is easier to read.

Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 73 ++++++++++------------
 1 file changed, 27 insertions(+), 46 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 2bf089817157..93daa52e2535 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -586,16 +586,21 @@ EXPORT_SYMBOL(lu_object_print);
 static struct lu_object *htable_lookup(struct lu_site *s,
 				       struct cfs_hash_bd *bd,
 				       const struct lu_fid *f,
-				       wait_queue_entry_t *waiter,
 				       __u64 *version)
 {
+	struct cfs_hash *hs = s->ls_obj_hash;
 	struct lu_site_bkt_data *bkt;
 	struct lu_object_header *h;
 	struct hlist_node *hnode;
-	__u64 ver = cfs_hash_bd_version_get(bd);
+	__u64 ver;
+	wait_queue_entry_t waiter;
 
-	if (*version == ver)
+retry:
+	ver = cfs_hash_bd_version_get(bd);
+
+	if (*version == ver) {
 		return ERR_PTR(-ENOENT);
+	}
 
 	*version = ver;
 	bkt = cfs_hash_bd_extra_get(s->ls_obj_hash, bd);
@@ -625,11 +630,15 @@ static struct lu_object *htable_lookup(struct lu_site *s,
 	 * drained), and moreover, lookup has to wait until object is freed.
 	 */
-	init_waitqueue_entry(waiter, current);
-	add_wait_queue(&bkt->lsb_marche_funebre, waiter);
+	init_waitqueue_entry(&waiter, current);
+	add_wait_queue(&bkt->lsb_marche_funebre, &waiter);
 	set_current_state(TASK_UNINTERRUPTIBLE);
 	lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_DEATH_RACE);
-	return ERR_PTR(-EAGAIN);
+	cfs_hash_bd_unlock(hs, bd, 1);
+	schedule();
+	remove_wait_queue(&bkt->lsb_marche_funebre, &waiter);
+	cfs_hash_bd_lock(hs, bd, 1);
+	goto retry;
 }
 
 /**
@@ -693,13 +702,14 @@ static struct lu_object *lu_object_new(const struct lu_env *env,
 }
 
 /**
- * Core logic of lu_object_find*() functions.
+ * Much like lu_object_find(), but top level device of object is specifically
+ * \a dev rather than top level device of the site. This interface allows
+ * objects of different "stacking" to be created within the same site.
 */
-static struct lu_object *lu_object_find_try(const struct lu_env *env,
-					    struct lu_device *dev,
-					    const struct lu_fid *f,
-					    const struct lu_object_conf *conf,
-					    wait_queue_entry_t *waiter)
+struct lu_object *lu_object_find_at(const struct lu_env *env,
+				    struct lu_device *dev,
+				    const struct lu_fid *f,
+				    const struct lu_object_conf *conf)
 {
 	struct lu_object *o;
 	struct lu_object *shadow;
@@ -725,17 +735,16 @@ static struct lu_object *lu_object_find_try(const struct lu_env *env,
 	 * It is unnecessary to perform lookup-alloc-lookup-insert, instead,
 	 * just alloc and insert directly.
 	 *
-	 * If dying object is found during index search, add @waiter to the
-	 * site wait-queue and return ERR_PTR(-EAGAIN).
 	 */
 	if (conf && conf->loc_flags & LOC_F_NEW)
 		return lu_object_new(env, dev, f, conf);
 
 	s  = dev->ld_site;
 	hs = s->ls_obj_hash;
-	cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 1);
-	o = htable_lookup(s, &bd, f, waiter, &version);
-	cfs_hash_bd_unlock(hs, &bd, 1);
+	cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 0);
+	o = htable_lookup(s, &bd, f, &version);
+	cfs_hash_bd_unlock(hs, &bd, 0);
+
 	if (!IS_ERR(o) || PTR_ERR(o) != -ENOENT)
 		return o;
 
@@ -751,7 +760,7 @@ static struct lu_object *lu_object_find_try(const struct lu_env *env,
 
 	cfs_hash_bd_lock(hs, &bd, 1);
 
-	shadow = htable_lookup(s, &bd, f, waiter, &version);
+	shadow = htable_lookup(s, &bd, f, &version);
 	if (likely(PTR_ERR(shadow) == -ENOENT)) {
 		cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash);
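The restructuring can be sketched single-threaded in userspace: the
wait-and-retry now lives entirely inside the lookup, which drops the
"lock", waits for the dying object, retakes the lock, and retries via
goto, instead of handing -EAGAIN back to the caller. Here `wait_for_free()`
stands in for the waitqueue plus schedule(); the table, flags, and names
are all invented for the sketch:

```c
/* Sketch of a lookup that retries internally when it finds a dying
 * entry, rather than returning an EAGAIN-style error to its caller.
 */
#include <assert.h>
#include <stddef.h>

struct obj { int fid; int dying; };

static int lock_held;
static int retries;

static void wait_for_free(struct obj *o)
{
	o->dying = 0;                   /* pretend the free completed while we slept */
}

static struct obj *table_lookup(struct obj *tbl, size_t n, int fid)
{
	size_t i;

retry:
	for (i = 0; i < n; i++) {
		if (tbl[i].fid != fid)
			continue;
		if (!tbl[i].dying)
			return &tbl[i];
		lock_held = 0;          /* like cfs_hash_bd_unlock() */
		wait_for_free(&tbl[i]); /* like schedule() until lu_object_free() */
		lock_held = 1;          /* like cfs_hash_bd_lock() */
		retries++;
		goto retry;             /* caller never sees -EAGAIN */
	}
	return NULL;
}
```

As in the patch, the function is entered and left with the lock held,
and the drop/retake is invisible to every caller.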
[PATCH 03/10] staging: lustre: lu_object: discard extra lru count.
lu_object maintains 2 lru counts.  One is a per-bucket lsb_lru_len.
The other is the per-cpu ls_lru_len_counter.

The only times the per-bucket counters are used are:
 - a debug message when an object is added
 - in lu_site_stats_get when all the counters are combined.

The debug message is not essential, and the per-cpu counter can be
used to get the combined total.  So discard the per-bucket
lsb_lru_len.

Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 24 ++++++++-------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 2a8a25d6edb5..2bf089817157 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -57,10 +57,6 @@
 #include
 
 struct lu_site_bkt_data {
-	/**
-	 * number of object in this bucket on the lsb_lru list.
-	 */
-	long			lsb_lru_len;
 	/**
 	 * LRU list, updated on each access to object. Protected by
 	 * bucket lock of lu_site::ls_obj_hash.
@@ -187,10 +183,9 @@ void lu_object_put(const struct lu_env *env, struct lu_object *o)
 		if (!lu_object_is_dying(top)) {
 			LASSERT(list_empty(&top->loh_lru));
 			list_add_tail(&top->loh_lru, &bkt->lsb_lru);
-			bkt->lsb_lru_len++;
 			percpu_counter_inc(&site->ls_lru_len_counter);
-			CDEBUG(D_INODE, "Add %p to site lru. hash: %p, bkt: %p, lru_len: %ld\n",
-			       o, site->ls_obj_hash, bkt, bkt->lsb_lru_len);
+			CDEBUG(D_INODE, "Add %p to site lru. hash: %p, bkt: %p\n",
+			       o, site->ls_obj_hash, bkt);
 			cfs_hash_bd_unlock(site->ls_obj_hash, &bd, 1);
 			return;
 		}
@@ -238,7 +233,6 @@ void lu_object_unhash(const struct lu_env *env, struct lu_object *o)
 			list_del_init(&top->loh_lru);
 			bkt = cfs_hash_bd_extra_get(obj_hash, &bd);
-			bkt->lsb_lru_len--;
 			percpu_counter_dec(&site->ls_lru_len_counter);
 		}
 		cfs_hash_bd_del_locked(obj_hash, &bd, &top->loh_hash);
@@ -422,7 +416,6 @@ int lu_site_purge_objects(const struct lu_env *env, struct lu_site *s,
 			cfs_hash_bd_del_locked(s->ls_obj_hash,
 					       &bd2, &h->loh_hash);
 			list_move(&h->loh_lru, &dispose);
-			bkt->lsb_lru_len--;
 			percpu_counter_dec(&s->ls_lru_len_counter);
 			if (did_sth == 0)
 				did_sth = 1;
@@ -621,7 +614,6 @@ static struct lu_object *htable_lookup(struct lu_site *s,
 	lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_HIT);
 	if (!list_empty(&h->loh_lru)) {
 		list_del_init(&h->loh_lru);
-		bkt->lsb_lru_len--;
 		percpu_counter_dec(&s->ls_lru_len_counter);
 	}
 	return lu_object_top(h);
@@ -1834,19 +1826,21 @@ struct lu_site_stats {
 	unsigned int	lss_busy;
 };
 
-static void lu_site_stats_get(struct cfs_hash *hs,
+static void lu_site_stats_get(const struct lu_site *s,
 			      struct lu_site_stats *stats, int populated)
 {
+	struct cfs_hash *hs = s->ls_obj_hash;
 	struct cfs_hash_bd bd;
 	unsigned int i;
+	/* percpu_counter_read_positive() won't accept a const pointer */
+	struct lu_site *s2 = (struct lu_site *)s;
 
+	stats->lss_busy += cfs_hash_size_get(hs) -
+		percpu_counter_read_positive(&s2->ls_lru_len_counter);
 	cfs_hash_for_each_bucket(hs, &bd, i) {
-		struct lu_site_bkt_data *bkt = cfs_hash_bd_extra_get(hs, &bd);
 		struct hlist_head *hhead;
 
 		cfs_hash_bd_lock(hs, &bd, 1);
-		stats->lss_busy +=
-			cfs_hash_bd_count_get(&bd) - bkt->lsb_lru_len;
 		stats->lss_total += cfs_hash_bd_count_get(&bd);
 		stats->lss_max_search = max((int)stats->lss_max_search,
 					    cfs_hash_bd_depmax_get(&bd));
@@ -2039,7 +2033,7 @@ int lu_site_stats_print(const struct lu_site *s, struct seq_file *m)
 	struct lu_site_stats stats;
 
 	memset(&stats, 0, sizeof(stats));
-	lu_site_stats_get(s->ls_obj_hash, &stats, 1);
+	lu_site_stats_get(s, &stats, 1);
 
 	seq_printf(m, "%d/%d %d/%ld %d %d %d %d %d %d %d\n",
 		   stats.lss_busy,
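The accounting change can be sketched in userspace: with one aggregate
counter for LRU length (approximated here by a small per-"cpu" array that
is summed on read), the per-bucket copies become redundant, and
busy = total - lru_len needs no bucket walk. Counter names and sizes are
invented for the sketch:

```c
/* Sketch of replacing per-bucket LRU counts with one summed counter. */
#include <assert.h>

#define NCPU 4

static long lru_len[NCPU];        /* stand-in for the percpu counter */
static long total_objects;

static void lru_add(int cpu) { lru_len[cpu]++; }
static void lru_del(int cpu) { lru_len[cpu]--; }

static long lru_len_read(void)
{
	long sum = 0;
	int i;

	for (i = 0; i < NCPU; i++)
		sum += lru_len[i];
	return sum > 0 ? sum : 0; /* like percpu_counter_read_positive() */
}

/* what the stats path can now compute without per-bucket counts */
static long busy_count(void)
{
	return total_objects - lru_len_read();
}
```

The trade-off is the one the commit message relies on: reads sum over all
counters, but updates touch only one, and the stats path needed only the
combined total anyway.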
[PATCH 00/10] staging: lustre: assorted improvements.
First 6 patches are clean-up patches that I pulled out of my rhashtable series. I think these stand alone as good cleanups, and having them upstream makes the rhashtable series shorter to ease further review.

Second 2 are revised versions of patches I sent previously that had conflicts with other patches that landed first.

Last is a bugfix for an issue James mentioned a while back.

Thanks,
NeilBrown

---

NeilBrown (10):
      staging: lustre: ldlm: store name directly in namespace.
      staging: lustre: make struct lu_site_bkt_data private
      staging: lustre: lu_object: discard extra lru count.
      staging: lustre: lu_object: move retry logic inside htable_lookup
      staging: lustre: fold lu_object_new() into lu_object_find_at()
      staging: lustre: llite: use more private data in dump_pgcache
      staging: lustre: llite: remove redundant lookup in dump_pgcache
      staging: lustre: move misc-device registration closer to related code.
      staging: lustre: move remaining code from linux-module.c to module.c
      staging: lustre: fix error deref in ll_splice_alias().

 .../staging/lustre/include/linux/libcfs/libcfs.h   |    6 -
 drivers/staging/lustre/lnet/libcfs/Makefile        |    1
 .../lustre/lnet/libcfs/linux/linux-module.c        |  196
 drivers/staging/lustre/lnet/libcfs/module.c        |  162 -
 drivers/staging/lustre/lustre/include/lu_object.h  |   36
 drivers/staging/lustre/lustre/include/lustre_dlm.h |    5 -
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    5 +
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |    8 -
 drivers/staging/lustre/lustre/llite/namei.c        |    8 +
 drivers/staging/lustre/lustre/llite/vvp_dev.c      |  169 -
 drivers/staging/lustre/lustre/lov/lov_object.c     |    8 -
 drivers/staging/lustre/lustre/obdclass/lu_object.c |  169 -
 12 files changed, 341 insertions(+), 432 deletions(-)
 delete mode 100644 drivers/staging/lustre/lnet/libcfs/linux/linux-module.c

--
Signature
[PATCH 01/10] staging: lustre: ldlm: store name directly in namespace.
Rather than storing the name of a namespace in the hash table, store it directly in the namespace. This will allow the hashtable to be changed to use rhashtable.

Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lustre/include/lustre_dlm.h |    5 -
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    5 +
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_dlm.h b/drivers/staging/lustre/lustre/include/lustre_dlm.h
index d668d86423a4..b3532adac31c 100644
--- a/drivers/staging/lustre/lustre/include/lustre_dlm.h
+++ b/drivers/staging/lustre/lustre/include/lustre_dlm.h
@@ -362,6 +362,9 @@ struct ldlm_namespace {
 	/** Flag indicating if namespace is on client instead of server */
 	enum ldlm_side		ns_client;

+	/** name of this namespace */
+	char			*ns_name;
+
 	/** Resource hash table for namespace. */
 	struct cfs_hash		*ns_rs_hash;

@@ -878,7 +881,7 @@ static inline bool ldlm_has_layout(struct ldlm_lock *lock)
 static inline char *
 ldlm_ns_name(struct ldlm_namespace *ns)
 {
-	return ns->ns_rs_hash->hs_name;
+	return ns->ns_name;
 }

 static inline struct ldlm_namespace *
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
index 6c615b6e9bdc..43bbc5fd94cc 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
@@ -688,6 +688,9 @@ struct ldlm_namespace *ldlm_namespace_new(struct obd_device *obd, char *name,
 	ns->ns_obd = obd;
 	ns->ns_appetite = apt;
 	ns->ns_client = client;
+	ns->ns_name = kstrdup(name, GFP_KERNEL);
+	if (!ns->ns_name)
+		goto out_hash;

 	INIT_LIST_HEAD(&ns->ns_list_chain);
 	INIT_LIST_HEAD(&ns->ns_unused_list);
@@ -730,6 +733,7 @@ struct ldlm_namespace *ldlm_namespace_new(struct obd_device *obd, char *name,
 	ldlm_namespace_sysfs_unregister(ns);
 	ldlm_namespace_cleanup(ns, 0);
 out_hash:
+	kfree(ns->ns_name);
 	cfs_hash_putref(ns->ns_rs_hash);
 out_ns:
 	kfree(ns);
@@ -993,6 +997,7 @@ void ldlm_namespace_free_post(struct ldlm_namespace *ns)
 	ldlm_namespace_debugfs_unregister(ns);
 	ldlm_namespace_sysfs_unregister(ns);
 	cfs_hash_putref(ns->ns_rs_hash);
+	kfree(ns->ns_name);

 	/* Namespace \a ns should be not on list at this time, otherwise
 	 * this will cause issues related to using freed \a ns in poold
 	 * thread.
Re: [kernel-team] Re: [PATCH RFC v5 4/6] trace/irqsoff: Split reset into seperate functions
On Mon, Apr 30, 2018 at 8:46 PM Randy Dunlap wrote: > On 04/30/2018 06:42 PM, Joel Fernandes wrote: > > Split reset functions into seperate functions in preparation > > of future patches that need to do tracer specific reset. > > > Hi, > Since you are updating patches anyway, please > s/seperate/separate/. Thanks Randy, will do. - Joel
Re: [PATCH RFC v5 4/6] trace/irqsoff: Split reset into seperate functions
On 04/30/2018 06:42 PM, Joel Fernandes wrote: > Split reset functions into seperate functions in preparation > of future patches that need to do tracer specific reset. > Hi, Since you are updating patches anyway, please s/seperate/separate/. thanks, -- ~Randy
Warning for driver i915 for 4.17.0-rcX
With kernel 4.17.0-rc3, I noted the following warning from driver i915. kernel: [ cut here ] kernel: Could not determine valid watermarks for inherited state kernel: WARNING: CPU: 3 PID: 224 at drivers/gpu/drm/i915/intel_display.c:14584 intel_modeset_init+0x3be/0x1060 [i915] kernel: Modules linked in: i915(+) xhci_pci i2c_algo_bit ehci_pci xhci_hcd serio_raw drm_kms_helper ehci_hcd syscopyarea sysfillrect sysimgblt kernel: CPU: 3 PID: 224 Comm: systemd-udevd Not tainted 4.17.0-rc0-08000-g2f39cfca0161-dirty #188 kernel: Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.50 09/29/2014 kernel: RIP: 0010:intel_modeset_init+0x3be/0x1060 [i915] kernel: RSP: 0018:c9000112fab8 EFLAGS: 00010296 kernel: RAX: 0038 RBX: 88021dc1 RCX: 0006 kernel: RDX: 0007 RSI: 0082 RDI: 88022f2d5bd0 kernel: RBP: 88021c8b3000 R08: 0289 R09: 0004 kernel: R10: c9000112f948 R11: 0001 R12: 88021ef3d800 kernel: R13: ffea R14: R15: 88021dc10358 kernel: FS: 7f830913c940() GS:88022f2c() knlGS: kernel: CS: 0010 DS: ES: CR0: 80050033 kernel: CR2: 7ffc91722f58 CR3: 00021e940003 CR4: 001606e0 kernel: Call Trace: kernel: i915_driver_load+0xa87/0xed0 [i915] kernel: local_pci_probe+0x42/0xa0 kernel: pci_device_probe+0x125/0x190 kernel: driver_probe_device+0x30b/0x480 kernel: __driver_attach+0xb8/0xe0 kernel: ? driver_probe_device+0x40/0x480 kernel: ? driver_probe_device+0x480/0x480 kernel: bus_for_each_dev+0x65/0x90 kernel: bus_add_driver+0x161/0x260 kernel: ? 0xa0149000 kernel: driver_register+0x57/0xc0 kernel: ? 0xa0149000 kernel: do_one_initcall+0x4e/0x18d kernel: ? kmem_cache_alloc_trace+0xfe/0x210 kernel: ? do_init_module+0x22/0x20a kernel: do_init_module+0x5b/0x20a kernel: load_module+0x18a1/0x1e20 kernel: ? 
SYSC_finit_module+0xb7/0xd0 kernel: SYSC_finit_module+0xb7/0xd0 kernel: do_syscall_64+0x6e/0x120 kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2 kernel: RIP: 0033:0x7f8307f67529 kernel: RSP: 002b:7ffc91737ab8 EFLAGS: 0246 ORIG_RAX: 0139 kernel: RAX: ffda RBX: 559b3cf6f350 RCX: 7f8307f67529 kernel: RDX: RSI: 7f83088cd83d RDI: 0014 kernel: RBP: 7f83088cd83d R08: R09: 559b3cf44f50 kernel: R10: 0014 R11: 0246 R12: 0002 kernel: R13: 559b3cf634a0 R14: R15: 03938700 kernel: Code: 00 f7 c6 00 00 18 00 0f 84 d7 0a 00 00 85 c9 0f 94 c1 41 88 8c 24 8d 02 00 00 e9 f6 08 00 00 48 c7 c7 b0 2f 41 a0 e8 f2 9b cd e0 kernel: ---[ end trace c0feea6402f4c999 ]--- This warning was bisected to commit a2936e3d9a9cb2ce192455cdec3a8cfccc26b486 (refs/bisect/bad) Author: Ville Syrjälä Date: Thu Nov 23 21:04:49 2017 +0200 drm/i915: Use drm_mode_get_hv_timing() to populate plane clip rectangle Cc: Laurent Pinchart Signed-off-by: Ville Syrjälä Reviewed-by: Daniel Vetter Reviewed-by: Thierry Reding The output of 'lspci -nn -vvv' for the graphics device is as follows: 00:02.0 VGA compatible controller [0300]: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller [8086:0416] (rev 06) (prog-if 00 [VGA controller]) Subsystem: Toshiba America Info Systems Device [1179:0002] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- Latency: 0 Interrupt: pin A routed to IRQ 28 Region 0: Memory at e000 (64-bit, non-prefetchable) [size=4M] Region 2: Memory at d000 (64-bit, prefetchable) [size=256M] Region 4: I/O ports at 4000 [size=64] [virtual] Expansion ROM at 000c [disabled] [size=128K] Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit- Address: fee00018 Data: Capabilities: [d0] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [a4] PCI Advanced
Features AFCap: TP+ FLR+ AFCtrl: FLR- AFStatus: TP- Kernel driver in use: i915 Kernel modules: i915 This warning does not seem to cause any problems. Thanks, Larry
Re: [PATCH v2] libata: blacklist Micron SSD
Sudip, > v1: Only M500IT MU01 was blacklisted. > > v2: Whitelist M500IT BG02 and M500DC and then blacklist all other Micron. I think my preference would be to blacklist M500IT with the MU01 firmware (which Micron said was affected) and rely on the "Micron*" fallthrough further down for the rest. I have not gotten firm confirmation on ZRAT behavior so for now we should probably just do: + { "Micron_M500IT_*", "MU01", ATA_HORKAGE_NO_NCQ_TRIM, }, -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 2/2] backlight: Remove ld9040 driver
> -Original Message- > From: Krzysztof Kozlowski> Sent: Monday, April 30, 2018 1:30 PM > To: Lee Jones ; Daniel Thompson > ; Jingoo Han ; > Bartlomiej Zolnierkiewicz ; linux- > ker...@vger.kernel.org; dri-de...@lists.freedesktop.org; linux- > fb...@vger.kernel.org > Cc: Krzysztof Kozlowski ; Marek Szyprowski > ; Inki Dae > Subject: [PATCH 2/2] backlight: Remove ld9040 driver > > The driver for LD9040 AMOLED LCD panel was superseded with DRM driver > panel-samsung-ld9040.c. It does not support DeviceTree and respective > possible user (Exynos4210 Universal C210) is DeviceTree-only and uses > DRM version of driver.. > > Suggested-by: Marek Szyprowski > Cc: Marek Szyprowski > Cc: Inki Dae > Signed-off-by: Krzysztof Kozlowski > --- > drivers/video/backlight/Kconfig| 8 - > drivers/video/backlight/Makefile | 1 - > drivers/video/backlight/ld9040.c | 811 - > > drivers/video/backlight/ld9040_gamma.h | 202 > 4 files changed, 1022 deletions(-) > delete mode 100644 drivers/video/backlight/ld9040.c > delete mode 100644 drivers/video/backlight/ld9040_gamma.h Acked-by: Jingoo Han Best regards, Jingoo Han [.]
Re: [PATCH 1/2] backlight: Remove s6e63m0 driver
On Monday, April 30, 2018 1:30 PM, Krzysztof Kozlowski wrote: > > The driver for S6E63M0 AMOLED LCD panel is not used. It does not > support DeviceTree and respective possible users (S5Pv210 Aquila and > Goni boards) are DeviceTree-only. > > Suggested-by: Marek Szyprowski > Cc: Marek Szyprowski > Cc: Inki Dae > Signed-off-by: Krzysztof Kozlowski > --- > drivers/video/backlight/Kconfig | 8 - > drivers/video/backlight/Makefile| 1 - > drivers/video/backlight/s6e63m0.c | 857 > Acked-by: Jingoo Han Best regards, Jingoo Han [.]
[PATCH 3/4 v2] rculist: add list_for_each_entry_from_rcu()
list_for_each_entry_from_rcu() is an RCU version of list_for_each_entry_from(). It walks a linked list under rcu protection, from a given start point. It is similar to list_for_each_entry_continue_rcu() but starts *at* the given position rather than *after* it. Naturally, the start point must be known to be in the list.

Also update the documentation for list_for_each_entry_continue_rcu() to match the documentation for the new list_for_each_entry_from_rcu(), and add list_for_each_entry_from_rcu() and the already existing hlist_for_each_entry_from_rcu() to section 7 of whatisRCU.txt.

Signed-off-by: NeilBrown
---
 Documentation/RCU/whatisRCU.txt |  2 ++
 include/linux/rculist.h         | 32 +++-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index a27fbfb0efb8..b7d38bd212d2 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -814,11 +814,13 @@ RCU list traversal:
 	list_next_rcu
 	list_for_each_entry_rcu
 	list_for_each_entry_continue_rcu
+	list_for_each_entry_from_rcu
 	hlist_first_rcu
 	hlist_next_rcu
 	hlist_pprev_rcu
 	hlist_for_each_entry_rcu
 	hlist_for_each_entry_rcu_bh
+	hlist_for_each_entry_from_rcu
 	hlist_for_each_entry_continue_rcu
 	hlist_for_each_entry_continue_rcu_bh
 	hlist_nulls_first_rcu
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 127f534fec94..4786c2235b98 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -396,13 +396,43 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
  * @member:	the name of the list_head within the struct.
  *
  * Continue to iterate over list of given type, continuing after
- * the current position.
+ * the current position which must have been in the list when the RCU read
+ * lock was taken.
+ * This would typically require either that you obtained the node from a
+ * previous walk of the list in the same RCU read-side critical section, or
+ * that you held some sort of non-RCU reference (such as a reference count)
+ * to keep the node alive *and* in the list.
+ *
+ * This iterator is similar to list_for_each_entry_from_rcu() except
+ * this starts after the given position and that one starts at the given
+ * position.
  */
 #define list_for_each_entry_continue_rcu(pos, head, member)		\
 	for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
 	     &pos->member != (head);					\
 	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))

+/**
+ * list_for_each_entry_from_rcu - iterate over a list from current point
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_node within the struct.
+ *
+ * Iterate over the tail of a list starting from a given position,
+ * which must have been in the list when the RCU read lock was taken.
+ * This would typically require either that you obtained the node from a
+ * previous walk of the list in the same RCU read-side critical section, or
+ * that you held some sort of non-RCU reference (such as a reference count)
+ * to keep the node alive *and* in the list.
+ *
+ * This iterator is similar to list_for_each_entry_continue_rcu() except
+ * this starts from the given position and that one starts from the position
+ * after the given position.
+ */
+#define list_for_each_entry_from_rcu(pos, head, member)			\
+	for (; &(pos)->member != (head);				\
+	     pos = list_entry_rcu(pos->member.next, typeof(*(pos)), member))
+
 /**
  * hlist_del_rcu - deletes entry from hash list without re-initialization
  * @n:	the element to delete from the hash list.
--
2.14.0.rc0.dirty

signature.asc
Description: PGP signature
Re: [PATCH 3/5] ib_srpt: depend on INFINIBAND_ADDR_TRANS
On Mon, Apr 30, 2018 at 4:35 PM Jason Gunthorpe wrote: > On Wed, Apr 25, 2018 at 03:33:39PM -0700, Greg Thelen wrote: > > INFINIBAND_SRPT code depends on INFINIBAND_ADDR_TRANS provided symbols. > > So declare the kconfig dependency. This is necessary to allow for > > enabling INFINIBAND without INFINIBAND_ADDR_TRANS. > > > > Signed-off-by: Greg Thelen > > Cc: Tarick Bedeir > > drivers/infiniband/ulp/srpt/Kconfig | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/infiniband/ulp/srpt/Kconfig b/drivers/infiniband/ulp/srpt/Kconfig > > index 31ee83d528d9..fb8b7182f05e 100644 > > +++ b/drivers/infiniband/ulp/srpt/Kconfig > > @@ -1,6 +1,6 @@ > > config INFINIBAND_SRPT > > tristate "InfiniBand SCSI RDMA Protocol target support" > > - depends on INFINIBAND && TARGET_CORE > > + depends on INFINIBAND && INFINIBAND_ADDR_TRANS && TARGET_CORE > Isn't INFINIBAND && INFINIBAND_ADDR_TRANS a bit redundant? Can't have > INFINIBAND_ADDR_TRANS without INFINIBAND. By kconfig INFINIBAND_ADDR_TRANS depends on INFINIBAND. So yes, it seems redundant. I don't know if anyone has designs to break this dependency and allow for ADDR_TRANS without INFINIBAND. Assuming not, I'd be willing to amend my series removing redundant INFINIBAND and a followup series to remove it from similar depends. Though I'm not familiar with rdma dev tree lifecycle. Is rdma/for-rc a throw away branch (akin to linux-next), or will it be merged into linus/master? If throwaway, then we can amend its patches, otherwise followups will be needed. Let me know what you'd prefer. Thanks. FYI from v4.17-rc3: drivers/staging/lustre/lnet/Kconfig: depends on LNET && PCI && INFINIBAND && INFINIBAND_ADDR_TRANS net/9p/Kconfig: depends on INET && INFINIBAND && INFINIBAND_ADDR_TRANS net/rds/Kconfig: depends on RDS && INFINIBAND && INFINIBAND_ADDR_TRANS net/sunrpc/Kconfig: depends on SUNRPC && INFINIBAND && INFINIBAND_ADDR_TRANS
[PATCH][next] pinctrl: actions: Fix Kconfig dependency and help text
1. Fix Kconfig dependency for Actions Semi S900 pinctrl driver which generates below warning in x86: WARNING: unmet direct dependencies detected for PINCTRL_OWL Depends on [n]: PINCTRL [=y] && (ARCH_ACTIONS || COMPILE_TEST [=n]) && OF [=n] Selected by [y]: - PINCTRL_S900 [=y] && PINCTRL [=y] 2. Add help text for OWL pinctrl driver Signed-off-by: Manivannan Sadhasivam --- drivers/pinctrl/actions/Kconfig | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/pinctrl/actions/Kconfig b/drivers/pinctrl/actions/Kconfig index 1c7309c90f0d..ede97cdbbc12 100644 --- a/drivers/pinctrl/actions/Kconfig +++ b/drivers/pinctrl/actions/Kconfig @@ -1,12 +1,14 @@ config PINCTRL_OWL - bool + bool "Actions Semi OWL pinctrl driver" depends on (ARCH_ACTIONS || COMPILE_TEST) && OF select PINMUX select PINCONF select GENERIC_PINCONF + help + Say Y here to enable Actions Semi OWL pinctrl driver config PINCTRL_S900 bool "Actions Semi S900 pinctrl driver" - select PINCTRL_OWL + depends on PINCTRL_OWL help Say Y here to enable Actions Semi S900 pinctrl driver -- 2.14.1
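The underlying Kconfig rule is that `select` forces a symbol on without evaluating that symbol's own `depends on` line, which is exactly how the unmet-dependency warning arises on x86. A minimal sketch of the resulting structure (simplified from the patch above):

```kconfig
config PINCTRL_OWL
	bool "Actions Semi OWL pinctrl driver"
	depends on (ARCH_ACTIONS || COMPILE_TEST) && OF

config PINCTRL_S900
	bool "Actions Semi S900 pinctrl driver"
	# "select PINCTRL_OWL" would force OWL on even where ARCH_ACTIONS
	# and OF are unset; "depends on" honors those constraints instead.
	depends on PINCTRL_OWL
```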
[GIT] XArray v12
I've made version 12 of the XArray and page cache conversion available at git://git.infradead.org/users/willy/linux-dax.git xarray-20180430 Changes since v11: - At Goldwyn's request, renamed xas_for_each_tag -> xas_for_each_tagged, xas_find_tag -> xas_find_tagged and xas_next_tag -> xas_next_tagged - Fix performance regression (relative to radix_tree_tag_clear) when using xas_clear_tag to clear an already-cleared tag. - Use __test_and_set_bit in node_set_tag() rather than testing node_get_tag() before calling node_set_tag(). - Added asm-generic/bitops/non-atomic.h to tools/include - Removed xas_create() from the exported API. All callers can use xas_load instead. It makes the callers more understandable and it reduces the size of the API. - Documented xas_create_range(). - Improved the documentation for xas_store(), explaining the return value for a multi-index xa_state. - Re-re-did the memfd patches on top of the current state of play. - Used xas_set_order() to zero out all entries for a THP page instead of a loop in page_cache_delete(). Goldwyn pointed out the loop was ugly, and then so did everybody at LSFMM. - Rewrote the nilfs patch to be closer to the original radix tree-based code since I have no way of verifying it and the maintainer isn't responding to requests to see if it works. - f2fs dropped its copy of __set_page_dirty_buffers, so dropped my modification of it. - Fixed a missing irq-disable in shmem_free_swap().
Matthew Wilcox (63): xarray: Replace exceptional entries xarray: Change definition of sibling entries xarray: Add definition of struct xarray xarray: Define struct xa_node xarray: Add documentation xarray: Add xa_load xarray: Add XArray tags xarray: Add xa_store xarray: Add xa_cmpxchg and xa_insert xarray: Add xa_for_each xarray: Add xa_extract xarray: Add xa_destroy xarray: Add xas_next and xas_prev xarray: Add xas_create_range xarray: Add MAINTAINERS entry page cache: Rearrange address_space page cache: Convert hole search to XArray page cache: Add and replace pages using the XArray page cache: Convert page deletion to XArray page cache: Convert page cache lookups to XArray page cache: Convert delete_batch to XArray page cache: Remove stray radix comment page cache: Convert filemap_range_has_page to XArray mm: Convert page-writeback to XArray mm: Convert workingset to XArray mm: Convert truncate to XArray mm: Convert add_to_swap_cache to XArray mm: Convert delete_from_swap_cache to XArray mm: Convert __do_page_cache_readahead to XArray mm: Convert page migration to XArray mm: Convert huge_memory to XArray mm: Convert collapse_shmem to XArray mm: Convert khugepaged_scan_shmem to XArray pagevec: Use xa_tag_t shmem: Convert replace to XArray shmem: Convert shmem_confirm_swap to XArray shmem: Convert find_swap_entry to XArray shmem: Convert shmem_add_to_page_cache to XArray shmem: Convert shmem_alloc_hugepage to XArray shmem: Convert shmem_free_swap to XArray shmem: Convert shmem_partial_swap_usage to XArray memfd: Convert memfd_wait_for_pins to XArray memfd: Convert memfd_tag_pins to XArray shmem: Comment fixups btrfs: Convert page cache to XArray fs: Convert buffer to XArray fs: Convert writeback to XArray nilfs2: Convert to XArray f2fs: Convert to XArray lustre: Convert to XArray dax: Fix use of zero page dax: dax_insert_mapping_entry always succeeds dax: Rename some functions dax: Hash on XArray instead of mapping dax: Convert dax_insert_pfn_mkwrite to XArray 
dax: Convert __dax_invalidate_entry to XArray dax: Convert dax writeback to XArray dax: Convert page fault handlers to XArray dax: Return fault code from dax_load_hole page cache: Finish XArray conversion radix tree: Remove unused functions radix tree: Remove radix_tree_update_node_t radix tree: Remove radix_tree_clear_tags .clang-format |1 - Documentation/core-api/index.rst|1 + Documentation/core-api/xarray.rst | 360 + MAINTAINERS | 12 + arch/powerpc/include/asm/book3s/64/pgtable.h|4 +- arch/powerpc/include/asm/nohash/64/pgtable.h|4 +- drivers/gpu/drm/i915/i915_gem.c | 17 +- drivers/staging/lustre/lustre/llite/glimpse.c | 12 +- drivers/staging/lustre/lustre/mdc/mdc_request.c | 16 +- fs/btrfs/compression.c |6 +- fs/btrfs/extent_io.c| 12 +- fs/buffer.c | 14 +- fs/dax.c| 725 -- fs/ext
Re: [PATCH RFC v5 5/6] tracepoint: Make rcuidle tracepoint callers use SRCU
On Mon, Apr 30, 2018 at 6:42 PM Joel Fernandes wrote: > In recent tests with IRQ on/off tracepoints, a large performance > overhead ~10% is noticed when running hackbench. This is root caused to > calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the > tracepoint code. Following a long discussion on the list [1] about this, > we concluded that srcu is a better alternative for use during rcu idle. > Although it does involve extra barriers, it's lighter than the sched-rcu > version which has to do additional RCU calls to notify RCU idle about > entry into RCU sections. > In this patch, we change the underlying implementation of the > trace_*_rcuidle API to use SRCU. This has shown to improve performance > a lot for the high frequency irq enable/disable tracepoints. > Test: Tested idle and preempt/irq tracepoints. > [1] https://patchwork.kernel.org/patch/10344297/ > Cc: Steven Rostedt > Cc: Peter Zilstra > Cc: Ingo Molnar > Cc: Mathieu Desnoyers > Cc: Tom Zanussi > Cc: Namhyung Kim > Cc: Thomas Glexiner > Cc: Boqun Feng > Cc: Paul McKenney > Cc: Frederic Weisbecker > Cc: Randy Dunlap > Cc: Masami Hiramatsu > Cc: Fenguang Wu > Cc: Baohong Liu > Cc: Vedang Patel > Cc: kernel-t...@android.com > Signed-off-by: Joel Fernandes > --- > include/linux/tracepoint.h | 46 +++--- > kernel/tracepoint.c| 10 - > 2 files changed, 47 insertions(+), 9 deletions(-) > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h > index c94f466d57ef..4135e08fb5f1 100644 > --- a/include/linux/tracepoint.h > +++ b/include/linux/tracepoint.h > @@ -15,6 +15,7 @@ >*/ > #include > +#include > #include > #include > #include > @@ -33,6 +34,8 @@ struct trace_eval_map { > #define TRACEPOINT_DEFAULT_PRIO10 > +extern struct srcu_struct tracepoint_srcu; > + > extern int > tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data); > extern int > @@ -77,6 +80,9 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb) >*/ > static inline void 
tracepoint_synchronize_unregister(void) > { > +#ifdef CONFIG_TRACEPOINTS > + synchronize_srcu(&tracepoint_srcu); > +#endif > synchronize_sched(); > } > @@ -129,18 +135,38 @@ extern void syscall_unregfunc(void); >* as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just >* "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto". >*/ > -#define __DO_TRACE(tp, proto, args, cond, rcucheck)\ > +#define __DO_TRACE(tp, proto, args, cond, rcuidle) \ > do {\ > struct tracepoint_func *it_func_ptr;\ > void *it_func; \ > void *__data; \ > + int __maybe_unused idx = 0; \ > \ > if (!(cond))\ > return; \ > - if (rcucheck) \ > - rcu_irq_enter_irqson(); \ > - rcu_read_lock_sched_notrace(); \ > - it_func_ptr = rcu_dereference_sched((tp)->funcs); \ > + \ > + /* \ > +* For rcuidle callers, use srcu since sched-rcu\ > +* doesn't work from the idle path. \ > +*/ \ > + if (rcuidle) { \ > + if (in_nmi()) { \ > + WARN_ON_ONCE(1);\ > + return; /* no srcu from nmi */ \ > + } \ > + \ > + /* To keep it consistent with !rcuidle path */ \ > + preempt_disable_notrace(); \ > + \ > + idx = srcu_read_lock_notrace(&tracepoint_srcu); \ > + it_func_ptr = srcu_dereference((tp)->funcs, \ > + &tracepoint_srcu); \ This last bit is supposed to be srcu_dereference_notrace. The hunk to use that is actually in patch 6/6, sorry about that. I've fixed it in my tree and it means patches
[PATCH] regulator: ltc3676: Assure PGOOD mask is set before changing voltage
Make sure the DVBxB bit 5, PGOOD mask, is set before changing voltage on the buck converters. If the PGOOD mask bit is not set, the PMIC may deassert the PGOOD signal during the voltage transition. On systems that use the PGOOD signal as a power OK indication for the board or SoC, which should be the case on correct designs, deasserting the PGOOD signal will lead to system reset or shutdown, which is not the expected behavior when changing PMIC buck converter voltage. Signed-off-by: Marek Vasut Cc: Javier Martinez Canillas Cc: Mark Brown --- NOTE: This was observed on an iMX6Q design during DVFS. When the cpufreq driver changed the frequency and scaled the voltage up, the system froze. This was because the PGOOD went down and the voltage rails lost power shortly after. --- drivers/regulator/ltc3676.c | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/regulator/ltc3676.c b/drivers/regulator/ltc3676.c index 662ee05ea44d..9dec1609ff66 100644 --- a/drivers/regulator/ltc3676.c +++ b/drivers/regulator/ltc3676.c @@ -52,6 +52,7 @@ #define LTC3676_CLIRQ 0x1F #define LTC3676_DVBxA_REF_SELECT BIT(5) +#define LTC3676_DVBxB_PGOOD_MASK BIT(5) #define LTC3676_IRQSTAT_PGOOD_TIMEOUT BIT(3) #define LTC3676_IRQSTAT_UNDERVOLT_WARN BIT(4) @@ -123,6 +124,23 @@ static int ltc3676_set_suspend_mode(struct regulator_dev *rdev, mask, val); } +static int ltc3676_set_voltage_sel(struct regulator_dev *rdev, unsigned selector) +{ + struct ltc3676 *ltc3676 = rdev_get_drvdata(rdev); + struct device *dev = ltc3676->dev; + int ret, dcdc = rdev_get_id(rdev); + + dev_dbg(dev, "%s id=%d selector=%d\n", __func__, dcdc, selector); + + ret = regmap_update_bits(ltc3676->regmap, rdev->desc->vsel_reg + 1, +LTC3676_DVBxB_PGOOD_MASK, +LTC3676_DVBxB_PGOOD_MASK); + if (ret) + return ret; + + return regulator_set_voltage_sel_regmap(rdev, selector); +} + static inline unsigned int ltc3676_scale(unsigned int uV, u32 r1, u32 r2) { uint64_t tmp; @@ -166,7 +184,7 @@ static const struct 
regulator_ops ltc3676_linear_regulator_ops = { .disable = regulator_disable_regmap, .is_enabled = regulator_is_enabled_regmap, .list_voltage = regulator_list_voltage_linear, - .set_voltage_sel = regulator_set_voltage_sel_regmap, + .set_voltage_sel = ltc3676_set_voltage_sel, .get_voltage_sel = regulator_get_voltage_sel_regmap, .set_suspend_voltage = ltc3676_set_suspend_voltage, .set_suspend_mode = ltc3676_set_suspend_mode, -- 2.16.2
Re: [PATCH] z3fold: fix reclaim lock-ups
Hi Vitaly, On 04/30/2018 03:58 AM, Vitaly Wool wrote: Do not try to optimize in-page object layout while the page is under reclaim. This fixes lock-ups on reclaim and improves reclaim performance at the same time. A heads-up: z3fold is still crashing (due to a NULL pointer access) under heavy memory pressure with this patch applied. That doesn't mean the patch should not be applied - the new crash is different - but there is more work to do. See https://bugs.chromium.org/p/chromium/issues/detail?id=822360#c21 for a crash log. This was seen with chromeos-4.14 with (I hope) all relevant z3fold patches applied. I am trying to reproduce the problem on top of mainline. Guenter Reported-by: Guenter Roeck Signed-off-by: Vitaly Wool --- mm/z3fold.c | 42 ++ 1 file changed, 30 insertions(+), 12 deletions(-) diff --git a/mm/z3fold.c b/mm/z3fold.c index c0bca6153b95..901c0b07cbda 100644 --- a/mm/z3fold.c +++ b/mm/z3fold.c @@ -144,7 +144,8 @@ enum z3fold_page_flags { PAGE_HEADLESS = 0, MIDDLE_CHUNK_MAPPED, NEEDS_COMPACTING, - PAGE_STALE + PAGE_STALE, + UNDER_RECLAIM }; /* @@ -173,6 +174,7 @@ static struct z3fold_header *init_z3fold_page(struct page *page, clear_bit(MIDDLE_CHUNK_MAPPED, &page->private); clear_bit(NEEDS_COMPACTING, &page->private); clear_bit(PAGE_STALE, &page->private); + clear_bit(UNDER_RECLAIM, &page->private); spin_lock_init(&zhdr->page_lock); kref_init(&zhdr->refcount); @@ -756,6 +758,10 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle) atomic64_dec(&pool->pages_nr); return; } + if (test_bit(UNDER_RECLAIM, &page->private)) { + z3fold_page_unlock(zhdr); + return; + } if (test_and_set_bit(NEEDS_COMPACTING, &page->private)) { z3fold_page_unlock(zhdr); return; @@ -840,6 +846,8 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries) kref_get(&zhdr->refcount); list_del_init(&zhdr->buddy); zhdr->cpu = -1; + set_bit(UNDER_RECLAIM, &page->private); + break; } list_del_init(&page->lru); @@ -887,25 +895,35 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries) goto 
next; } next: - spin_lock(&pool->lock); if (test_bit(PAGE_HEADLESS, &page->private)) { if (ret == 0) { - spin_unlock(&pool->lock); free_z3fold_page(page); return 0; } - } else if (kref_put(&zhdr->refcount, release_z3fold_page)) { - atomic64_dec(&pool->pages_nr); + spin_lock(&pool->lock); + list_add(&page->lru, &pool->lru); + spin_unlock(&pool->lock); + } else { + z3fold_page_lock(zhdr); + clear_bit(UNDER_RECLAIM, &page->private); + if (kref_put(&zhdr->refcount, + release_z3fold_page_locked)) { + atomic64_dec(&pool->pages_nr); + return 0; + } + /* +* if we are here, the page is still not completely +* free. Take the global pool lock then to be able +* to add it back to the lru list +*/ + spin_lock(&pool->lock); + list_add(&page->lru, &pool->lru); spin_unlock(&pool->lock); - return 0; + z3fold_page_unlock(zhdr); } - /* -* Add to the beginning of LRU. -* Pool lock has to be kept here to ensure the page has -* not already been released -*/ - list_add(&page->lru, &pool->lru); + /* We started off locked so we need to lock the pool back */ + spin_lock(&pool->lock); } spin_unlock(&pool->lock); return -EAGAIN;
[PATCH RFC v5 1/6] softirq: reorder trace_softirqs_on to prevent lockdep splat
I'm able to reproduce a lockdep splat when CONFIG_PROVE_LOCKING=y and CONFIG_PREEMPTIRQ_EVENTS=y. $ echo 1 > /d/tracing/events/preemptirq/preempt_enable/enable Cc: Steven Rostedt Cc: Peter Zilstra Cc: Ingo Molnar Cc: Mathieu Desnoyers Cc: Tom Zanussi Cc: Namhyung Kim Cc: Thomas Glexiner Cc: Boqun Feng Cc: Paul McKenney Cc: Frederic Weisbecker Cc: Randy Dunlap Cc: Masami Hiramatsu Cc: Fenguang Wu Cc: Baohong Liu Cc: Vedang Patel Cc: kernel-t...@android.com Signed-off-by: Joel Fernandes --- kernel/softirq.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/kernel/softirq.c b/kernel/softirq.c index 24d243ef8e71..47e2f61938c0 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -139,9 +139,13 @@ static void __local_bh_enable(unsigned int cnt) { lockdep_assert_irqs_disabled(); + if (preempt_count() == cnt) + trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip()); + if (softirq_count() == (cnt & SOFTIRQ_MASK)) trace_softirqs_on(_RET_IP_); - preempt_count_sub(cnt); + + __preempt_count_sub(cnt); } /* -- 2.17.0.441.gb46fe60e1d-goog
[PATCH RFC v5 2/6] srcu: Add notrace variants of srcu_read_{lock,unlock}
From: "Paul E. McKenney" This is needed for a future tracepoint patch that uses srcu, and to make sure it doesn't call into lockdep. tracepoint code already calls notrace variants for rcu_read_lock_sched so this patch does the same for srcu which will be used in a later patch. Keeps it consistent with rcu-sched. [Joel: Added commit message] Cc: Steven Rostedt Cc: Peter Zilstra Cc: Ingo Molnar Cc: Mathieu Desnoyers Cc: Tom Zanussi Cc: Namhyung Kim Cc: Thomas Glexiner Cc: Boqun Feng Cc: Paul McKenney Cc: Frederic Weisbecker Cc: Randy Dunlap Cc: Masami Hiramatsu Cc: Fenguang Wu Cc: Baohong Liu Cc: Vedang Patel Cc: kernel-t...@android.com Signed-off-by: Paul McKenney Signed-off-by: Joel Fernandes --- include/linux/srcu.h | 17 + 1 file changed, 17 insertions(+) diff --git a/include/linux/srcu.h b/include/linux/srcu.h index 33c1c698df09..2ec618979b20 100644 --- a/include/linux/srcu.h +++ b/include/linux/srcu.h @@ -161,6 +161,16 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp) return retval; } +/* Used by tracing, cannot be traced and cannot invoke lockdep. */ +static inline notrace int +srcu_read_lock_notrace(struct srcu_struct *sp) __acquires(sp) +{ + int retval; + + retval = __srcu_read_lock(sp); + return retval; +} + /** * srcu_read_unlock - unregister a old reader from an SRCU-protected structure. * @sp: srcu_struct in which to unregister the old reader. @@ -175,6 +185,13 @@ static inline void srcu_read_unlock(struct srcu_struct *sp, int idx) __srcu_read_unlock(sp, idx); } +/* Used by tracing, cannot be traced and cannot call lockdep. */ +static inline notrace void +srcu_read_unlock_notrace(struct srcu_struct *sp, int idx) __releases(sp) +{ + __srcu_read_unlock(sp, idx); +} + /** * smp_mb__after_srcu_read_unlock - ensure full ordering after srcu_read_unlock * -- 2.17.0.441.gb46fe60e1d-goog
[PATCH RFC v5 0/6] Centralize and unify usage of preempt/irq tracepoints
This is the next revision of preempt/irq tracepoint centralization and unified usage across the kernel [1]. The preempt/irq tracepoints exist, but not everything in the kernel uses them, so the different users cannot all work simultaneously (for example, only one of lockdep or the irqsoff events can be used at a time). This series is an attempt to solve that, and also results in a nice cleanup of the kernel in general. Several ifdefs are simpler, and the design is more unified and better. As a result, we also sped up all rcuidle tracepoints, since their handling is simpler. v5: - Fixed performance issues due to rcu-idle handling Joel Fernandes (5): softirq: reorder trace_softirqs_on to prevent lockdep splat srcu: Add notrace variant of srcu_dereference trace/irqsoff: Split reset into separate functions tracepoint: Make rcuidle tracepoint callers use SRCU tracing: Centralize preemptirq tracepoints and unify their usage Paul E. McKenney (1): srcu: Add notrace variants of srcu_read_{lock,unlock} include/linux/ftrace.h| 11 +- include/linux/irqflags.h | 11 +- include/linux/lockdep.h | 8 +- include/linux/preempt.h | 2 +- include/linux/srcu.h | 22 +++ include/linux/tracepoint.h| 47 +- include/trace/events/preemptirq.h | 23 +-- init/main.c | 5 +- kernel/locking/lockdep.c | 35 ++--- kernel/sched/core.c | 2 +- kernel/softirq.c | 6 +- kernel/trace/Kconfig | 22 ++- kernel/trace/Makefile | 2 +- kernel/trace/trace_irqsoff.c | 235 +- kernel/trace/trace_preemptirq.c | 71 + kernel/tracepoint.c | 10 +- 16 files changed, 283 insertions(+), 229 deletions(-) create mode 100644 kernel/trace/trace_preemptirq.c Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Mathieu Desnoyers Cc: Tom Zanussi Cc: Namhyung Kim Cc: Thomas Gleixner Cc: Boqun Feng Cc: Paul McKenney Cc: Frederic Weisbecker Cc: Randy Dunlap Cc: Masami Hiramatsu Cc: Fengguang Wu Cc: Baohong Liu Cc: Vedang Patel Cc: kernel-t...@android.com Signed-off-by: Joel Fernandes -- 2.17.0.441.gb46fe60e1d-goog
[PATCH RFC v5 3/6] srcu: Add notrace variant of srcu_dereference
In this series, we are making lockdep use an rcuidle tracepoint. For this reason we need a notrace variant of srcu_dereference; otherwise we get lockdep splats, because the lockdep hooks may not have run yet. This patch adds the needed variant. Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Mathieu Desnoyers Cc: Tom Zanussi Cc: Namhyung Kim Cc: Thomas Gleixner Cc: Boqun Feng Cc: Paul McKenney Cc: Frederic Weisbecker Cc: Randy Dunlap Cc: Masami Hiramatsu Cc: Fengguang Wu Cc: Baohong Liu Cc: Vedang Patel Cc: kernel-t...@android.com Signed-off-by: Joel Fernandes --- include/linux/srcu.h | 5 + 1 file changed, 5 insertions(+) diff --git a/include/linux/srcu.h b/include/linux/srcu.h index 2ec618979b20..a1c4947be877 100644 --- a/include/linux/srcu.h +++ b/include/linux/srcu.h @@ -135,6 +135,11 @@ static inline int srcu_read_lock_held(const struct srcu_struct *sp) */ #define srcu_dereference(p, sp) srcu_dereference_check((p), (sp), 0) +/** + * srcu_dereference_notrace - no tracing and no lockdep calls from here + */ +#define srcu_dereference_notrace(p, sp) srcu_dereference_check((p), (sp), 1) + /** * srcu_read_lock - register a new reader for an SRCU-protected structure. * @sp: srcu_struct in which to register the new reader. -- 2.17.0.441.gb46fe60e1d-goog
[PATCH RFC v5 5/6] tracepoint: Make rcuidle tracepoint callers use SRCU
In recent tests with IRQ on/off tracepoints, a large performance overhead of roughly 10% was noticed when running hackbench. This was root-caused to the calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the tracepoint code. Following a long discussion on the list [1] about this, we concluded that srcu is a better alternative for use during rcu idle. Although it does involve extra barriers, it's lighter than the sched-rcu version, which has to do additional RCU calls to notify RCU idle about entry into RCU sections. In this patch, we change the underlying implementation of the trace_*_rcuidle API to use SRCU. This has been shown to improve performance a lot for the high-frequency irq enable/disable tracepoints. Test: Tested idle and preempt/irq tracepoints. [1] https://patchwork.kernel.org/patch/10344297/ Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Mathieu Desnoyers Cc: Tom Zanussi Cc: Namhyung Kim Cc: Thomas Gleixner Cc: Boqun Feng Cc: Paul McKenney Cc: Frederic Weisbecker Cc: Randy Dunlap Cc: Masami Hiramatsu Cc: Fengguang Wu Cc: Baohong Liu Cc: Vedang Patel Cc: kernel-t...@android.com Signed-off-by: Joel Fernandes --- include/linux/tracepoint.h | 46 +++--- kernel/tracepoint.c| 10 - 2 files changed, 47 insertions(+), 9 deletions(-) diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index c94f466d57ef..4135e08fb5f1 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -15,6 +15,7 @@ */ #include +#include <linux/srcu.h> #include #include #include @@ -33,6 +34,8 @@ struct trace_eval_map { #define TRACEPOINT_DEFAULT_PRIO 10 +extern struct srcu_struct tracepoint_srcu; + extern int tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data); extern int @@ -77,6 +80,9 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb) */ static inline void tracepoint_synchronize_unregister(void) { +#ifdef CONFIG_TRACEPOINTS + synchronize_srcu(&tracepoint_srcu); +#endif synchronize_sched(); } @@ -129,18 +135,38 @@ extern void
syscall_unregfunc(void); * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto". */ -#define __DO_TRACE(tp, proto, args, cond, rcucheck)\ +#define __DO_TRACE(tp, proto, args, cond, rcuidle) \ do {\ struct tracepoint_func *it_func_ptr;\ void *it_func; \ void *__data; \ + int __maybe_unused idx = 0; \ \ if (!(cond))\ return; \ - if (rcucheck) \ - rcu_irq_enter_irqson(); \ - rcu_read_lock_sched_notrace(); \ - it_func_ptr = rcu_dereference_sched((tp)->funcs); \ + \ + /* \ +* For rcuidle callers, use srcu since sched-rcu\ +* doesn't work from the idle path. \ +*/ \ + if (rcuidle) { \ + if (in_nmi()) { \ + WARN_ON_ONCE(1);\ + return; /* no srcu from nmi */ \ + } \ + \ + /* To keep it consistent with !rcuidle path */ \ + preempt_disable_notrace(); \ + \ + idx = srcu_read_lock_notrace(&tracepoint_srcu); \ + it_func_ptr = srcu_dereference((tp)->funcs, \ + &tracepoint_srcu); \ + } else {\ + rcu_read_lock_sched_notrace(); \ + it_func_ptr = \ + rcu_dereference_sched((tp)->funcs); \ + } \ +
[PATCH RFC v5 4/6] trace/irqsoff: Split reset into separate functions
Split reset functions into separate functions in preparation for future patches that need to do tracer-specific reset. Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Mathieu Desnoyers Cc: Tom Zanussi Cc: Namhyung Kim Cc: Thomas Gleixner Cc: Boqun Feng Cc: Paul McKenney Cc: Frederic Weisbecker Cc: Randy Dunlap Cc: Masami Hiramatsu Cc: Fengguang Wu Cc: Baohong Liu Cc: Vedang Patel Cc: kernel-t...@android.com Signed-off-by: Joel Fernandes --- kernel/trace/trace_irqsoff.c | 22 +++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c index 03ecb4465ee4..f8daa754cce2 100644 --- a/kernel/trace/trace_irqsoff.c +++ b/kernel/trace/trace_irqsoff.c @@ -634,7 +634,7 @@ static int __irqsoff_tracer_init(struct trace_array *tr) return 0; } -static void irqsoff_tracer_reset(struct trace_array *tr) +static void __irqsoff_tracer_reset(struct trace_array *tr) { int lat_flag = save_flags & TRACE_ITER_LATENCY_FMT; int overwrite_flag = save_flags & TRACE_ITER_OVERWRITE; @@ -665,6 +665,12 @@ static int irqsoff_tracer_init(struct trace_array *tr) return __irqsoff_tracer_init(tr); } + +static void irqsoff_tracer_reset(struct trace_array *tr) +{ + __irqsoff_tracer_reset(tr); +} + static struct tracer irqsoff_tracer __read_mostly = { .name = "irqsoff", @@ -697,11 +703,16 @@ static int preemptoff_tracer_init(struct trace_array *tr) return __irqsoff_tracer_init(tr); } +static void preemptoff_tracer_reset(struct trace_array *tr) +{ + __irqsoff_tracer_reset(tr); +} + static struct tracer preemptoff_tracer __read_mostly = { .name = "preemptoff", .init = preemptoff_tracer_init, - .reset = irqsoff_tracer_reset, + .reset = preemptoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, .print_max = true, @@ -731,11 +742,16 @@ static int preemptirqsoff_tracer_init(struct trace_array *tr) return __irqsoff_tracer_init(tr); } +static void preemptirqsoff_tracer_reset(struct trace_array *tr) +{ +
__irqsoff_tracer_reset(tr); +} + static struct tracer preemptirqsoff_tracer __read_mostly = { .name = "preemptirqsoff", .init = preemptirqsoff_tracer_init, - .reset = irqsoff_tracer_reset, + .reset = preemptirqsoff_tracer_reset, .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, .print_max = true, -- 2.17.0.441.gb46fe60e1d-goog
[PATCH RFC v5 6/6] tracing: Centralize preemptirq tracepoints and unify their usage
This patch detaches the preemptirq tracepoints from the tracers and keeps them separate. Advantages: * Lockdep and the irqsoff events can now run in parallel, since they no longer have their own calls. * This unifies the use case of adding hooks to an irqsoff and irqson event, and a preemptoff and preempton event. Three users of the events exist: - Lockdep - irqsoff and preemptoff tracers - irqs and preempt trace events The unification cleans up several ifdefs and makes the code in the preempt tracer and irqsoff tracers simpler. It gets rid of all the horrific ifdeferry around PROVE_LOCKING and makes configuration of the different users of the tracepoints easier to understand. It also gets rid of the time_* function calls from the lockdep hooks used to call into the preemptirq tracer, which are not needed anymore. The negative delta in lines of code in this patch is quite large too. In the patch we introduce a new CONFIG option, PREEMPTIRQ_TRACEPOINTS, as a single point for registering probes onto the tracepoints. With this, the web of config options for preempt/irq toggle tracepoints and its users becomes:

       PREEMPT_TRACER   PREEMPTIRQ_EVENTS  IRQSOFF_TRACER  PROVE_LOCKING
            |                 |     \          |               |
             \   (selects)   /       \          \   (selects) /
              TRACE_PREEMPT_TOGGLE    ----->   TRACE_IRQFLAGS
                            \                  /
                             \  (depends on)  /
                          PREEMPTIRQ_TRACEPOINTS

One note: I have to check for lockdep recursion in the code that calls the trace events API and bail out if we're in lockdep recursion protection, to prevent something like the following case: a spin_lock is taken. Then lockdep_acquired is called. That does a raw_local_irq_save and then sets lockdep_recursion, and then calls __lockdep_acquired. In this function, a call to get_lock_stats happens, which calls preempt_disable, which calls trace IRQS off somewhere, which enters my tracepoint code and sets the tracing_irq_cpu flag to prevent recursion. This flag is then never cleared, causing lockdep paths to never be entered and thus causing splats and other bad things.
Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Mathieu Desnoyers Cc: Tom Zanussi Cc: Namhyung Kim Cc: Thomas Gleixner Cc: Boqun Feng Cc: Paul McKenney Cc: Frederic Weisbecker Cc: Randy Dunlap Cc: Masami Hiramatsu Cc: Fengguang Wu Cc: Baohong Liu Cc: Vedang Patel Cc: kernel-t...@android.com Signed-off-by: Joel Fernandes --- include/linux/ftrace.h| 11 +- include/linux/irqflags.h | 11 +- include/linux/lockdep.h | 8 +- include/linux/preempt.h | 2 +- include/linux/tracepoint.h| 3 +- include/trace/events/preemptirq.h | 23 ++-- init/main.c | 5 +- kernel/locking/lockdep.c | 35 ++--- kernel/sched/core.c | 2 +- kernel/trace/Kconfig | 22 ++- kernel/trace/Makefile | 2 +- kernel/trace/trace_irqsoff.c | 213 -- kernel/trace/trace_preemptirq.c | 71 ++ 13 files changed, 191 insertions(+), 217 deletions(-) create mode 100644 kernel/trace/trace_preemptirq.c diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 9c3c9a319e48..5191030af0c0 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -709,16 +709,7 @@ static inline unsigned long get_lock_parent_ip(void) return CALLER_ADDR2; } -#ifdef CONFIG_IRQSOFF_TRACER - extern void time_hardirqs_on(unsigned long a0, unsigned long a1); - extern void time_hardirqs_off(unsigned long a0, unsigned long a1); -#else - static inline void time_hardirqs_on(unsigned long a0, unsigned long a1) { } - static inline void time_hardirqs_off(unsigned long a0, unsigned long a1) { } -#endif - -#if defined(CONFIG_PREEMPT_TRACER) || \ - (defined(CONFIG_DEBUG_PREEMPT) && defined(CONFIG_PREEMPTIRQ_EVENTS)) +#ifdef CONFIG_TRACE_PREEMPT_TOGGLE extern void trace_preempt_on(unsigned long a0, unsigned long a1); extern void trace_preempt_off(unsigned long a0, unsigned long a1); #else diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h index 9700f00bbc04..50edb9cbbd26 100644 --- a/include/linux/irqflags.h +++ b/include/linux/irqflags.h @@ -15,9 +15,16 @@ #include #include -#ifdef CONFIG_TRACE_IRQFLAGS +/* Currently
trace_softirqs_on/off is used only by lockdep */ +#ifdef CONFIG_PROVE_LOCKING extern void trace_softirqs_on(unsigned long ip); extern void trace_softirqs_off(unsigned long ip); +#else +# define trace_softirqs_on(ip) do { } while (0) +# define trace_softirqs_off(ip) do { } while (0) +#endif + +#ifdef CONFIG_TRACE_IRQFLAGS extern void trace_hardirqs_on(void); extern void trace_hardirqs_off(void); # define trace_hardirq_context(p) ((p)->hardirq_context) @@ -43,8 +50,6 @@ do { \ #else # define trace_hardirqs_on() do { } while (0)
Re: [RFC v4 3/4] irqflags: Avoid unnecessary calls to trace_ if you can
On Wed, Apr 18, 2018 at 2:02 AM Masami Hiramatsu wrote: > On Mon, 16 Apr 2018 21:07:47 -0700 > Joel Fernandes wrote: > > With TRACE_IRQFLAGS, we call trace_ API too many times. We don't need > > to if local_irq_restore or local_irq_save didn't actually do anything. > > > > This gives around a 4% improvement in performance when doing the > > following command: "time find / > /dev/null" > > > > Also its best to avoid these calls where possible, since in this series, > > the RCU code in tracepoint.h seems to be call these quite a bit and I'd > > like to keep this overhead low. > Can we assume that the "flags" has only 1 bit irq-disable flag? > Since it skips calling raw_local_irq_restore(flags); too, > if there is any state in the flags on any arch, it may change the > result. In that case, we can do it as below (just skipping trace_hardirqs_*) > int disabled = irqs_disabled(); > if (!raw_irqs_disabled_flags(flags) && disabled) > trace_hardirqs_on(); > raw_local_irq_restore(flags); > if (raw_irqs_disabled_flags(flags) && !disabled) > trace_hardirqs_off(); With changes to the tracepoint implementation which uses srcu instead of rcu, I'm not able to see a performance improvement with the above patch. For this reason, I will drop this patch from next series. thanks, - Joel