date:20201120

[PATCH v2] vfio iommu type1: Improve vfio_iommu_type1_pin_pages performance

2020-11-20 Thread xuxiaoyang (C)

vfio_pin_pages() accepts an array of unrelated iova pfns and processes
each to return the physical pfn.  When dealing with large arrays of
contiguous iovas, vfio_iommu_type1_pin_pages is very inefficient because
it is processed page by page.  In this case, we can divide the iova pfn
array into multiple contiguous ranges and optimize them.  For example,
when the iova pfn array is {1,5,6,7,9}, it will be divided into three
groups {1}, {5,6,7}, {9} for processing.  When processing {5,6,7}, the
number of calls to pin_user_pages_remote is reduced from 3 times to once.
For single page or large array of discontinuous iovas, we still use
vfio_pin_page_external to deal with it to reduce the performance loss
caused by refactoring.

Signed-off-by: Xiaoyang Xu 
---
v1 -> v2:
 * make vfio_iommu_type1_pin_contiguous_pages use vfio_pin_page_external
 to pin single page when npage=1
 * make vfio_pin_contiguous_pages_external use bitmap_set() with npage to
 mark consecutive pages as dirty; simplify the unwind logic
 * remove unnecessary checks in vfio_get_contiguous_pages_length, put
 the least costly judgment logic at the top, and replace
 vfio_iova_get_vfio_pfn with vfio_find_vpfn

 drivers/vfio/vfio_iommu_type1.c | 231 
 1 file changed, 204 insertions(+), 27 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 67e827638995..080727b531c6 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -628,6 +628,196 @@ static int vfio_unpin_page_external(struct vfio_dma *dma, dma_addr_t iova,
 	return unlocked;
 }
 
+static int contiguous_vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
+				    int prot, long npage, unsigned long *phys_pfn)
+{
+	struct page **pages = NULL;
+	unsigned int flags = 0;
+	int i, ret;
+
+	pages = kvmalloc_array(npage, sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
+
+	if (prot & IOMMU_WRITE)
+		flags |= FOLL_WRITE;
+
+	mmap_read_lock(mm);
+	ret = pin_user_pages_remote(mm, vaddr, npage, flags | FOLL_LONGTERM,
+				    pages, NULL, NULL);
+	mmap_read_unlock(mm);
+
+	for (i = 0; i < ret; i++)
+		*(phys_pfn + i) = page_to_pfn(pages[i]);
+
+	kvfree(pages);
+
+	return ret;
+}
+
+static int vfio_pin_contiguous_pages_external(struct vfio_iommu *iommu,
+					      struct vfio_dma *dma,
+					      unsigned long *user_pfn,
+					      int npage, unsigned long *phys_pfn,
+					      bool do_accounting)
+{
+	int ret, i, j, lock_acct = 0;
+	unsigned long remote_vaddr;
+	dma_addr_t iova;
+	struct mm_struct *mm;
+	struct vfio_pfn *vpfn;
+
+	mm = get_task_mm(dma->task);
+	if (!mm)
+		return -ENODEV;
+
+	iova = user_pfn[0] << PAGE_SHIFT;
+	remote_vaddr = dma->vaddr + iova - dma->iova;
+	ret = contiguous_vaddr_get_pfn(mm, remote_vaddr, dma->prot,
+				       npage, phys_pfn);
+	mmput(mm);
+	if (ret <= 0)
+		return ret;
+
+	npage = ret;
+	for (i = 0; i < npage; i++) {
+		iova = user_pfn[i] << PAGE_SHIFT;
+		ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]);
+		if (ret)
+			goto unwind;
+
+		if (!is_invalid_reserved_pfn(phys_pfn[i]))
+			lock_acct++;
+	}
+
+	if (do_accounting) {
+		ret = vfio_lock_acct(dma, lock_acct, true);
+		if (ret) {
+			if (ret == -ENOMEM)
+				pr_warn("%s: Task %s (%d) RLIMIT_MEMLOCK (%ld) exceeded\n",
+					__func__, dma->task->comm, task_pid_nr(dma->task),
+					task_rlimit(dma->task, RLIMIT_MEMLOCK));
+			goto unwind;
+		}
+	}
+
+	if (iommu->dirty_page_tracking) {
+		unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
+
+		/*
+		 * Bitmap populated with the smallest supported page
+		 * size
+		 */
+		bitmap_set(dma->bitmap,
+			   ((user_pfn[0] << PAGE_SHIFT) - dma->iova) >> pgshift, npage);
+	}
+
+	return i;
+unwind:
+	for (j = 0; j < npage; j++) {
+		if (j < i) {
+			iova = user_pfn[j] << PAGE_SHIFT;
+			vpfn = vfio_find_vpfn(dma, iova);
+			vfio_iova_put_vfio_pfn(dma, vpfn);
+		} else {
+			put_pfn(phys_pfn[j], dma->prot);
+		}
+
+		phys_pfn[j] = 0;
+	}
+
+	return ret;
+}
+
+static int
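
The grouping described in the commit message ({1,5,6,7,9} split into {1}, {5,6,7}, {9}) can be sketched in plain userspace C. This is only an illustration under assumptions: contiguous_run_length() and split_into_runs() are hypothetical names, not the patch's vfio_get_contiguous_pages_length(), whose body is not shown in this archive, and the sketch ignores pinning state and accounting entirely.

```c
#include <stddef.h>

/*
 * Length of the run of consecutive pfns that starts at pfns[0].
 * Hypothetical helper for illustration only.
 */
static size_t contiguous_run_length(const unsigned long *pfns, size_t n)
{
	size_t len = 1;

	if (n == 0)
		return 0;
	while (len < n && pfns[len] == pfns[len - 1] + 1)
		len++;
	return len;
}

/*
 * Split pfns[0..n) into runs of consecutive values and store each run's
 * length in run_len[], which must have room for n entries.
 * Returns the number of runs found.
 */
static size_t split_into_runs(const unsigned long *pfns, size_t n,
			      size_t *run_len)
{
	size_t pos = 0, runs = 0;

	while (pos < n) {
		size_t len = contiguous_run_length(pfns + pos, n - pos);

		run_len[runs++] = len;
		pos += len;
	}
	return runs;
}
```

Each run longer than one page could then be handed to a single batched pin call, while runs of length one would stay on the existing single-page path, which is the trade-off the commit message describes.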

[PATCH] crypto: cavium - Use dma_set_mask_and_coherent to simplify code

2020-11-20 Thread Christophe JAILLET

'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

Signed-off-by: Christophe JAILLET 
---
 drivers/crypto/cavium/cpt/cptpf_main.c | 10 ++
 drivers/crypto/cavium/cpt/cptvf_main.c | 10 ++
 2 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/cavium/cpt/cptpf_main.c b/drivers/crypto/cavium/cpt/cptpf_main.c
index 781949027451..24d63bdc5dd2 100644
--- a/drivers/crypto/cavium/cpt/cptpf_main.c
+++ b/drivers/crypto/cavium/cpt/cptpf_main.c
@@ -569,15 +569,9 @@ static int cpt_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto cpt_err_disable_device;
 	}
 
-	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
 	if (err) {
-		dev_err(dev, "Unable to get usable DMA configuration\n");
-		goto cpt_err_release_regions;
-	}
-
-	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
-	if (err) {
-		dev_err(dev, "Unable to get 48-bit DMA for consistent allocations\n");
+		dev_err(dev, "Unable to get usable 48-bit DMA configuration\n");
 		goto cpt_err_release_regions;
 	}
 
diff --git a/drivers/crypto/cavium/cpt/cptvf_main.c b/drivers/crypto/cavium/cpt/cptvf_main.c
index a15245992cf9..f016448e43bb 100644
--- a/drivers/crypto/cavium/cpt/cptvf_main.c
+++ b/drivers/crypto/cavium/cpt/cptvf_main.c
@@ -687,15 +687,9 @@ static int cptvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 	/* Mark as VF driver */
 	cptvf->flags |= CPT_FLAG_VF_DRIVER;
-	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
 	if (err) {
-		dev_err(dev, "Unable to get usable DMA configuration\n");
-		goto cptvf_err_release_regions;
-	}
-
-	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
-	if (err) {
-		dev_err(dev, "Unable to get 48-bit DMA for consistent allocations\n");
+		dev_err(dev, "Unable to get usable 48-bit DMA configuration\n");
 		goto cptvf_err_release_regions;
 	}
 
-- 
2.27.0

RE: [PATCH] uio/uio_pci_generic: remove unneeded pci_set_drvdata()

2020-11-20 Thread Ardelean, Alexandru




> -----Original Message-----
> From: Greg KH 
> Sent: Friday, November 20, 2020 5:46 PM
> To: Ardelean, Alexandru 
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] uio/uio_pci_generic: remove unneeded pci_set_drvdata()
> 
> [External]
> 
> On Thu, Nov 19, 2020 at 04:59:06PM +0200, Alexandru Ardelean wrote:
> > The pci_get_drvdata() was moved during commit ef84928cff58
> > ("uio/uio_pci_generic: use device-managed function equivalents").
> >
> > I should have noticed that the pci_set_drvdata() requires a
> > pci_get_drvdata() for it to make sense.
> >
> > Signed-off-by: Alexandru Ardelean 
> > ---
> >
> > Apologies for not noticing this sooner.
> > If this can be squashed into commit ef84928cff58 , then it's also fine.
> > I've started seeing that there are actually more xxx_set_drvdata()
> > leftovers in the entire kernel, and I pinged the checkpatch crew to
> > add a check for this.
> >
> > https://urldefense.com/v3/__https://lore.kernel.org/lkml/CA*U=Dspy5*RE
> >
> 9agcLr6eY9DCMa1c5**b0jleugmmbrxz4yl...@mail.gmail.com/T/*u__;KysrK
> ysj!
> >
> !A3Ni8CS0y2Y!q3fJW4rKvEHQ7BDt1PaK4Cbexv4wbivUKBeDjo7ZwNXYwOLBawA
> Eq1Jaj
> > mhYxftX6DAJpg$
> 
> I can't squash existing public commits.  Can you resend this and add a
> "Fixes:" tag to it to show what commit it fixes so we can track this properly?
> 

Sure, will re-send in the next couple of days.

Thanks
Alex

> thanks,
> 
> greg k-h

[PATCH] crypto: marvell/octeontx - Use dma_set_mask_and_coherent to simplify code

2020-11-20 Thread Christophe JAILLET

'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

Signed-off-by: Christophe JAILLET 
---
 drivers/crypto/marvell/octeontx/otx_cptpf_main.c | 10 ++
 drivers/crypto/marvell/octeontx/otx_cptvf_main.c | 10 ++
 2 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/marvell/octeontx/otx_cptpf_main.c b/drivers/crypto/marvell/octeontx/otx_cptpf_main.c
index 34bb3063eb70..14a42559f81d 100644
--- a/drivers/crypto/marvell/octeontx/otx_cptpf_main.c
+++ b/drivers/crypto/marvell/octeontx/otx_cptpf_main.c
@@ -212,15 +212,9 @@ static int otx_cpt_probe(struct pci_dev *pdev,
 		goto err_disable_device;
 	}
 
-	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
 	if (err) {
-		dev_err(dev, "Unable to get usable DMA configuration\n");
-		goto err_release_regions;
-	}
-
-	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
-	if (err) {
-		dev_err(dev, "Unable to get 48-bit DMA for consistent allocations\n");
+		dev_err(dev, "Unable to get usable 48-bit DMA configuration\n");
 		goto err_release_regions;
 	}
 
diff --git a/drivers/crypto/marvell/octeontx/otx_cptvf_main.c b/drivers/crypto/marvell/octeontx/otx_cptvf_main.c
index 228fe8e47e0e..c076d0b3ad5f 100644
--- a/drivers/crypto/marvell/octeontx/otx_cptvf_main.c
+++ b/drivers/crypto/marvell/octeontx/otx_cptvf_main.c
@@ -804,15 +804,9 @@ static int otx_cptvf_probe(struct pci_dev *pdev,
 		dev_err(dev, "PCI request regions failed 0x%x\n", err);
 		goto disable_device;
 	}
-	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
 	if (err) {
-		dev_err(dev, "Unable to get usable DMA configuration\n");
-		goto release_regions;
-	}
-
-	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
-	if (err) {
-		dev_err(dev, "Unable to get 48-bit DMA for consistent allocations\n");
+		dev_err(dev, "Unable to get usable 48-bit DMA configuration\n");
 		goto release_regions;
 	}
 
-- 
2.27.0

[gustavoars-linux:testing/clang-ft/for-next] BUILD SUCCESS d2944854e3e118b837755abf4cbdb497662001b7

2020-11-20 Thread kernel test robot

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git  testing/clang-ft/for-next
branch HEAD: d2944854e3e118b837755abf4cbdb497662001b7  Input: libps2 - Fix fall-through warnings for Clang

elapsed time: 728m

configs tested: 125
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm         defconfig
arm64       allyesconfig
arm64       defconfig
arm         allyesconfig
arm         allmodconfig
powerpc     chrp32_defconfig
mips        ip27_defconfig
arm         s3c6400_defconfig
powerpc     klondike_defconfig
sh          se7780_defconfig
arm         mxs_defconfig
powerpc     mpc83xx_defconfig
mips        ip32_defconfig
arm         badge4_defconfig
powerpc     tqm8540_defconfig
sh          se7721_defconfig
i386        alldefconfig
sh          rsk7203_defconfig
arm         bcm2835_defconfig
sh          sdk7786_defconfig
m68k        stmark2_defconfig
sh          rsk7269_defconfig
arm         imx_v6_v7_defconfig
arm         pleb_defconfig
mips        maltaup_defconfig
ia64        tiger_defconfig
xtensa      audio_kc705_defconfig
powerpc     mpc836x_rdk_defconfig
mips        maltasmvp_eva_defconfig
mips        db1xxx_defconfig
m68k        sun3x_defconfig
arm         mv78xx0_defconfig
powerpc     mpc5200_defconfig
arm         corgi_defconfig
powerpc     powernv_defconfig
mips        pic32mzda_defconfig
riscv       allmodconfig
arm         hackkit_defconfig
mips        fuloong2e_defconfig
arm         simpad_defconfig
arm         trizeps4_defconfig
arc         nsimosci_defconfig
mips        loongson1b_defconfig
powerpc64   defconfig
arc         haps_hs_defconfig
arm         aspeed_g5_defconfig
arm         integrator_defconfig
arm         cns3420vb_defconfig
arm         am200epdkit_defconfig
powerpc     ppa8548_defconfig
alpha       defconfig
arm         colibri_pxa300_defconfig
arc         vdk_hs38_smp_defconfig
powerpc     canyonlands_defconfig
arm         omap1_defconfig
ia64        gensparse_defconfig
sh          sh7785lcr_32bit_defconfig
powerpc     mpc836x_mds_defconfig
ia64        allmodconfig
ia64        defconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        defconfig
m68k        allyesconfig
nios2       defconfig
arc         allyesconfig
nds32       allnoconfig
c6x         allyesconfig
nds32       defconfig
nios2       allyesconfig
csky        defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
arc         defconfig
sh          allmodconfig
parisc      defconfig
s390        allyesconfig
parisc      allyesconfig
s390        defconfig
i386        allyesconfig
sparc       allyesconfig
sparc       defconfig
i386        defconfig
mips        allyesconfig
mips        allmodconfig
powerpc     allyesconfig
powerpc     allmodconfig
powerpc     allnoconfig
x86_64      randconfig-a006-20201120
x86_64      randconfig-a003-20201120
x86_64      randconfig-a004-20201120
x86_64      randconfig-a005-20201120
x86_64      randconfig-a001-20201120
x86_64      randconfig-a002-20201120
i386        randconfig-a004-20201120
i386        randconfig-a003-20201120
i386        randconfig-a002-20201120
i386        randconfig-a005-20201120
i386 randconfig-a001

arch/powerpc/xmon/xmon.c:1379:12: error: 'find_free_data_bpt' defined but not used

2020-11-20 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   27bba9c532a8d21050b94224ffd310ad0058c353
commit: 30df74d67d48949da87e3a5b57c381763e8fd526 powerpc/watchpoint/xmon: Support 2nd DAWR
date:   6 months ago
config: powerpc-randconfig-m031-20201121 (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=30df74d67d48949da87e3a5b57c381763e8fd526
        git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
        git fetch --no-tags linus master
        git checkout 30df74d67d48949da87e3a5b57c381763e8fd526
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> arch/powerpc/xmon/xmon.c:1379:12: error: 'find_free_data_bpt' defined but not used [-Werror=unused-function]
    1379 | static int find_free_data_bpt(void)
         |            ^~
   arch/powerpc/xmon/xmon.c: In function 'xmon_print_symbol':
   arch/powerpc/xmon/xmon.c:3633:14: error: variable 'name' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
    3633 |  const char *name = NULL;
         |              ^~~~
   arch/powerpc/xmon/xmon.c: In function 'show_tasks':
   arch/powerpc/xmon/xmon.c:3310:22: error: variable 'tsk' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
    3310 |  struct task_struct *tsk = NULL;
         |                      ^~~
   arch/powerpc/xmon/xmon.c: In function 'xmon_core.constprop':
   arch/powerpc/xmon/xmon.c:487:6: error: variable 'cmd' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
     487 |  int cmd = 0;
         |      ^~~
   cc1: all warnings being treated as errors

vim +/find_free_data_bpt +1379 arch/powerpc/xmon/xmon.c

  1378  
> 1379  static int find_free_data_bpt(void)
  1380  {
  1381  int i;
  1382  
  1383  for (i = 0; i < nr_wp_slots(); i++) {
  1384  if (!dabr[i].enabled)
  1385  return i;
  1386  }
  1387  printf("Couldn't find free breakpoint register\n");
  1388  return -1;
  1389  }
  1390  
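
For readers unfamiliar with this class of failure, here is a minimal userspace sketch of why a static function without callers trips -Werror=unused-function, and one common way to silence it. Everything here is an assumption for illustration: NR_SLOTS stands in for nr_wp_slots(), struct bpt is a reduced stand-in for xmon's breakpoint structure, and the report does not say which fix (an attribute, an #ifdef around the function, or something else) was eventually applied upstream.

```c
#include <stdio.h>

#define NR_SLOTS 2	/* hypothetical stand-in for nr_wp_slots() */

struct bpt {
	int enabled;
};

static struct bpt dabr[NR_SLOTS];

/*
 * A static function with no caller in the translation unit triggers
 * -Wunused-function. Marking it with the unused attribute (the kernel
 * spells this __maybe_unused) tells the compiler that a configuration
 * without callers is expected, so no diagnostic is emitted.
 */
__attribute__((unused))
static int find_free_data_bpt(void)
{
	int i;

	for (i = 0; i < NR_SLOTS; i++) {
		if (!dabr[i].enabled)
			return i;
	}
	printf("Couldn't find free breakpoint register\n");
	return -1;
}
```

The alternative is to wrap the definition in the same preprocessor condition that guards its only callers, which keeps the function out of builds that do not need it at all.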

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [RFC PATCH] bpf: preload: Fix build error when O= is set

2020-11-20 Thread Andrii Nakryiko

On Thu, Nov 19, 2020 at 12:51 AM David Gow  wrote:
>
> If BPF_PRELOAD is enabled, and an out-of-tree build is requested with
> make O=, compilation seems to fail with:
>
> tools/scripts/Makefile.include:4: *** O=.kunit does not exist.  Stop.
> make[4]: *** [../kernel/bpf/preload/Makefile:8: kernel/bpf/preload/libbpf.a] Error 2
> make[3]: *** [../scripts/Makefile.build:500: kernel/bpf/preload] Error 2
> make[2]: *** [../scripts/Makefile.build:500: kernel/bpf] Error 2
> make[2]: *** Waiting for unfinished jobs
> make[1]: *** [.../Makefile:1799: kernel] Error 2
> make[1]: *** Waiting for unfinished jobs
> make: *** [Makefile:185: __sub-make] Error 2
>
> By the looks of things, this is because the (relative path) O= passed on
> the command line is being passed to the libbpf Makefile, which then
> can't find the directory. Given OUTPUT= is being passed anyway, we can
> work around this by explicitly setting an empty O=, which will be
> ignored in favour of OUTPUT= in tools/scripts/Makefile.include.

Strange, but I can't repro it. I use make O= all the
time with no issues. I just tried specifically with a make O=.build,
where .build is inside Linux repo, and it still worked fine. See also
be40920fbf10 ("tools: Let O= makes handle a relative path with -C
option") which was supposed to address such an issue. So I'm wondering
what exactly is causing this problem.

>
> Signed-off-by: David Gow 
> ---
>
> Hi all,
>
> I'm not 100% sure this is the correct fix here -- it seems to work for
> me, and makes some sense, but let me know if there's a better way.
>
> One other thing worth noting is that I've been hitting this with
> make allyesconfig on ARCH=um, but there's a comment in the Kconfig
> suggesting that, because BPF_PRELOAD depends on !COMPILE_TEST, that
> maybe it shouldn't be being built at all. I figured that it was worth
> trying to fix this anyway.
>
> Cheers,
> -- David
>
>
>  kernel/bpf/preload/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/preload/Makefile b/kernel/bpf/preload/Makefile
> index 23ee310b6eb4..39848d296097 100644
> --- a/kernel/bpf/preload/Makefile
> +++ b/kernel/bpf/preload/Makefile
> @@ -5,7 +5,7 @@ LIBBPF_A = $(obj)/libbpf.a
>  LIBBPF_OUT = $(abspath $(obj))
>
>  $(LIBBPF_A):
> -       $(Q)$(MAKE) -C $(LIBBPF_SRCS) OUTPUT=$(LIBBPF_OUT)/ $(LIBBPF_OUT)/libbpf.a
> +       $(Q)$(MAKE) -C $(LIBBPF_SRCS) O= OUTPUT=$(LIBBPF_OUT)/ $(LIBBPF_OUT)/libbpf.a
>
>  userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi \
>         -I $(srctree)/tools/lib/ -Wno-unused-result
> --
> 2.29.2.454.gaff20da3a2-goog
>

Re: [PATCH v7 00/17] Add support for Clang LTO

2020-11-20 Thread Ard Biesheuvel

On Sat, 21 Nov 2020 at 00:53, Nick Desaulniers  wrote:
>
> On Fri, Nov 20, 2020 at 3:30 PM Ard Biesheuvel  wrote:
> >
> > On Fri, 20 Nov 2020 at 21:19, Nick Desaulniers  wrote:
> > >
> > > On Fri, Nov 20, 2020 at 2:30 AM Ard Biesheuvel  wrote:
> > > >
> > > > On Thu, 19 Nov 2020 at 00:42, Nick Desaulniers  wrote:
> > > > >
> > > > > Thanks for continuing to drive this series Sami.  For the series,
> > > > >
> > > > > Tested-by: Nick Desaulniers 
> > > > >
> > > > > I did virtualized boot tests with the series applied to aarch64
> > > > > defconfig without CONFIG_LTO, with CONFIG_LTO_CLANG, and a third time
> > > > > with CONFIG_THINLTO.  If you make changes to the series in follow ups,
> > > > > please drop my tested by tag from the modified patches and I'll help
> > > > > re-test.  Some minor feedback on the Kconfig change, but I'll post it
> > > > > off of that patch.
> > > > >
> > > >
> > > > When you say 'virtualized' do you mean QEMU on x86? Or actual
> > > > virtualization on an AArch64 KVM host?
> > >
> > > aarch64 guest on x86_64 host.  If you have additional configurations
> > > that are important to you, additional testing help would be
> > > appreciated.
> > >
> >
> > Could you run this on an actual phone? Or does Android already ship
> > with this stuff?
>
> By `this`, if you mean "the LTO series", it has been shipping on
> Android phones for years now, I think it's even required in the latest
> release.
>
> If you mean "the LTO series + mainline" on a phone, well there's the
> android-mainline of https://android.googlesource.com/kernel/common/,
> in which this series was recently removed in order to facilitate
> rebasing Android's patches on ToT-mainline until getting the series
> landed upstream.  Bit of a chicken and the egg problem there.
>
> If you mean "the LTO series + mainline + KVM" on a phone; I don't know
> the precise state of aarch64 KVM and Android (Will or Marc would
> know).  We did experiment recently with RockPI's for aarch64 KVM, IIRC;
> I think Android is tricky as it still requires A64+A32/T32 chipsets,
> Alistair would know more.  Might be interesting to boot a virtualized
> (or paravirtualized?) guest built with LTO in a host built with LTO
> for sure, but I don't know if we have tried that yet (I think we did
> try LTO guests of android kernels, but I think they were on the stock
> RockPI host BSP image IIRC).
>

I don't think testing under KVM gives us more confidence or coverage
than testing on bare metal. I was just pointing out that 'virtualized'
is misleading, and if you test things under QEMU/x86 + TCG, it is
better to be clear about this, and refer to it as 'under emulation'.

WARNING: filesystem loop2 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-11-20 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:    09162bc3 Linux 5.10-rc4
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16e9a48650
kernel config:  https://syzkaller.appspot.com/x/.config?x=e93bbe4ce29223b
dashboard link: https://syzkaller.appspot.com/bug?extid=ae3ff0bb2a0133596a5b
compiler:       clang version 11.0.0 (https://github.com/llvm/llvm-project.git ca2dcbd030eadbf0aa9b660efe864ff08af6e18b)

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ae3ff0bb2a0133596...@syzkaller.appspotmail.com

BFS-fs: bfs_fill_super(): WARNING: filesystem loop2 was created with 512 inodes, the real maximum is 511, mounting anyway
BFS-fs: bfs_fill_super(): Last block not available on loop2: 1507328
BFS-fs: bfs_fill_super(): WARNING: filesystem loop2 was created with 512 inodes, the real maximum is 511, mounting anyway
BFS-fs: bfs_fill_super(): Last block not available on loop2: 1507328


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

WARNING: filesystem loop4 was created with 512 inodes, the real maximum is 511, mounting anyway

2020-11-20 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:    09162bc3 Linux 5.10-rc4
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=103f4fbe50
kernel config:  https://syzkaller.appspot.com/x/.config?x=75292221eb79ace2
dashboard link: https://syzkaller.appspot.com/bug?extid=1a219abc12077a390bc9
compiler:       gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1a219abc12077a390...@syzkaller.appspotmail.com

BFS-fs: bfs_fill_super(): WARNING: filesystem loop4 was created with 512 inodes, the real maximum is 511, mounting anyway
BFS-fs: bfs_fill_super(): Last block not available on loop4: 1507328
BFS-fs: bfs_fill_super(): WARNING: filesystem loop4 was created with 512 inodes, the real maximum is 511, mounting anyway
BFS-fs: bfs_fill_super(): Last block not available on loop4: 1507328


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

[PATCH] crypto: cavium/zip - Use dma_set_mask_and_coherent to simplify code

2020-11-20 Thread Christophe JAILLET

'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

Signed-off-by: Christophe JAILLET 
---
 drivers/crypto/cavium/zip/zip_main.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/cavium/zip/zip_main.c b/drivers/crypto/cavium/zip/zip_main.c
index d35216e2f6cd..812b4ac9afd6 100644
--- a/drivers/crypto/cavium/zip/zip_main.c
+++ b/drivers/crypto/cavium/zip/zip_main.c
@@ -263,15 +263,9 @@ static int zip_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_disable_device;
 	}
 
-	err = pci_set_dma_mask(pdev, DMA_BIT_MASK(48));
+	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
 	if (err) {
-		dev_err(dev, "Unable to get usable DMA configuration\n");
-		goto err_release_regions;
-	}
-
-	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(48));
-	if (err) {
-		dev_err(dev, "Unable to get 48-bit DMA for allocations\n");
+		dev_err(dev, "Unable to get usable 48-bit DMA configuration\n");
 		goto err_release_regions;
 	}
 
-- 
2.27.0

[PATCH] nfs: Only include nfs42.h when NFS_V4_2 enable

2020-11-20 Thread Wang Qing

Remove an unnecessary duplicate header include.
Only include nfs42.h when NFS_V4_2 is enabled.

Signed-off-by: Wang Qing 
---
 fs/nfs/nfs4proc.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9e0ca9b..a1321a5 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -67,7 +67,6 @@
 #include "nfs4idmap.h"
 #include "nfs4session.h"
 #include "fscache.h"
-#include "nfs42.h"
 
 #include "nfs4trace.h"
 
-- 
2.7.4

Re: [PATCH] binder: add flag to clear buffer on txn complete

2020-11-20 Thread Greg KH

On Fri, Nov 20, 2020 at 03:37:43PM -0800, Todd Kjos wrote:
> Add a per-transaction flag to indicate that the buffer
> must be cleared when the transaction is complete to
> prevent copies of sensitive data from being preserved
> in memory.
> 
> Signed-off-by: Todd Kjos 
> ---

Does this need to be backported to stable kernels as well?

thanks,

greg k-h

[PATCH] crypto: qat - Use dma_set_mask_and_coherent to simplify code

2020-11-20 Thread Christophe JAILLET

'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

While at it, also remove some useless extra () in the 32-bit case.

Signed-off-by: Christophe JAILLET 
---
Instead of returning -EFAULT, we could also propagate the error returned
by dma_set_mask_and_coherent()
---
 drivers/crypto/qat/qat_c3xxx/adf_drv.c  | 9 ++---
 drivers/crypto/qat/qat_c3xxxvf/adf_drv.c| 9 ++---
 drivers/crypto/qat/qat_c62x/adf_drv.c   | 9 ++---
 drivers/crypto/qat/qat_c62xvf/adf_drv.c | 9 ++---
 drivers/crypto/qat/qat_dh895xcc/adf_drv.c   | 9 ++---
 drivers/crypto/qat/qat_dh895xccvf/adf_drv.c | 9 ++---
 6 files changed, 12 insertions(+), 42 deletions(-)

diff --git a/drivers/crypto/qat/qat_c3xxx/adf_drv.c b/drivers/crypto/qat/qat_c3xxx/adf_drv.c
index 7fb3343ae8b0..b39e06820295 100644
--- a/drivers/crypto/qat/qat_c3xxx/adf_drv.c
+++ b/drivers/crypto/qat/qat_c3xxx/adf_drv.c
@@ -159,17 +159,12 @@ static int adf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 
 	/* set dma identifier */
-	if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
-		if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
+	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
+		if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
 			dev_err(&pdev->dev, "No usable DMA configuration\n");
 			ret = -EFAULT;
 			goto out_err_disable;
-		} else {
-			pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
 		}
-
-	} else {
-		pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
 	}
 
 	if (pci_request_regions(pdev, ADF_C3XXX_DEVICE_NAME)) {
diff --git a/drivers/crypto/qat/qat_c3xxxvf/adf_drv.c b/drivers/crypto/qat/qat_c3xxxvf/adf_drv.c
index 1d1532e8fb6d..b1d1d12694dc 100644
--- a/drivers/crypto/qat/qat_c3xxxvf/adf_drv.c
+++ b/drivers/crypto/qat/qat_c3xxxvf/adf_drv.c
@@ -141,17 +141,12 @@ static int adf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 
 	/* set dma identifier */
-	if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
-		if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
+	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
+		if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
 			dev_err(&pdev->dev, "No usable DMA configuration\n");
 			ret = -EFAULT;
 			goto out_err_disable;
-		} else {
-			pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
 		}
-
-	} else {
-		pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
 	}
 
 	if (pci_request_regions(pdev, ADF_C3XXXVF_DEVICE_NAME)) {
diff --git a/drivers/crypto/qat/qat_c62x/adf_drv.c b/drivers/crypto/qat/qat_c62x/adf_drv.c
index 1f5de442e1e6..99f6f3c7c6b0 100644
--- a/drivers/crypto/qat/qat_c62x/adf_drv.c
+++ b/drivers/crypto/qat/qat_c62x/adf_drv.c
@@ -159,17 +159,12 @@ static int adf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 
 	/* set dma identifier */
-	if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
-		if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
+	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
+		if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
 			dev_err(&pdev->dev, "No usable DMA configuration\n");
 			ret = -EFAULT;
 			goto out_err_disable;
-		} else {
-			pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
 		}
-
-	} else {
-		pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
 	}
 
 	if (pci_request_regions(pdev, ADF_C62X_DEVICE_NAME)) {
diff --git a/drivers/crypto/qat/qat_c62xvf/adf_drv.c b/drivers/crypto/qat/qat_c62xvf/adf_drv.c
index 04742a6d91ca..26c0b7d08636 100644
--- a/drivers/crypto/qat/qat_c62xvf/adf_drv.c
+++ b/drivers/crypto/qat/qat_c62xvf/adf_drv.c
@@ -141,17 +141,12 @@ static int adf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 
 	/* set dma identifier */
-	if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
-		if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
+	if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))) {
+		if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
 			dev_err(&pdev->dev, "No usable DMA configuration\n");
 			ret = -EFAULT;
 			goto out_err_disable;
-		} else {
-			pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
 		}
-
-	} else {
-		pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
 	}

Re: [PATCH v24 08/12] landlock: Add syscall implementations

2020-11-20 Thread Jann Horn

On Thu, Nov 12, 2020 at 9:52 PM Mickaël Salaün  wrote:
> These 3 system calls are designed to be used by unprivileged processes
> to sandbox themselves:
> * landlock_create_ruleset(2): Creates a ruleset and returns its file
>   descriptor.
> * landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a
>   ruleset, identified by the dedicated file descriptor.
> * landlock_enforce_ruleset_current(2): Enforces a ruleset on the current
>   thread and its future children (similar to seccomp).  This syscall has
>   the same usage restrictions as seccomp(2): the caller must have the
>   no_new_privs attribute set or have CAP_SYS_ADMIN in the current user
>   namespace.
>
> All these syscalls have a "flags" argument (not currently used) to
> enable extensibility.
>
> Here are the motivations for these new syscalls:
> * A sandboxed process may not have access to file systems, including
>   /dev, /sys or /proc, but it should still be able to add more
>   restrictions to itself.
> * Neither prctl(2) nor seccomp(2) (which was used in a previous version)
>   fit well with the current definition of a Landlock security policy.
>
> All passed structs (attributes) are checked at build time to ensure that
> they don't contain holes and that they are aligned the same way for each
> architecture.
>
> See the user and kernel documentation for more details (provided by a
> following commit):
> * Documentation/userspace-api/landlock.rst
> * Documentation/security/landlock.rst
>
> Cc: Arnd Bergmann 
> Cc: James Morris 
> Cc: Jann Horn 
> Cc: Kees Cook 
> Cc: Serge E. Hallyn 
> Signed-off-by: Mickaël Salaün 

Reviewed-by: Jann Horn

Re: [PATCH v24 02/12] landlock: Add ruleset and domain management

2020-11-20 Thread Jann Horn

On Thu, Nov 12, 2020 at 9:51 PM Mickaël Salaün  wrote:
> A Landlock ruleset is mainly a red-black tree with Landlock rules as
> nodes.  This enables quick update and lookup to match a requested
> access, e.g. to a file.  A ruleset is usable through a dedicated file
> descriptor (cf. following commit implementing syscalls) which enables a
> process to create and populate a ruleset with new rules.
>
> A domain is a ruleset tied to a set of processes.  This group of rules
> defines the security policy enforced on these processes and their future
> children.  A domain can transition to a new domain which is the
> intersection of all its constraints and those of a ruleset provided by
> the current process.  This modification only impact the current process.
> This means that a process can only gain more constraints (i.e. lose
> accesses) over time.
>
> Cc: James Morris 
> Cc: Jann Horn 
> Cc: Kees Cook 
> Cc: Serge E. Hallyn 
> Signed-off-by: Mickaël Salaün 
> ---
>
> Changes since v23:
> * Always intersect access rights.  Following the filesystem change
>   logic, make ruleset updates more consistent by always intersecting
>   access rights (boolean AND) instead of combining them (boolean OR) for
>   the same layer.

This seems wrong to me. If some software e.g. builds a policy that
allows it to execute specific libraries and to open input files
specified on the command line, and the user then specifies a library
as an input file, this change will make that fail unless the software
explicitly deduplicates the rules.
Userspace will be forced to add extra complexity to work around this.

>   This defensive approach could also help avoid user
>   space to inadvertently allow multiple access rights for the same
>   object (e.g.  write and execute access on a path hierarchy) instead of
>   dealing with such inconsistency.  This can happen when there is no
>   deduplication of objects (e.g. paths and underlying inodes) whereas
>   they get different access rights with landlock_add_rule(2).

I don't see why that's an issue. If userspace wants to be able to
access the same object in different ways for different purposes, it
should be able to do that, no?

I liked the semantics from the previous version.

Re: [PATCH v24 07/12] landlock: Support filesystem access-control

2020-11-20 Thread Jann Horn

On Thu, Nov 12, 2020 at 9:52 PM Mickaël Salaün  wrote:
> Thanks to the Landlock objects and ruleset, it is possible to identify
> inodes according to a process's domain.  To enable an unprivileged
> process to express a file hierarchy, it first needs to open a directory
> (or a file) and pass this file descriptor to the kernel through
> landlock_add_rule(2).  When checking if a file access request is
> allowed, we walk from the requested dentry to the real root, following
> the different mount layers.  The access to each "tagged" inodes are
> collected according to their rule layer level, and ANDed to create
> access to the requested file hierarchy.  This makes possible to identify
> a lot of files without tagging every inodes nor modifying the
> filesystem, while still following the view and understanding the user
> has from the filesystem.
>
> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not
> keep the same struct inodes for the same inodes whereas these inodes are
> in use.
>
> This commit adds a minimal set of supported filesystem access-control
> which doesn't enable to restrict all file-related actions.  This is the
> result of multiple discussions to minimize the code of Landlock to ease
> review.  Thanks to the Landlock design, extending this access-control
> without breaking user space will not be a problem.  Moreover, seccomp
> filters can be used to restrict the use of syscall families which may
> not be currently handled by Landlock.
>
> Cc: Al Viro 
> Cc: Anton Ivanov 
> Cc: James Morris 
> Cc: Jann Horn 
> Cc: Jeff Dike 
> Cc: Kees Cook 
> Cc: Richard Weinberger 
> Cc: Serge E. Hallyn 
> Signed-off-by: Mickaël Salaün 
> ---
>
> Changes since v23:
> * Enforce deterministic interleaved path rules.  To have consistent
>   layered rules, granting access to a path implies that all accesses
>   tied to inodes, from the requested file to the real root, must be
>   checked.  Otherwise, stacked rules may result to overzealous
>   restrictions.  By excluding the ability to add exceptions in the same
>   layer (e.g. /a allowed, /a/b denied, and /a/b/c allowed), we get
>   deterministic interleaved path rules.  This removes an optimization

I don't understand the "deterministic interleaved path rules" part.


What if I have a policy like this?

/home/user READ
/home/user/Downloads READ+WRITE

That's a reasonable policy, right?

If I then try to open /home/user/Downloads/foo in WRITE mode, the loop
will first check against the READ+WRITE rule for /home/user/Downloads,
that check will pass, and then it will check against the READ rule for
/home/user, which will deny the access, right? That seems bad.


The v22 code ensured that for each layer, the most specific rule (the
first we encounter on the walk) always wins, right? What's the problem
with that?

>   which could be replaced by a proper cache mechanism.  This also
>   further simplifies and explain check_access_path_continue().

From the interdiff between v23 and v24 (git range-diff
99ade5d59b23~1..99ade5d59b23 faa8c09be9fd~1..faa8c09be9fd):

@@ security/landlock/fs.c (new)
 +  rcu_dereference(landlock_inode(inode)->object));
 +  rcu_read_unlock();
 +
-+  /* Checks for matching layers. */
-+  if (rule && (rule->layers | *layer_mask)) {
-+  if ((rule->access & access_request) == access_request) {
-+  *layer_mask &= ~rule->layers;
-+  return true;
-+  } else {
-+  return false;
-+  }
++  if (!rule)
++  /* Continues to walk if there is no rule for this inode. */
++  return true;
++  /*
++   * We must check all layers for each inode because we may encounter
++   * multiple different accesses from the same layer in a walk.  Each
++   * layer must at least allow the access request one time (i.e. with one
++   * inode).  This enables to have a deterministic behavior whatever
++   * inode is tagged within interleaved layers.
++   */
++  if ((rule->access & access_request) == access_request) {
++  /* Validates layers for which all accesses are allowed. */
++  *layer_mask &= ~rule->layers;
++  /* Continues to walk until all layers are validated. */
++  return true;
 +  }
-+  return true;
++  /* Stops if a rule in the path don't allow all requested access. */
++  return false;
 +}
 +
 +static int check_access_path(const struct landlock_ruleset *const domain,
@@ security/landlock/fs.c (new)
 +  _mask)) {
 +  struct dentry *parent_dentry;
 +
-+  /* Stops when a rule from each layer granted access. */
-+  if (layer_mask == 0) {
-+  allowed = true;
-+  break;
-+  }
-+

This change also made it so that disconnected paths aren't accessible
unless they're internal, right?

Re: [PATCH v24 12/12] landlock: Add user and kernel documentation

2020-11-20 Thread Jann Horn

On Thu, Nov 12, 2020 at 9:52 PM Mickaël Salaün  wrote:
> This documentation can be built with the Sphinx framework.
>
> Cc: James Morris 
> Cc: Jann Horn 
> Cc: Kees Cook 
> Cc: Serge E. Hallyn 
> Signed-off-by: Mickaël Salaün 
> Reviewed-by: Vincent Dagonneau 

Reviewed-by: Jann Horn

Re: [PATCH v24 01/12] landlock: Add object management

2020-11-20 Thread Jann Horn

On Thu, Nov 12, 2020 at 9:51 PM Mickaël Salaün  wrote:
> A Landlock object enables to identify a kernel object (e.g. an inode).
> A Landlock rule is a set of access rights allowed on an object.  Rules
> are grouped in rulesets that may be tied to a set of processes (i.e.
> subjects) to enforce a scoped access-control (i.e. a domain).
>
> Because Landlock's goal is to empower any process (especially
> unprivileged ones) to sandbox themselves, we cannot rely on a
> system-wide object identification such as file extended attributes.
> Indeed, we need innocuous, composable and modular access-controls.
>
> The main challenge with these constraints is to identify kernel objects
> while this identification is useful (i.e. when a security policy makes
> use of this object).  But this identification data should be freed once
> no policy is using it.  This ephemeral tagging should not and may not be
> written in the filesystem.  We then need to manage the lifetime of a
> rule according to the lifetime of its objects.  To avoid a global lock,
> this implementation make use of RCU and counters to safely reference
> objects.
>
> A following commit uses this generic object management for inodes.
>
> Cc: James Morris 
> Cc: Kees Cook 
> Cc: Serge E. Hallyn 
> Signed-off-by: Mickaël Salaün 
> Reviewed-by: Jann Horn 

Still looks good, except for one comment:

[...]
> +   /**
> +* @lock: Guards against concurrent modifications.  This lock might be
> +* held from the time @usage drops to zero until any weak references
> +* from @underobj to this object have been cleaned up.
> +*
> +* Lock ordering: inode->i_lock nests inside this.
> +*/
> +   spinlock_t lock;

Why did you change this to "might be held" (v22 had "must")? Is the
"might" a typo?

Re: [PATCH bpf-next v2 2/3] bpf: Add a BPF helper for getting the IMA hash of an inode

2020-11-20 Thread Yonghong Song





On 11/20/20 4:50 PM, KP Singh wrote:
> From: KP Singh
>
> Provide a wrapper function to get the IMA hash of an inode. This helper
> is useful in fingerprinting files (e.g. executables on execution) and
> using these fingerprints in detections like an executable unlinking
> itself.
>
> Since the ima_inode_hash can sleep, it's only allowed for sleepable
> LSM hooks.
>
> Signed-off-by: KP Singh

Acked-by: Yonghong Song

Re: [PATCH bpf-next v2 1/3] ima: Implement ima_inode_hash

2020-11-20 Thread Yonghong Song





On 11/20/20 4:50 PM, KP Singh wrote:
> From: KP Singh
>
> This is in preparation to add a helper for BPF LSM programs to use
> IMA hashes when attached to LSM hooks. There are LSM hooks like
> inode_unlink which do not have a struct file * argument and cannot
> use the existing ima_file_hash API.
>
> An inode based API is, therefore, useful in LSM based detections like an
> executable trying to delete itself which rely on the inode_unlink LSM
> hook.
>
> Moreover, the ima_file_hash function does nothing with the struct file
> pointer apart from calling file_inode on it and converting it to an
> inode.
>
> Signed-off-by: KP Singh

Acked-by: Yonghong Song

[PATCH] regulator: Kconfig: Fix REGULATOR_QCOM_RPMH dependencies to avoid build error

2020-11-20 Thread John Stultz

The kernel test robot reported the following build error:

All errors (new ones prefixed by >>):

   xtensa-linux-ld: drivers/regulator/qcom-rpmh-regulator.o: in function `rpmh_regulator_vrm_get_voltage_sel':
   qcom-rpmh-regulator.c:(.text+0x270): undefined reference to `rpmh_write'
   xtensa-linux-ld: drivers/regulator/qcom-rpmh-regulator.o: in function `rpmh_regulator_send_request':
   qcom-rpmh-regulator.c:(.text+0x2f2): undefined reference to `rpmh_write'
   xtensa-linux-ld: drivers/regulator/qcom-rpmh-regulator.o: in function `rpmh_regulator_vrm_get_voltage_sel':
>> qcom-rpmh-regulator.c:(.text+0x274): undefined reference to `rpmh_write_async'
   xtensa-linux-ld: drivers/regulator/qcom-rpmh-regulator.o: in function `rpmh_regulator_send_request':
   qcom-rpmh-regulator.c:(.text+0x2fc): undefined reference to `rpmh_write_async'

This is due to REGULATOR_QCOM_RPMH depending on
QCOM_RPMH || COMPILE_TEST. The problem is that QCOM_RPMH can now
be a module, in which case REGULATOR_QCOM_RPMH must also be =m
to build.

However, if COMPILE_TEST is enabled, REGULATOR_QCOM_RPMH can be
set to =y while QCOM_RPMH=m, which will cause build
failures.

The easy fix here is to remove COMPILE_TEST.

Feedback would be appreciated!

Cc: Todd Kjos 
Cc: Saravana Kannan 
Cc: Andy Gross 
Cc: Bjorn Andersson 
Cc: Rajendra Nayak 
Cc: Maulik Shah 
Cc: Stephen Boyd 
Cc: Liam Girdwood 
Cc: Mark Brown 
Cc: linux-arm-...@vger.kernel.org
Reported-by: kernel test robot 
Signed-off-by: John Stultz 
---
 drivers/regulator/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 020a00d6696b..9e4fc73ed5a1 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -843,7 +843,7 @@ config REGULATOR_QCOM_RPM
 
 config REGULATOR_QCOM_RPMH
tristate "Qualcomm Technologies, Inc. RPMh regulator driver"
-   depends on QCOM_RPMH || COMPILE_TEST
+   depends on QCOM_RPMH
help
  This driver supports control of PMIC regulators via the RPMh hardware
  block found on Qualcomm Technologies Inc. SoCs.  RPMh regulator
-- 
2.17.1

[PATCH 2/2] arm64: dts: qcom: sm8150-mtp: Enable WiFi node

2020-11-20 Thread Bjorn Andersson

From: Jonathan Marek 

Enable the WiFi node and specify its supply regulators.

Signed-off-by: Jonathan Marek 
[bjorn: Extracted patch from larger HDK patch]
Signed-off-by: Bjorn Andersson 
---
 arch/arm64/boot/dts/qcom/sm8150-mtp.dts | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8150-mtp.dts b/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
index 6c6325c3af59..7a64a2ed78c3 100644
--- a/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
+++ b/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
@@ -429,3 +429,12 @@ _1 {
 _1_dwc3 {
dr_mode = "peripheral";
 };
+
+ {
+   status = "okay";
+
+   vdd-0.8-cx-mx-supply = <_wcss_pll>;
+   vdd-1.8-xo-supply = <_l7a_1p8>;
+   vdd-1.3-rfa-supply = <_wcss_adcdac_1>;
+   vdd-3.3-ch0-supply = <_l11c_3p3>;
+};
-- 
2.28.0

[PATCH] arm64: dts: qcom: sm8150-mtp: Specify remoteproc firmware

2020-11-20 Thread Bjorn Andersson

Point the various remoteprocs of SM8150 MTP to a place with the platform
specific firmware.

Signed-off-by: Bjorn Andersson 
---
 arch/arm64/boot/dts/qcom/sm8150-mtp.dts | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8150-mtp.dts b/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
index 7a64a2ed78c3..3774f8e63416 100644
--- a/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
+++ b/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
@@ -369,14 +369,22 @@ resin {
 
 _adsp {
status = "okay";
+   firmware-name = "qcom/sm8150/adsp.mdt";
 };
 
 _cdsp {
status = "okay";
+   firmware-name = "qcom/sm8150/cdsp.mdt";
+};
+
+_mpss {
+   status = "okay";
+   firmware-name = "qcom/sm8150/modem.mdt";
 };
 
 _slpi {
status = "okay";
+   firmware-name = "qcom/sm8150/slpi.mdt";
 };
 
  {
-- 
2.28.0

[PATCH 1/2] arm64: dts: qcom: sm8150: Add wifi node

2020-11-20 Thread Bjorn Andersson

From: Jonathan Marek 

Add a node for the WCN3990 WiFi module.

Signed-off-by: Jonathan Marek 
[bjorn: Extracted patch from larger "misc" patch, added qdss clock]
Signed-off-by: Bjorn Andersson 
---
 arch/arm64/boot/dts/qcom/sm8150.dtsi | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index c2f8c3097ac5..f4c3fbf36e87 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -1297,6 +1297,29 @@ cpufreq_hw: cpufreq@18323000 {
 
#freq-domain-cells = <1>;
};
+
+   wifi: wifi@1880 {
+   compatible = "qcom,wcn3990-wifi";
+   reg = <0 0x1880 0 0x80>;
+   reg-names = "membase";
+   memory-region = <_mem>;
+   clock-names = "cxo_ref_clk_pin", "qdss";
+   clocks = < RPMH_RF_CLK2>, <_qmp>;
+   interrupts = ,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+;
+   iommus = <_smmu 0x0640 0x1>;
+   status = "disabled";
+   };
};
 
timer {
-- 
2.28.0

[PATCH v1] qnx6: check the sanity of root inode type

2020-11-20 Thread Tong Zhang

root inode should be directory type

mount /dev/sdb /mnt
[   18.799875] qnx6: superblock #1 active
[   18.810693] BUG: kernel NULL pointer dereference, address: 
[   18.810885] #PF: supervisor instruction fetch in kernel mode
[   18.810999] #PF: error_code(0x0010) - not-present page
[   18.811170] PGD 3b4c067 P4D 3b4c067 PUD 4213067 PMD 0
[   18.811522] Oops: 0010 [#1] SMP KASAN NOPTI
[   18.811754] CPU: 0 PID: 159 Comm: mount Not tainted 5.10.0-rc4+ #100
[   18.811880] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.13.0-48-gd9c812d4
[   18.812167] RIP: 0010:0x0
[   18.812354] Code: Unable to access opcode bytes at RIP 0xffd6.
[   18.812491] RSP: 0018:8880040afa28 EFLAGS: 0246
[   18.812639] RAX:  RBX: 838b78a0 RCX: 81a4d485
[   18.812777] RDX: dc00 RSI: ea524900 RDI: 
[   18.812913] RBP: 8880040afcf0 R08: 0001 R09: f94a4921
[   18.813050] R10: ea524907 R11: f94a4920 R12: 888002fa87d0
[   18.813186] R13: 888003e4ff00 R14: 888002fa8668 R15: ea524900
[   18.813352] FS:  7f2e4163f6a0() GS:88801aa0() 
knlGS:
[   18.813496] CS:  0010 DS:  ES:  CR0: 80050033
[   18.813606] CR2: ffd6 CR3: 03e2 CR4: 06f0
[   18.813798] Call Trace:
[   18.814090]  do_read_cache_page+0x856/0xb50
[   18.814234]  ? generic_file_read_iter+0x220/0x220
[   18.814343]  ? _raw_spin_lock+0x75/0xd0
[   18.814438]  ? _raw_read_lock_irq+0x30/0x30
[   18.814537]  ? _raw_spin_lock+0x75/0xd0
[   18.814636]  ? d_flags_for_inode+0x56/0xf0
[   18.814735]  ? __d_instantiate+0x169/0x190
[   18.814839]  qnx6_fill_super+0x369/0x630
[   18.814941]  ? qnx6_iget+0x460/0x460
[   18.815035]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[   18.815147]  ? sb_set_blocksize+0x3d/0x70
[   18.815259]  ? __asan_load1+0x70/0x70
[   18.815354]  mount_bdev+0x1d9/0x220
[   18.815448]  ? qnx6_iget+0x460/0x460
[   18.815541]  ? qnx6_readpage+0x10/0x10
[   18.815635]  legacy_get_tree+0x6b/0xa0
[   18.815739]  vfs_get_tree+0x41/0x110
[   18.815838]  path_mount+0x3b3/0xd50
[   18.815937]  ? finish_automount+0x2b0/0x2b0
[   18.816039]  ? getname_flags+0x100/0x2a0
[   18.816138]  do_mount+0xc5/0xe0
[   18.816226]  ? path_mount+0xd50/0xd50
[   18.816319]  ? memdup_user+0x3c/0x80
[   18.816413]  __x64_sys_mount+0xb9/0xf0
[   18.816511]  do_syscall_64+0x33/0x40
[   18.816607]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   18.816747] RIP: 0033:0x7f2e415bd515
[   18.816950] Code: b8 b0 00 00 00 0f 05 48 3d 00 f0 ff ff 76 10 48 8b 15 5f 
79 06 00 f7 d8 63
[   18.817203] RSP: 002b:7b70a928 EFLAGS: 0206 ORIG_RAX: 
00a5
[   18.817360] RAX: ffda RBX: 8000 RCX: 7f2e415bd515
[   18.817478] RDX: 00a0d690 RSI: 7b70bf62 RDI: 7b70bf59
[   18.817597] RBP: 7b70aab0 R08:  R09: 
[   18.817715] R10: 8000 R11: 0206 R12: 
[   18.817831] R13: 7f2e4163f690 R14:  R15: 00a0d6f0
[   18.817998] Modules linked in:
[   18.818164] CR2: 
[   18.818738] ---[ end trace 254672b93198cc87 ]---
[   18.818866] RIP: 0010:0x0
[   18.818946] Code: Unable to access opcode bytes at RIP 0xffd6.
[   18.819063] RSP: 0018:8880040afa28 EFLAGS: 0246
[   18.819180] RAX:  RBX: 838b78a0 RCX: 81a4d485
[   18.819299] RDX: dc00 RSI: ea524900 RDI: 
[   18.819418] RBP: 8880040afcf0 R08: 0001 R09: f94a4921
[   18.819536] R10: ea524907 R11: f94a4920 R12: 888002fa87d0
[   18.819768] R13: 888003e4ff00 R14: 888002fa8668 R15: ea524900
[   18.819895] FS:  7f2e4163f6a0() GS:88801aa0() 
knlGS:
[   18.820024] CS:  0010 DS:  ES:  CR0: 80050033
[   18.820125] CR2: ffd6 CR3: 03e2 CR4: 06f0
Killed

Signed-off-by: Tong Zhang 
---
 fs/qnx6/inode.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 61191f7bdf62..26e0187ddfad 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -447,6 +447,11 @@ static int qnx6_fill_super(struct super_block *s, void *data, int silent)
ret = PTR_ERR(root);
goto out2;
}
+   if (!S_ISDIR(root->i_mode)) {
+   pr_err("wrong root inode type\n");
+   ret = -EINVAL;
+   goto out2;
+   }
 
ret = -ENOMEM;
s->s_root = d_make_root(root);
-- 
2.25.1

Re: [PATCH] mm: memcontrol: account pagetables per node

2020-11-20 Thread kernel test robot

Hi Shakeel,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on linus/master v5.10-rc4 next-20201120]
[cannot apply to mmotm/master cgroup/for-next hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Shakeel-Butt/mm-memcontrol-account-pagetables-per-node/20201121-102353
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 33c0c9bdf7a59051a654cd98b7d2b48ce0080967
config: nds32-allnoconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/e51fa7d7d401d329238b2f8bc4d506a2ab1f5c67
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Shakeel-Butt/mm-memcontrol-account-pagetables-per-node/20201121-102353
git checkout e51fa7d7d401d329238b2f8bc4d506a2ab1f5c67
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   nds32le-linux-ld: arch/nds32/mm/mm-nds32.o: in function `pgd_alloc':
>> mm-nds32.c:(.text+0x6e): undefined reference to `inc_lruvec_page_state'
>> nds32le-linux-ld: mm-nds32.c:(.text+0x72): undefined reference to `inc_lruvec_page_state'
   nds32le-linux-ld: arch/nds32/mm/mm-nds32.o: in function `pgd_free':
>> mm-nds32.c:(.text+0xd8): undefined reference to `dec_lruvec_page_state'
>> nds32le-linux-ld: mm-nds32.c:(.text+0xdc): undefined reference to `dec_lruvec_page_state'
   nds32le-linux-ld: mm-nds32.c:(.text+0xf4): undefined reference to `dec_lruvec_page_state'
   nds32le-linux-ld: mm-nds32.c:(.text+0xf8): undefined reference to `dec_lruvec_page_state'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



[rcu:rcu/next] BUILD SUCCESS 53f29e8d30d0ce9af81b320b115a4e2956f317b3

2020-11-20 Thread kernel test robot

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git  rcu/next
branch HEAD: 53f29e8d30d0ce9af81b320b115a4e2956f317b3  torture: Make kvm.sh "Test Summary" date be end of test

elapsed time: 832m

configs tested: 120
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm     defconfig
arm64   allyesconfig
arm64   defconfig
arm     allyesconfig
arm     allmodconfig
arm     vt8500_v6_v7_defconfig
sh      defconfig
mips    ip27_defconfig
m68k    q40_defconfig
ia64    alldefconfig
sh      ecovec24_defconfig
arc     vdk_hs38_defconfig
sh      sh7724_generic_defconfig
powerpc mpc8540_ads_defconfig
riscv   rv32_defconfig
arm     efm32_defconfig
mips    bigsur_defconfig
powerpc sequoia_defconfig
sh      shmin_defconfig
i386    alldefconfig
sh      rsk7203_defconfig
arm     bcm2835_defconfig
sh      sdk7786_defconfig
m68k    stmark2_defconfig
powerpc mpc836x_rdk_defconfig
mips    maltasmvp_eva_defconfig
arm     s3c6400_defconfig
mips    db1xxx_defconfig
m68k    sun3x_defconfig
arm     mv78xx0_defconfig
arm     eseries_pxa_defconfig
parisc  generic-64bit_defconfig
arc     vdk_hs38_smp_defconfig
powerpc warp_defconfig
ia64    gensparse_defconfig
mips    vocore2_defconfig
arm     cns3420vb_defconfig
arm     am200epdkit_defconfig
mips    ip32_defconfig
powerpc ppa8548_defconfig
sh      se7780_defconfig
mips    malta_kvm_guest_defconfig
arm     pxa3xx_defconfig
sh      apsh4a3a_defconfig
arm     pleb_defconfig
h8300   defconfig
mips    sb1250_swarm_defconfig
sh      se7705_defconfig
powerpc tqm8xx_defconfig
powerpc ppc6xx_defconfig
sh      sh7785lcr_defconfig
arm     sunxi_defconfig
ia64    allmodconfig
ia64    defconfig
ia64    allyesconfig
m68k    allmodconfig
m68k    defconfig
m68k    allyesconfig
nds32   defconfig
nios2   allyesconfig
csky    defconfig
alpha   defconfig
alpha   allyesconfig
nios2   defconfig
arc     allyesconfig
nds32   allnoconfig
c6x     allyesconfig
xtensa  allyesconfig
h8300   allyesconfig
arc     defconfig
sh      allmodconfig
parisc  defconfig
s390    allyesconfig
parisc  allyesconfig
s390    defconfig
i386    allyesconfig
sparc   allyesconfig
sparc   defconfig
i386    defconfig
mips    allyesconfig
mips    allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
x86_64  randconfig-a006-20201120
x86_64  randconfig-a003-20201120
x86_64  randconfig-a004-20201120
x86_64  randconfig-a005-20201120
x86_64  randconfig-a001-20201120
x86_64  randconfig-a002-20201120
i386    randconfig-a004-20201120
i386    randconfig-a003-20201120
i386    randconfig-a002-20201120
i386    randconfig-a005-20201120
i386    randconfig-a001-20201120
i386    randconfig-a006-20201120
i386    randconfig-a012-20201120
i386    randconfig-a013-20201120
i386    randconfig-a011-20201120
i386    randconfig-a016-202

INFO: task hung in sync_inodes_sb (4)

2020-11-20 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:    03430750 Add linux-next specific files for 20201116
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17027fdc50
kernel config:  https://syzkaller.appspot.com/x/.config?x=a1c4c3f27041fdb8
dashboard link: https://syzkaller.appspot.com/bug?extid=7d50f1e54a12ba3aeae2
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=124a884150
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15a4fce250

The issue was bisected to:

commit c68df2e7be0c1238ea3c281fd744a204ef3b15a0
Author: Emmanuel Grumbach 
Date:   Thu Sep 15 13:30:02 2016 +

mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1445e98150
final oops: https://syzkaller.appspot.com/x/report.txt?x=1645e98150
console output: https://syzkaller.appspot.com/x/log.txt?x=1245e98150

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7d50f1e54a12ba3ae...@syzkaller.appspotmail.com
Fixes: c68df2e7be0c ("mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")

INFO: task syz-executor017:8513 blocked for more than 143 seconds.
  Not tainted 5.10.0-rc3-next-20201116-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor017 state:D stack:27448 pid: 8513 ppid:  8507 flags:0x4000
Call Trace:
 context_switch kernel/sched/core.c:4269 [inline]
 __schedule+0x890/0x2030 kernel/sched/core.c:5019
 schedule+0xcf/0x270 kernel/sched/core.c:5098
 wb_wait_for_completion+0x17b/0x230 fs/fs-writeback.c:209
 sync_inodes_sb+0x1a6/0x9d0 fs/fs-writeback.c:2559
 __sync_filesystem fs/sync.c:34 [inline]
 sync_filesystem fs/sync.c:67 [inline]
 sync_filesystem+0x15c/0x260 fs/sync.c:48
 generic_shutdown_super+0x70/0x370 fs/super.c:448
 kill_block_super+0x97/0xf0 fs/super.c:1446
 deactivate_locked_super+0x94/0x160 fs/super.c:335
 deactivate_super+0xad/0xd0 fs/super.c:366
 cleanup_mnt+0x3a3/0x530 fs/namespace.c:1123
 task_work_run+0xdd/0x190 kernel/task_work.c:140
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
 exit_to_user_mode_prepare+0x1f0/0x200 kernel/entry/common.c:199
 syscall_exit_to_user_mode+0x38/0x260 kernel/entry/common.c:274
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x44e0e7
Code: Unable to access opcode bytes at RIP 0x44e0bd.
RSP: 002b:7fff42061288 EFLAGS: 0206 ORIG_RAX: 00a6
RAX:  RBX: 000cee4c RCX: 0044e0e7
RDX: 00400be0 RSI: 0002 RDI: 7fff42061330
RBP: 2142 R08:  R09: 0009
R10: 0005 R11: 0206 R12: 7fff420623e0
R13: 01f67880 R14:  R15: 

Showing all locks held in the system:
2 locks held by kworker/u4:5/225:
 #0: 8881413a4138 ((wq_completion)writeback){+.+.}-{0:0}, at: 
arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: 8881413a4138 ((wq_completion)writeback){+.+.}-{0:0}, at: atomic64_set 
include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: 8881413a4138 ((wq_completion)writeback){+.+.}-{0:0}, at: 
atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: 8881413a4138 ((wq_completion)writeback){+.+.}-{0:0}, at: set_work_data 
kernel/workqueue.c:616 [inline]
 #0: 8881413a4138 ((wq_completion)writeback){+.+.}-{0:0}, at: 
set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: 8881413a4138 ((wq_completion)writeback){+.+.}-{0:0}, at: 
process_one_work+0x821/0x15a0 kernel/workqueue.c:2243
 #1: c9000191fda8 ((work_completion)(&(>dwork)->work)){+.+.}-{0:0}, at: 
process_one_work+0x854/0x15a0 kernel/workqueue.c:2247
1 lock held by khungtaskd/1655:
 #0: 8b339ce0 (rcu_read_lock){}-{1:2}, at: 
debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6252
1 lock held by in:imklog/8188:
 #0: 888017c8f4f0 (>f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 
fs/file.c:932
2 locks held by syz-executor017/8513:
 #0: 88801a8500e0 (>s_umount_key#49){+.+.}-{3:3}, at: 
deactivate_super+0xa5/0xd0 fs/super.c:365
 #1: 888143f5e708 (>wb_switch_rwsem){+.+.}-{3:3}, at: 
bdi_down_write_wb_switch_rwsem fs/fs-writeback.c:344 [inline]
 #1: 888143f5e708 (>wb_switch_rwsem){+.+.}-{3:3}, at: 
sync_inodes_sb+0x18c/0x9d0 fs/fs-writeback.c:2557

=

NMI backtrace for cpu 0
CPU: 0 PID: 1655 Comm: khungtaskd Not tainted 
5.10.0-rc3-next-20201116-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105

[PATCH v1] qnx6: avoid double release bh

2020-11-20 Thread Tong Zhang

set bh to NULL to avoid double release

[   38.848384] qnx6: superblock #1 active
[   38.855489] attempt to access beyond end of device
[   38.855489] sdb: rw=0, want=6359988796, limit=20
[   38.855852] Buffer I/O error on dev sdb, logical block 3179994397, async page read
[   38.856327] attempt to access beyond end of device
[   38.856327] sdb: rw=0, want=1390132904, limit=20
[   38.856513] Buffer I/O error on dev sdb, logical block 695066451, async page read
[   38.856800] attempt to access beyond end of device
[   38.856800] sdb: rw=0, want=1646095356, limit=20
[   38.857059] Buffer I/O error on dev sdb, logical block 823047677, async page read
[   38.857339] attempt to access beyond end of device
[   38.857339] sdb: rw=0, want=2511434484, limit=20
[   38.857504] Buffer I/O error on dev sdb, logical block 1255717241, async page read
[   38.857911] qnx6: major problem: unable to read inode from dev sdb
[   38.858318] qnx6: get inode failed
[   38.866847] [ cut here ]
[   38.866992] VFS: brelse: Trying to free free buffer
[   38.867406] WARNING: CPU: 0 PID: 159 at fs/buffer.c:1177 __brelse+0x31/0x50
[   38.867576] Modules linked in:
[   38.867933] CPU: 0 PID: 159 Comm: mount Not tainted 5.10.0-rc4+ #97
[   38.868068] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.13.0-48-gd9c812d4
[   38.868408] RIP: 0010:__brelse+0x31/0x50
[   38.868562] Code: 00 00 00 53 48 89 fb 48 89 ef e8 ea 89 f8 ff 48 89 ef e8 
c2 a6 f8 ff 8b 40
[   38.868877] RSP: 0018:8880042a7b90 EFLAGS: 0082
[   38.869119] RAX:  RBX: 888002eeaa80 RCX: 
[   38.869286] RDX: dc00 RSI: 0008 RDI: ed1000854f64
[   38.869419] RBP: 888002eeaae0 R08:  R09: ed1000854f03
[   38.869553] R10: 8880042a7817 R11: ed1000854f02 R12: a8b71460
[   38.869687] R13:  R14: a8b70a10 R15: 
[   38.869854] FS:  7f2e41c2a6a0() GS:88801620() 
knlGS:
[   38.869996] CS:  0010 DS:  ES:  CR0: 80050033
[   38.870103] CR2: 004ad288 CR3: 027ca000 CR4: 06f0
[   38.870281] Call Trace:
[   38.870572]  invalidate_bh_lru+0x2d/0x50
[   38.870702]  on_each_cpu_cond_mask+0x64/0x80
[   38.870808]  kill_bdev.isra.0+0x36/0x50
[   38.870904]  __blkdev_put+0x10d/0x370
[   38.871030]  ? freeze_bdev+0xf0/0xf0
[   38.871123]  ? _raw_read_lock_irq+0x30/0x30
[   38.871224]  ? mutex_unlock+0x18/0x40
[   38.871320]  deactivate_locked_super+0x50/0x90
[   38.871420]  mount_bdev+0x20f/0x220
[   38.871513]  ? qnx6_iget+0x460/0x460
[   38.871603]  ? qnx6_readpage+0x10/0x10
[   38.871694]  legacy_get_tree+0x6b/0xa0
[   38.871791]  vfs_get_tree+0x41/0x110
[   38.871887]  path_mount+0x3b3/0xd50
[   38.871984]  ? finish_automount+0x2b0/0x2b0
[   38.872085]  ? getname_flags+0x100/0x2a0
[   38.872182]  do_mount+0xc5/0xe0
[   38.872272]  ? path_mount+0xd50/0xd50
[   38.872366]  ? memdup_user+0x3c/0x80
[   38.872458]  __x64_sys_mount+0xb9/0xf0
[   38.872555]  do_syscall_64+0x33/0x40
[   38.872649]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   38.872823] RIP: 0033:0x7f2e41ba8515
[   38.873013] Code: b8 b0 00 00 00 0f 05 48 3d 00 f0 ff ff 76 10 48 8b 15 5f 
79 06 00 f7 d8 63
[   38.873257] RSP: 002b:7ffd2d0eaaf8 EFLAGS: 0202 ORIG_RAX: 
00a5
[   38.873420] RAX: ffda RBX: 8001 RCX: 7f2e41ba8515
[   38.873537] RDX: 7ffd2d0ecf62 RSI: 7ffd2d0ecf54 RDI: 7ffd2d0ecf4b
[   38.873652] RBP: 7ffd2d0eac80 R08:  R09: 7f2e41bf1480
[   38.873766] R10: 8001 R11: 0202 R12: 
[   38.873882] R13: 7f2e41c2a690 R14:  R15: 
[   38.874049] ---[ end trace cc983a0044562d15 ]---

Signed-off-by: Tong Zhang 
---
 fs/qnx6/inode.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 61191f7bdf62..9fbe2b29bd9b 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -404,12 +404,14 @@ static int qnx6_fill_super(struct super_block *s, void 
*data, int silent)
sbi->sb_buf = bh1;
sbi->sb = (struct qnx6_super_block *)bh1->b_data;
brelse(bh2);
+   bh2 = NULL;
pr_info("superblock #1 active\n");
} else {
/* superblock #2 active */
sbi->sb_buf = bh2;
sbi->sb = (struct qnx6_super_block *)bh2->b_data;
brelse(bh1);
+   bh1 = NULL;
pr_info("superblock #2 active\n");
}
 mmi_success:
-- 
2.25.1

[tip:x86/cleanups] BUILD SUCCESS 61b39ad9a7d26fe14a2f5f23e5e940e7f9664d41

2020-11-20 Thread kernel test robot

  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
x86_64   randconfig-a006-20201120
x86_64   randconfig-a003-20201120
x86_64   randconfig-a004-20201120
x86_64   randconfig-a005-20201120
x86_64   randconfig-a001-20201120
x86_64   randconfig-a002-20201120
i386 randconfig-a004-20201120
i386 randconfig-a003-20201120
i386 randconfig-a002-20201120
i386 randconfig-a005-20201120
i386 randconfig-a001-20201120
i386 randconfig-a006-20201120
i386 randconfig-a012-20201120
i386 randconfig-a013-20201120
i386 randconfig-a011-20201120
i386 randconfig-a016-20201120
i386 randconfig-a014-20201120
i386 randconfig-a015-20201120
riscv    nommu_k210_defconfig
riscv    allyesconfig
riscv    nommu_virt_defconfig
riscv    allnoconfig
riscv    defconfig
riscv    allmodconfig
x86_64   rhel
x86_64   allyesconfig
x86_64rhel-7.6-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  kexec

clang tested configs:
x86_64   randconfig-a015-20201120
x86_64   randconfig-a011-20201120
x86_64   randconfig-a014-20201120
x86_64   randconfig-a016-20201120
x86_64   randconfig-a012-20201120
x86_64   randconfig-a013-20201120

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

[PATCH v6 1/5] dma-buf: system_heap: Rework system heap to use sgtables instead of pagelists

2020-11-20 Thread John Stultz

In preparation for some patches to optimize the system
heap code, rework the dmabuf exporter to utilize sgtables rather
than pagelists for tracking the associated pages.

This will allow for large order page allocations, as well as
more efficient page pooling.

In doing so, the system heap stops using the heap-helpers logic
which sadly is not quite as generic as I was hoping it to be, so
this patch adds heap specific implementations of the dma_buf_ops
function handlers.

Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Chris Goldsworthy 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Reviewed-by: Brian Starkey 
Signed-off-by: John Stultz 
---
v2:
* Fix locking issue and an unused return value Reported-by:
 kernel test robot 
 Julia Lawall 
* Make system_heap_buf_ops static Reported-by:
 kernel test robot 
v3:
* Use the new sgtable mapping functions, as Suggested-by:
 Daniel Mentz 
v4:
* Make sys_heap static (indirectly) Reported-by:
 kernel test robot 
* Spelling fix suggested by BrianS
v6:
* Fixups against drm-misc-next, from Sumit
---
 drivers/dma-buf/heaps/system_heap.c | 346 
 1 file changed, 300 insertions(+), 46 deletions(-)

diff --git a/drivers/dma-buf/heaps/system_heap.c 
b/drivers/dma-buf/heaps/system_heap.c
index 0bf688e3c023..e5f9f964b910 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -3,7 +3,11 @@
  * DMABUF System heap exporter
  *
  * Copyright (C) 2011 Google, Inc.
- * Copyright (C) 2019 Linaro Ltd.
+ * Copyright (C) 2019, 2020 Linaro Ltd.
+ *
+ * Portions based off of Andrew Davis' SRAM heap:
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com/
+ * Andrew F. Davis 
  */
 
 #include 
@@ -15,72 +19,323 @@
 #include 
 #include 
 #include 
-#include 
-#include 
+#include 
+
+static struct dma_heap *sys_heap;
 
-#include "heap-helpers.h"
+struct system_heap_buffer {
+   struct dma_heap *heap;
+   struct list_head attachments;
+   struct mutex lock;
+   unsigned long len;
+   struct sg_table sg_table;
+   int vmap_cnt;
+   struct dma_buf_map *map;
+};
 
-struct dma_heap *sys_heap;
+struct dma_heap_attachment {
+   struct device *dev;
+   struct sg_table *table;
+   struct list_head list;
+};
 
-static void system_heap_free(struct heap_helper_buffer *buffer)
+static struct sg_table *dup_sg_table(struct sg_table *table)
 {
-   pgoff_t pg;
+   struct sg_table *new_table;
+   int ret, i;
+   struct scatterlist *sg, *new_sg;
+
+   new_table = kzalloc(sizeof(*new_table), GFP_KERNEL);
+   if (!new_table)
+   return ERR_PTR(-ENOMEM);
+
+   ret = sg_alloc_table(new_table, table->orig_nents, GFP_KERNEL);
+   if (ret) {
+   kfree(new_table);
+   return ERR_PTR(-ENOMEM);
+   }
+
+   new_sg = new_table->sgl;
+   for_each_sgtable_sg(table, sg, i) {
+   sg_set_page(new_sg, sg_page(sg), sg->length, sg->offset);
+   new_sg = sg_next(new_sg);
+   }
+
+   return new_table;
+}
+
+static int system_heap_attach(struct dma_buf *dmabuf,
+ struct dma_buf_attachment *attachment)
+{
+   struct system_heap_buffer *buffer = dmabuf->priv;
+   struct dma_heap_attachment *a;
+   struct sg_table *table;
+
+   a = kzalloc(sizeof(*a), GFP_KERNEL);
+   if (!a)
+   return -ENOMEM;
+
+   table = dup_sg_table(&buffer->sg_table);
+   if (IS_ERR(table)) {
+   kfree(a);
+   return -ENOMEM;
+   }
+
+   a->table = table;
+   a->dev = attachment->dev;
+   INIT_LIST_HEAD(&a->list);
+
+   attachment->priv = a;
+
+   mutex_lock(&buffer->lock);
+   list_add(&a->list, &buffer->attachments);
+   mutex_unlock(&buffer->lock);
+
+   return 0;
+}
+
+static void system_heap_detach(struct dma_buf *dmabuf,
+  struct dma_buf_attachment *attachment)
+{
+   struct system_heap_buffer *buffer = dmabuf->priv;
+   struct dma_heap_attachment *a = attachment->priv;
+
+   mutex_lock(&buffer->lock);
+   list_del(&a->list);
+   mutex_unlock(&buffer->lock);
+
+   sg_free_table(a->table);
+   kfree(a->table);
+   kfree(a);
+}
+
+static struct sg_table *system_heap_map_dma_buf(struct dma_buf_attachment 
*attachment,
+   enum dma_data_direction 
direction)
+{
+   struct dma_heap_attachment *a = attachment->priv;
+   struct sg_table *table = a->table;
+   int ret;
+
+   ret = dma_map_sgtable(attachment->dev, table, direction, 0);
+   if (ret)
+   return ERR_PTR(ret);
+
+   return table;
+}
+
+static void system_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
+

[PATCH v6 4/5] dma-buf: heaps: Skip sync if not mapped

2020-11-20 Thread John Stultz

This patch is basically a port of Ørjan Eide's similar patch for ION
 https://lore.kernel.org/lkml/20200414134629.54567-1-orjan.e...@arm.com/

Only sync the sg-list of dma-buf heap attachment when the attachment
is actually mapped on the device.

dma-bufs may be synced at any time. It can be reached from user space
via DMA_BUF_IOCTL_SYNC, so there are no guarantees from callers on when
syncs may be attempted, and dma_buf_end_cpu_access() and
dma_buf_begin_cpu_access() may not be paired.

Since the sg_list's dma_address isn't set up until the buffer is used
on the device, and dma_map_sg() is called on it, the dma_address will be
NULL if sync is attempted on the dma-buf before it's mapped on a device.

Before v5.0 (commit 55897af63091 ("dma-direct: merge swiotlb_dma_ops
into the dma_direct code")) this was a problem as the dma-api (at least
the swiotlb_dma_ops on arm64) would use the potentially invalid
dma_address. How that failed depended on how the device handled physical
address 0. If 0 was a valid address to physical ram, that page would get
flushed a lot, while the actual pages in the buffer would not get synced
correctly. While if 0 is an invalid physical address it may cause a
fault and trigger a crash.

In v5.0 this was incidentally fixed by commit 55897af63091 ("dma-direct:
merge swiotlb_dma_ops into the dma_direct code"), as this moved the
dma-api to use the page pointer in the sg_list, and (for Ion buffers at
least) this will always be valid if the sg_list exists at all.

But this issue was re-introduced in v5.3 by
commit 449fa54d6815 ("dma-direct: correct the physical addr in
dma_direct_sync_sg_for_cpu/device"), which moves the dma-api back to the
old behaviour and picks the dma_address that may be invalid.

The dma-buf core doesn't ensure that the buffer is mapped on the device, and
thus has a valid sg_list, before calling the exporter's
begin_cpu_access.
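
The guard described above can be sketched in user space. This is a simplified model under stated assumptions, not the kernel implementation: `struct attachment_model`, `model_map()`, `model_unmap()` and `model_sync_all()` are hypothetical names, and the per-attachment sync is reduced to a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a heap attachment; not the kernel structure. */
struct attachment_model {
	bool mapped;		/* set on map, cleared on unmap */
	unsigned int syncs;	/* how many times a sync touched this entry */
	struct attachment_model *next;
};

static void model_map(struct attachment_model *a)
{
	a->mapped = true;
}

static void model_unmap(struct attachment_model *a)
{
	a->mapped = false;
}

/*
 * Models the begin/end CPU access handlers: walk every attachment but
 * only sync those with a live mapping, since an unmapped attachment has
 * no valid DMA address yet. Returns the number of attachments synced.
 */
static unsigned int model_sync_all(struct attachment_model *head)
{
	unsigned int synced = 0;
	struct attachment_model *a;

	for (a = head; a; a = a->next) {
		if (!a->mapped)
			continue;
		a->syncs++;
		synced++;
	}
	return synced;
}
```

A sync requested before any attachment is mapped, or after the last one is unmapped, touches nothing in this model.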

Logic and commit message originally by: Ørjan Eide 

Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Chris Goldsworthy 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Reviewed-by: Brian Starkey 
Signed-off-by: John Stultz 
---
 drivers/dma-buf/heaps/cma_heap.c| 10 ++
 drivers/dma-buf/heaps/system_heap.c | 10 ++
 2 files changed, 20 insertions(+)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 52459f3e60e2..2b468767a2af 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -43,6 +43,7 @@ struct dma_heap_attachment {
struct device *dev;
struct sg_table table;
struct list_head list;
+   bool mapped;
 };
 
 static int cma_heap_attach(struct dma_buf *dmabuf,
@@ -67,6 +68,7 @@ static int cma_heap_attach(struct dma_buf *dmabuf,
 
a->dev = attachment->dev;
INIT_LIST_HEAD(&a->list);
+   a->mapped = false;
 
attachment->priv = a;
 
@@ -101,6 +103,7 @@ static struct sg_table *cma_heap_map_dma_buf(struct 
dma_buf_attachment *attachme
ret = dma_map_sgtable(attachment->dev, table, direction, 0);
if (ret)
return ERR_PTR(-ENOMEM);
+   a->mapped = true;
return table;
 }
 
@@ -108,6 +111,9 @@ static void cma_heap_unmap_dma_buf(struct 
dma_buf_attachment *attachment,
   struct sg_table *table,
   enum dma_data_direction direction)
 {
+   struct dma_heap_attachment *a = attachment->priv;
+
+   a->mapped = false;
dma_unmap_sgtable(attachment->dev, table, direction, 0);
 }
 
@@ -122,6 +128,8 @@ static int cma_heap_dma_buf_begin_cpu_access(struct dma_buf 
*dmabuf,
 
mutex_lock(&buffer->lock);
list_for_each_entry(a, &buffer->attachments, list) {
+   if (!a->mapped)
+   continue;
dma_sync_sgtable_for_cpu(a->dev, &a->table, direction);
}
mutex_unlock(&buffer->lock);
@@ -140,6 +148,8 @@ static int cma_heap_dma_buf_end_cpu_access(struct dma_buf 
*dmabuf,
 
mutex_lock(&buffer->lock);
list_for_each_entry(a, &buffer->attachments, list) {
+   if (!a->mapped)
+   continue;
dma_sync_sgtable_for_device(a->dev, &a->table, direction);
}
mutex_unlock(&buffer->lock);
diff --git a/drivers/dma-buf/heaps/system_heap.c 
b/drivers/dma-buf/heaps/system_heap.c
index e5f9f964b910..b1a7b355132f 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -37,6 +37,7 @@ struct dma_heap_attachment {
struct device *dev;
struct sg_table *table;
struct list_head list;
+   bool mapped;
 };
 
 static struct sg_table *dup_sg_table(struct sg_table *table)
@@ -84,6 +85,7 @@ static int system_heap_attach(struct dma_buf *dmabuf,
a->table = table;

[PATCH v6 5/5] dma-buf: system_heap: Allocate higher order pages if available

2020-11-20 Thread John Stultz

While the system heap can return non-contiguous pages,
try to allocate larger order pages if possible.

This will allow slight performance gains and make implementing
page pooling easier.
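
The allocation strategy amounts to a greedy split of the request into chunks, largest order first. A rough user-space sketch follows; it assumes a 4K base page and that every allocation succeeds (the real code can fall back to a smaller order when a large allocation fails). `MODEL_PAGE_SIZE`, `model_orders`, `pick_order()` and `split_into_orders()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed 4K base page */

static const unsigned int model_orders[] = { 8, 4, 0 };
#define MODEL_NUM_ORDERS (sizeof(model_orders) / sizeof(model_orders[0]))

/*
 * Pick the largest order whose chunk fits in the remaining size and does
 * not exceed max_order. Returns -1 if none fits (size below one page).
 * Unlike a real allocator, this model never fails an allocation.
 */
static int pick_order(unsigned long size, unsigned int max_order)
{
	size_t i;

	for (i = 0; i < MODEL_NUM_ORDERS; i++) {
		if (size < (MODEL_PAGE_SIZE << model_orders[i]))
			continue;
		if (max_order < model_orders[i])
			continue;
		return (int)model_orders[i];
	}
	return -1;
}

/*
 * Split len into chunks, largest first. counts[k] receives the number of
 * chunks of order model_orders[k]. Returns the total number of chunks,
 * or 0 if len is not a multiple of the page size.
 */
static unsigned int split_into_orders(unsigned long len,
				      unsigned int counts[MODEL_NUM_ORDERS])
{
	unsigned int max_order = model_orders[0];
	unsigned int total = 0;
	size_t k;

	for (k = 0; k < MODEL_NUM_ORDERS; k++)
		counts[k] = 0;
	if (len % MODEL_PAGE_SIZE)
		return 0;

	while (len > 0) {
		int order = pick_order(len, max_order);

		if (order < 0)
			break;
		for (k = 0; k < MODEL_NUM_ORDERS; k++)
			if (model_orders[k] == (unsigned int)order)
				counts[k]++;
		len -= MODEL_PAGE_SIZE << order;
		max_order = (unsigned int)order;	/* never go back up */
		total++;
	}
	return total;
}
```

Under these assumptions a request of 1MB + 2 x 64K + 3 x 4K is covered by six chunks instead of 291 single pages, which is where the shorter scatterlist and reduced IOMMU work would come from.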

Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Chris Goldsworthy 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Reviewed-by: Brian Starkey 
Signed-off-by: John Stultz 
---
v3:
* Use page_size() rather than open-coding it
v5:
* Add comment explaining order size rationale
---
 drivers/dma-buf/heaps/system_heap.c | 89 +++--
 1 file changed, 71 insertions(+), 18 deletions(-)

diff --git a/drivers/dma-buf/heaps/system_heap.c 
b/drivers/dma-buf/heaps/system_heap.c
index b1a7b355132f..de275b7ff1ed 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -40,6 +40,20 @@ struct dma_heap_attachment {
bool mapped;
 };
 
+#define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
+   | __GFP_NORETRY) & ~__GFP_RECLAIM) \
+   | __GFP_COMP)
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
+static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
+/*
+ * The selection of the orders used for allocation (1MB, 64K, 4K) is designed
+ * to match with the sizes often found in IOMMUs. Using order 4 pages instead
+ * of order 0 pages can significantly improve the performance of many IOMMUs
+ * by reducing TLB pressure and time spent updating page tables.
+ */
+static const unsigned int orders[] = {8, 4, 0};
+#define NUM_ORDERS ARRAY_SIZE(orders)
+
 static struct sg_table *dup_sg_table(struct sg_table *table)
 {
struct sg_table *new_table;
@@ -272,8 +286,11 @@ static void system_heap_dma_buf_release(struct dma_buf 
*dmabuf)
int i;
 
table = &buffer->sg_table;
-   for_each_sgtable_sg(table, sg, i)
-   __free_page(sg_page(sg));
+   for_each_sg(table->sgl, sg, table->nents, i) {
+   struct page *page = sg_page(sg);
+
+   __free_pages(page, compound_order(page));
+   }
sg_free_table(table);
kfree(buffer);
 }
@@ -291,6 +308,26 @@ static const struct dma_buf_ops system_heap_buf_ops = {
.release = system_heap_dma_buf_release,
 };
 
+static struct page *alloc_largest_available(unsigned long size,
+   unsigned int max_order)
+{
+   struct page *page;
+   int i;
+
+   for (i = 0; i < NUM_ORDERS; i++) {
+   if (size <  (PAGE_SIZE << orders[i]))
+   continue;
+   if (max_order < orders[i])
+   continue;
+
+   page = alloc_pages(order_flags[i], orders[i]);
+   if (!page)
+   continue;
+   return page;
+   }
+   return NULL;
+}
+
 static int system_heap_allocate(struct dma_heap *heap,
unsigned long len,
unsigned long fd_flags,
@@ -298,11 +335,13 @@ static int system_heap_allocate(struct dma_heap *heap,
 {
struct system_heap_buffer *buffer;
DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
+   unsigned long size_remaining = len;
+   unsigned int max_order = orders[0];
struct dma_buf *dmabuf;
struct sg_table *table;
struct scatterlist *sg;
-   pgoff_t pagecount;
-   pgoff_t pg;
+   struct list_head pages;
+   struct page *page, *tmp_page;
int i, ret = -ENOMEM;
 
buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
@@ -314,25 +353,35 @@ static int system_heap_allocate(struct dma_heap *heap,
buffer->heap = heap;
buffer->len = len;
 
-   table = &buffer->sg_table;
-   pagecount = len / PAGE_SIZE;
-   if (sg_alloc_table(table, pagecount, GFP_KERNEL))
-   goto free_buffer;
-
-   sg = table->sgl;
-   for (pg = 0; pg < pagecount; pg++) {
-   struct page *page;
+   INIT_LIST_HEAD(&pages);
+   i = 0;
+   while (size_remaining > 0) {
/*
 * Avoid trying to allocate memory if the process
 * has been killed by SIGKILL
 */
if (fatal_signal_pending(current))
-   goto free_pages;
-   page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+   goto free_buffer;
+
+   page = alloc_largest_available(size_remaining, max_order);
if (!page)
-   goto free_pages;
+   goto free_buffer;
+
+   list_add_tail(&page->lru, &pages);
+   size_remaining -= page_size(page);
+   max_order = compound_order(page);
+   i++;
+   }
+
+   table =

[PATCH v6 3/5] dma-buf: heaps: Remove heap-helpers code

2020-11-20 Thread John Stultz

The heap-helpers code was not as generic as initially hoped
and it is now not being used, so remove it from the tree.

Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Chris Goldsworthy 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Reviewed-by: Brian Starkey 
Signed-off-by: John Stultz 
---
v6: Rebased onto drm-misc-next
---
 drivers/dma-buf/heaps/Makefile   |   1 -
 drivers/dma-buf/heaps/heap-helpers.c | 274 ---
 drivers/dma-buf/heaps/heap-helpers.h |  53 --
 3 files changed, 328 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 6e54cdec3da0..974467791032 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,4 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
-obj-y  += heap-helpers.o
 obj-$(CONFIG_DMABUF_HEAPS_SYSTEM)  += system_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_CMA) += cma_heap.o
diff --git a/drivers/dma-buf/heaps/heap-helpers.c 
b/drivers/dma-buf/heaps/heap-helpers.c
deleted file mode 100644
index fcf4ce3e2cbb..
--- a/drivers/dma-buf/heaps/heap-helpers.c
+++ /dev/null
@@ -1,274 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "heap-helpers.h"
-
-void init_heap_helper_buffer(struct heap_helper_buffer *buffer,
-void (*free)(struct heap_helper_buffer *))
-{
-   buffer->priv_virt = NULL;
-   mutex_init(&buffer->lock);
-   buffer->vmap_cnt = 0;
-   buffer->vaddr = NULL;
-   buffer->pagecount = 0;
-   buffer->pages = NULL;
-   INIT_LIST_HEAD(&buffer->attachments);
-   buffer->free = free;
-}
-
-struct dma_buf *heap_helper_export_dmabuf(struct heap_helper_buffer *buffer,
- int fd_flags)
-{
-   DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
-
-   exp_info.ops = &heap_helper_ops;
-   exp_info.size = buffer->size;
-   exp_info.flags = fd_flags;
-   exp_info.priv = buffer;
-
-   return dma_buf_export(&exp_info);
-}
-
-static void *dma_heap_map_kernel(struct heap_helper_buffer *buffer)
-{
-   void *vaddr;
-
-   vaddr = vmap(buffer->pages, buffer->pagecount, VM_MAP, PAGE_KERNEL);
-   if (!vaddr)
-   return ERR_PTR(-ENOMEM);
-
-   return vaddr;
-}
-
-static void dma_heap_buffer_destroy(struct heap_helper_buffer *buffer)
-{
-   if (buffer->vmap_cnt > 0) {
-   WARN(1, "%s: buffer still mapped in the kernel\n", __func__);
-   vunmap(buffer->vaddr);
-   }
-
-   buffer->free(buffer);
-}
-
-static void *dma_heap_buffer_vmap_get(struct heap_helper_buffer *buffer)
-{
-   void *vaddr;
-
-   if (buffer->vmap_cnt) {
-   buffer->vmap_cnt++;
-   return buffer->vaddr;
-   }
-   vaddr = dma_heap_map_kernel(buffer);
-   if (IS_ERR(vaddr))
-   return vaddr;
-   buffer->vaddr = vaddr;
-   buffer->vmap_cnt++;
-   return vaddr;
-}
-
-static void dma_heap_buffer_vmap_put(struct heap_helper_buffer *buffer)
-{
-   if (!--buffer->vmap_cnt) {
-   vunmap(buffer->vaddr);
-   buffer->vaddr = NULL;
-   }
-}
-
-struct dma_heaps_attachment {
-   struct device *dev;
-   struct sg_table table;
-   struct list_head list;
-};
-
-static int dma_heap_attach(struct dma_buf *dmabuf,
-  struct dma_buf_attachment *attachment)
-{
-   struct dma_heaps_attachment *a;
-   struct heap_helper_buffer *buffer = dmabuf->priv;
-   int ret;
-
-   a = kzalloc(sizeof(*a), GFP_KERNEL);
-   if (!a)
-   return -ENOMEM;
-
-   ret = sg_alloc_table_from_pages(&a->table, buffer->pages,
-   buffer->pagecount, 0,
-   buffer->pagecount << PAGE_SHIFT,
-   GFP_KERNEL);
-   if (ret) {
-   kfree(a);
-   return ret;
-   }
-
-   a->dev = attachment->dev;
-   INIT_LIST_HEAD(&a->list);
-
-   attachment->priv = a;
-
-   mutex_lock(&buffer->lock);
-   list_add(&a->list, &buffer->attachments);
-   mutex_unlock(&buffer->lock);
-
-   return 0;
-}
-
-static void dma_heap_detach(struct dma_buf *dmabuf,
-   struct dma_buf_attachment *attachment)
-{
-   struct dma_heaps_attachment *a = attachment->priv;
-   struct heap_helper_buffer *buffer = dmabuf->priv;
-
-   mutex_lock(&buffer->lock);
-   list_del(&a->list);
-   mutex_unlock(&buffer->lock);
-
-   sg_free_table(&a->table);
-   kfree(a);
-}
-
-static
-struct

[PATCH v6 2/5] dma-buf: heaps: Move heap-helper logic into the cma_heap implementation

2020-11-20 Thread John Stultz

Since the heap-helpers logic ended up not being as generic as
hoped, move the heap-helpers dma_buf_ops implementations into
the cma_heap directly.

This will allow us to remove the heap_helpers code in a following
patch.

Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Chris Goldsworthy 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Reviewed-by: Brian Starkey 
Signed-off-by: John Stultz 
---
v2:
* Fix unused return value and locking issue Reported-by:
kernel test robot 
Julia Lawall 
* Make cma_heap_buf_ops static suggested by
kernel test robot 
* Fix uninitialized return in cma Reported-by:
kernel test robot 
* Minor cleanups
v3:
* Use the new sgtable mapping functions, as Suggested-by:
 Daniel Mentz 
v4:
* Spelling fix suggested by BrianS
v6:
* Fixups against drm-misc-next
---
 drivers/dma-buf/heaps/cma_heap.c | 315 ++-
 1 file changed, 266 insertions(+), 49 deletions(-)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index e55384dc115b..52459f3e60e2 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -2,76 +2,291 @@
 /*
  * DMABUF CMA heap exporter
  *
- * Copyright (C) 2012, 2019 Linaro Ltd.
+ * Copyright (C) 2012, 2019, 2020 Linaro Ltd.
  * Author:  for ST-Ericsson.
+ *
+ * Also utilizing parts of Andrew Davis' SRAM heap:
+ * Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com/
+ * Andrew F. Davis 
  */
-
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
-#include 
 #include 
+#include 
+#include 
 #include 
-#include 
 #include 
-#include 
+#include 
 
-#include "heap-helpers.h"
 
 struct cma_heap {
struct dma_heap *heap;
struct cma *cma;
 };
 
-static void cma_heap_free(struct heap_helper_buffer *buffer)
+struct cma_heap_buffer {
+   struct cma_heap *heap;
+   struct list_head attachments;
+   struct mutex lock;
+   unsigned long len;
+   struct page *cma_pages;
+   struct page **pages;
+   pgoff_t pagecount;
+   int vmap_cnt;
+   struct dma_buf_map *map;
+};
+
+struct dma_heap_attachment {
+   struct device *dev;
+   struct sg_table table;
+   struct list_head list;
+};
+
+static int cma_heap_attach(struct dma_buf *dmabuf,
+  struct dma_buf_attachment *attachment)
 {
-   struct cma_heap *cma_heap = dma_heap_get_drvdata(buffer->heap);
-   unsigned long nr_pages = buffer->pagecount;
-   struct page *cma_pages = buffer->priv_virt;
+   struct cma_heap_buffer *buffer = dmabuf->priv;
+   struct dma_heap_attachment *a;
+   int ret;
 
-   /* free page list */
-   kfree(buffer->pages);
-   /* release memory */
-   cma_release(cma_heap->cma, cma_pages, nr_pages);
+   a = kzalloc(sizeof(*a), GFP_KERNEL);
+   if (!a)
+   return -ENOMEM;
+
+   ret = sg_alloc_table_from_pages(&a->table, buffer->pages,
+   buffer->pagecount, 0,
+   buffer->pagecount << PAGE_SHIFT,
+   GFP_KERNEL);
+   if (ret) {
+   kfree(a);
+   return ret;
+   }
+
+   a->dev = attachment->dev;
+   INIT_LIST_HEAD(&a->list);
+
+   attachment->priv = a;
+
+   mutex_lock(&buffer->lock);
+   list_add(&a->list, &buffer->attachments);
+   mutex_unlock(&buffer->lock);
+
+   return 0;
+}
+
+static void cma_heap_detach(struct dma_buf *dmabuf,
+   struct dma_buf_attachment *attachment)
+{
+   struct cma_heap_buffer *buffer = dmabuf->priv;
+   struct dma_heap_attachment *a = attachment->priv;
+
+   mutex_lock(&buffer->lock);
+   list_del(&a->list);
+   mutex_unlock(&buffer->lock);
+
+   sg_free_table(&a->table);
+   kfree(a);
+}
+
+static struct sg_table *cma_heap_map_dma_buf(struct dma_buf_attachment 
*attachment,
+enum dma_data_direction direction)
+{
+   struct dma_heap_attachment *a = attachment->priv;
+   struct sg_table *table = &a->table;
+   int ret;
+
+   ret = dma_map_sgtable(attachment->dev, table, direction, 0);
+   if (ret)
+   return ERR_PTR(-ENOMEM);
+   return table;
+}
+
+static void cma_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
+  struct sg_table *table,
+  enum dma_data_direction direction)
+{
+   dma_unmap_sgtable(attachment->dev, table, direction, 0);
+}
+
+static int cma_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
+enum dma_data_direction direction)
+{
+   struct cma_heap_buffer *buffer = dmabuf->priv;
+   struct dma_heap_attachment *a;

[PATCH v6 0/5] dma-buf: Code rework and performance improvements for system heap

2020-11-20 Thread John Stultz

Hey All,
  So just wanted to send another revision of my patch series
of performance optimizations to the dma-buf system heap, this
time against drm-misc-next.

This series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After which the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked. As more heaps show up I
think we'll have a better idea how to best share code, so for
now I think this is ok.

After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.

Finally, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION allocation code (though there still is a gap due
to ION using a mix of deferred-freeing and page pools, I'll be
looking at integrating those eventually).

This version of the series does not include the system-uncached
heap as Daniel wanted further demonstration that it is useful
with devices that use the mesa stack. I'm working on such a
justification but I don't want to hold up these rework patches
in the meantime.

thanks
-john

New in v6:
* Dropped the system-uncached heap submission for now
* Rebased onto drm-misc-next

Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Chris Goldsworthy 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org

John Stultz (5):
  dma-buf: system_heap: Rework system heap to use sgtables instead of
pagelists
  dma-buf: heaps: Move heap-helper logic into the cma_heap
implementation
  dma-buf: heaps: Remove heap-helpers code
  dma-buf: heaps: Skip sync if not mapped
  dma-buf: system_heap: Allocate higher order pages if available

 drivers/dma-buf/heaps/Makefile   |   1 -
 drivers/dma-buf/heaps/cma_heap.c | 325 +
 drivers/dma-buf/heaps/heap-helpers.c | 274 --
 drivers/dma-buf/heaps/heap-helpers.h |  53 
 drivers/dma-buf/heaps/system_heap.c  | 411 ---
 5 files changed, 640 insertions(+), 424 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

-- 
2.17.1

[PATCH] scsi: ufs: Adjust logic in common ADAPT helper

2020-11-20 Thread Bjorn Andersson

The introduction of ufshcd_dme_configure_adapt() refactored out
duplication from the Mediatek and Qualcomm drivers.

Both these implementations had the logic of:
gear_tx == UFS_HS_G4 => PA_INITIAL_ADAPT
gear_tx != UFS_HS_G4 => PA_NO_ADAPT

but now both implementations pass PA_INITIAL_ADAPT as "adapt_val" and if
gear_tx is not UFS_HS_G4 that is replaced with PA_INITIAL_ADAPT. In
other words, it's PA_INITIAL_ADAPT in both above cases.
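
The difference can be shown with a small stand-alone sketch. The enum names and numeric values below are placeholders chosen for illustration, not the definitions from the UFS headers, and `adapt_before_fix()` / `adapt_after_fix()` are hypothetical helpers that only model the gear check described above.

```c
/* Placeholder values for illustration; the real ones live in the UFS headers. */
enum model_gear  { MODEL_HS_G1 = 1, MODEL_HS_G2, MODEL_HS_G3, MODEL_HS_G4 };
enum model_adapt { MODEL_INITIAL_ADAPT = 1, MODEL_NO_ADAPT = 3 };

/* Behaviour of the common helper as described above: the override is a no-op. */
static int adapt_before_fix(int agreed_gear, int adapt_val)
{
	if (agreed_gear != MODEL_HS_G4)
		adapt_val = MODEL_INITIAL_ADAPT;
	return adapt_val;
}

/* Behaviour of the original drivers: gears other than HS-G4 get no ADAPT. */
static int adapt_after_fix(int agreed_gear, int adapt_val)
{
	if (agreed_gear != MODEL_HS_G4)
		adapt_val = MODEL_NO_ADAPT;
	return adapt_val;
}
```

With both callers passing the initial-ADAPT value, the first helper returns it for every gear, while the second returns the no-ADAPT value for anything below HS-G4.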

The result is that e.g. Qualcomm SM8150 no longer has functional UFS, so
adjust the logic to match the previous implementation.

Fixes: fc85a74e28fe ("scsi: ufs: Refactor ADAPT configuration function")
Signed-off-by: Bjorn Andersson 
---
 drivers/scsi/ufs/ufshcd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 52e077aa3efe..13281c74cb4f 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -3618,7 +3618,7 @@ int ufshcd_dme_configure_adapt(struct ufs_hba *hba,
int ret;
 
if (agreed_gear != UFS_HS_G4)
-   adapt_val = PA_INITIAL_ADAPT;
+   adapt_val = PA_NO_ADAPT;
 
ret = ufshcd_dme_set(hba,
 UIC_ARG_MIB(PA_TXHSADAPTTYPE),
-- 
2.28.0

Re: [PATCH] mm: memcontrol: account pagetables per node

2020-11-20 Thread kernel test robot

Hi Shakeel,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on driver-core/driver-core-testing]
[also build test WARNING on linus/master v5.10-rc4 next-20201120]
[cannot apply to mmotm/master cgroup/for-next hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Shakeel-Butt/mm-memcontrol-account-pagetables-per-node/20201121-102353
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 
33c0c9bdf7a59051a654cd98b7d2b48ce0080967
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/e51fa7d7d401d329238b2f8bc4d506a2ab1f5c67
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Shakeel-Butt/mm-memcontrol-account-pagetables-per-node/20201121-102353
git checkout e51fa7d7d401d329238b2f8bc4d506a2ab1f5c67
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:11,
from include/linux/skbuff.h:13,
from include/linux/netlink.h:7,
from include/uapi/linux/genetlink.h:6,
from include/linux/genetlink.h:5,
from include/net/genetlink.h:5,
from fs/dlm/netlink.c:6:
   include/linux/scatterlist.h: In function 'sg_set_buf':
   arch/m68k/include/asm/page_mm.h:169:49: warning: ordered comparison of 
pointer with null pointer [-Wextra]
 169 | #define virt_addr_valid(kaddr) ((void *)(kaddr) >= (void 
*)PAGE_OFFSET && (void *)(kaddr) < high_memory)
 | ^~
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
  78 | # define unlikely(x) __builtin_expect(!!(x), 0)
 |  ^
   include/linux/scatterlist.h:143:2: note: in expansion of macro 'BUG_ON'
 143 |  BUG_ON(!virt_addr_valid(buf));
 |  ^~
   include/linux/scatterlist.h:143:10: note: in expansion of macro 
'virt_addr_valid'
 143 |  BUG_ON(!virt_addr_valid(buf));
 |  ^~~
   In file included from include/linux/bvec.h:14,
from include/linux/skbuff.h:17,
from include/linux/netlink.h:7,
from include/uapi/linux/genetlink.h:6,
from include/linux/genetlink.h:5,
from include/net/genetlink.h:5,
from fs/dlm/netlink.c:6:
   fs/dlm/netlink.c: At top level:
>> include/linux/mm.h:2201:13: warning: 'inc_lruvec_page_state' used but never 
>> defined
2201 | static void inc_lruvec_page_state(struct page *page, enum 
node_stat_item idx);
 | ^
>> include/linux/mm.h:2202:13: warning: 'dec_lruvec_page_state' used but never 
>> defined
2202 | static void dec_lruvec_page_state(struct page *page, enum 
node_stat_item idx);
 | ^
--
   In file included from include/linux/pagemap.h:8,
from fs/dlm/debug_fs.c:11:
>> include/linux/mm.h:2201:13: warning: 'inc_lruvec_page_state' used but never 
>> defined
2201 | static void inc_lruvec_page_state(struct page *page, enum 
node_stat_item idx);
 | ^
>> include/linux/mm.h:2202:13: warning: 'dec_lruvec_page_state' used but never 
>> defined
2202 | static void dec_lruvec_page_state(struct page *page, enum 
node_stat_item idx);
 | ^

vim +/inc_lruvec_page_state +2201 include/linux/mm.h

  2200  
> 2201  static void inc_lruvec_page_state(struct page *page, enum 
> node_stat_item idx);
> 2202  static void dec_lruvec_page_state(struct page *page, enum 
> node_stat_item idx);
  2203  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [RFC PATCH 4/4] sched, rt: support schedstat for RT sched class

2020-11-20 Thread Yafang Shao

On Fri, Nov 20, 2020 at 10:39 AM jun qian  wrote:
>
> On Thu, Nov 19, 2020 at 11:55 AM Yafang Shao wrote:
> >
> > We want to measure the latency of RT tasks in our production
> > environment with the schedstat facility, but currently schedstat is only
> > supported for the fair sched class. This patch enables it for the RT
> > sched class as well.
> >
> > The schedstat statistics are defined in struct sched_entity, which is a
> > member of struct task_struct, so we can reuse it for the RT sched class.
> >
> > The schedstat usage in the RT sched class is similar to that of the fair
> > sched class, for example:
> >
> >                   fair                        RT
> > enqueue           update_stats_enqueue_fair   update_stats_enqueue_rt
> > dequeue           update_stats_dequeue_fair   update_stats_dequeue_rt
> > put_prev_task     update_stats_wait_start     update_stats_wait_start
> > set_next_task     update_stats_wait_end       update_stats_wait_end
> > show              /proc/[pid]/sched           /proc/[pid]/sched
> >
> > The sched:sched_stats_* tracepoints can be used to trace RT tasks as
> > well.
> >
> > Signed-off-by: Yafang Shao 
> > ---
> >  kernel/sched/rt.c| 61 
> >  kernel/sched/sched.h |  2 ++
> >  2 files changed, 63 insertions(+)
> >
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index b9ec886702a1..a318236b7166 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -1246,6 +1246,46 @@ void dec_rt_tasks(struct sched_rt_entity *rt_se, 
> > struct rt_rq *rt_rq)
> > dec_rt_group(rt_se, rt_rq);
> >  }
> >
>
> Should the deadline sched class be considered as well?
>

As I understand it, the deadline sched class can be supported as well,
but I think we can do that later.
This patchset only aims to support the RT sched class.

> thanks
>
> > +static inline void
> > +update_stats_enqueue_rt(struct rq *rq, struct sched_entity *se,
> > +   struct sched_rt_entity *rt_se, int flags)
> > +{
> > +   struct rt_rq *rt_rq = &rq->rt;
> > +
> > +   if (!schedstat_enabled())
> > +   return;
> > +
> > +   if (rt_se != rt_rq->curr)
> > +   update_stats_wait_start(rq, se);
> > +
> > +   if (flags & ENQUEUE_WAKEUP)
> > +   update_stats_enqueue_sleeper(rq, se);
> > +}
> > +
> > +static inline void
> > +update_stats_dequeue_rt(struct rq *rq, struct sched_entity *se,
> > +   struct sched_rt_entity *rt_se, int flags)
> > +{
> > +   struct rt_rq *rt_rq = &rq->rt;
> > +
> > +   if (!schedstat_enabled())
> > +   return;
> > +
> > +   if (rt_se != rt_rq->curr)
> > +   update_stats_wait_end(rq, se);
> > +
> > +   if ((flags & DEQUEUE_SLEEP) && rt_entity_is_task(rt_se)) {
> > +   struct task_struct *tsk = rt_task_of(rt_se);
> > +
> > +   if (tsk->state & TASK_INTERRUPTIBLE)
> > +   __schedstat_set(se->statistics.sleep_start,
> > +   rq_clock(rq));
> > +   if (tsk->state & TASK_UNINTERRUPTIBLE)
> > +   __schedstat_set(se->statistics.block_start,
> > +   rq_clock(rq));
> > +   }
> > +}
> > +
> >  /*
> >   * Change rt_se->run_list location unless SAVE && !MOVE
> >   *
> > @@ -1275,6 +1315,7 @@ static void __enqueue_rt_entity(struct 
> > sched_rt_entity *rt_se, unsigned int flag
> > struct rt_prio_array *array = &rt_rq->active;
> > struct rt_rq *group_rq = group_rt_rq(rt_se);
> > struct list_head *queue = array->queue + rt_se_prio(rt_se);
> > +   struct task_struct *task = rt_task_of(rt_se);
> >
> > /*
> >  * Don't enqueue the group if its throttled, or when empty.
> > @@ -1288,6 +1329,8 @@ static void __enqueue_rt_entity(struct 
> > sched_rt_entity *rt_se, unsigned int flag
> > return;
> > }
> >
> > +   update_stats_enqueue_rt(rq_of_rt_rq(rt_rq), &task->se, rt_se, flags);
> > +
> > if (move_entity(flags)) {
> > WARN_ON_ONCE(rt_se->on_list);
> > if (flags & ENQUEUE_HEAD)
> > @@ -1307,7 +1350,9 @@ static void __dequeue_rt_entity(struct 
> > sched_rt_entity *rt_se, unsigned int flag
> >  {
> > struct rt_rq *rt_rq = rt_rq_of_se(rt_se);
> > struct rt_prio_array *array = &rt_rq->active;
> > +   struct task_struct *task = rt_task_of(rt_se);
> >
> > +   update_stats_dequeue_rt(rq_of_rt_rq(rt_rq), &task->se, rt_se, flags);
> > if (move_entity(flags)) {
> > WARN_ON_ONCE(!rt_se->on_list);
> > __delist_rt_entity(rt_se, array);
> > @@ -1374,6 +1419,7 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, 
> > int flags)
> > if (flags & ENQUEUE_WAKEUP)
> > rt_se->timeout = 0;
> >
> > +   check_schedstat_required();
> > enqueue_rt_entity(rt_se, flags);
> >
> > if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
> > @@

RE: [PATCH v3 4/4] Documentation/admin-guide: Change doc for split_lock_detect parameter

2020-11-20 Thread Yu, Fenghua

Hi, Randy,

> >>> +   for bus lock detection. 0 < N <= HZ/2 and
> >>> +   N is approximate. Only applied to non-root
> >>> +   users.
> >>
> >> Sorry, but I don't know what this means. I think it's the "and N is
> >> appropriate"
> >> that is confusing me.
> >>
> >>0 < N <= HZ/2 and N is appropriate.
> >
> > You are right. I will remove "and N is appropriate" in the next version.
> >
> > Could you please ack this patch? Can I add Acked-by from you in the
> updated patch?
> >
> > Thank you very much for your review!
> 
> Sure, no problem.
> 
> Acked-by: Randy Dunlap 

Really appreciate your review!

-Fenghua

Re: [PATCH 3/5] arm64: dts: sdm845: add oneplus 6/t devices

2020-11-20 Thread Bjorn Andersson

On Thu 12 Nov 10:21 CST 2020, Caleb Connolly wrote:
[..]
> diff --git a/arch/arm64/boot/dts/qcom/sdm845-oneplus-common.dtsi 
> b/arch/arm64/boot/dts/qcom/sdm845-oneplus-common.dtsi
> new file mode 100644
> index ..4e6477f1e574
> --- /dev/null
> +++ b/arch/arm64/boot/dts/qcom/sdm845-oneplus-common.dtsi
> @@ -0,0 +1,822 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SDM845 OnePlus 6(T) (enchilada / fajita) common device tree source
> + *
> + * Copyright (c) 2020, The Linux Foundation. All rights reserved.
> + */
> +
> +/dts-v1/;
> +
> +#include 
> +#include 
> +#include 

Please keep these sorted alphabetically.

> +#include "sdm845.dtsi"
> +
> +// Needed for some GPIO (like the volume buttons)

This is or is going to be needed for more things, so feel free to skip
this comment.

> +#include "pm8998.dtsi"
> +#include "pmi8998.dtsi"
> +
> +/ {
> +
> + aliases {
> + hsuart0 = 
> + };
> +
> + vph_pwr: vph-pwr-regulator {
> + compatible = "regulator-fixed";
> + regulator-name = "vph_pwr";
> + regulator-min-microvolt = <370>;
> + regulator-max-microvolt = <370>;
> + };
> +
> + /*
> +  * Apparently RPMh does not provide support for PM8998 S4 because it
> +  * is always-on; model it as a fixed regulator.
> +  */
> + vreg_s4a_1p8: pm8998-smps4 {
> + compatible = "regulator-fixed";
> + regulator-name = "vreg_s4a_1p8";
> +
> + regulator-min-microvolt = <180>;
> + regulator-max-microvolt = <180>;
> +
> + regulator-always-on;
> + regulator-boot-on;
> +
> + vin-supply = <_pwr>;
> + };
> +
> + /*
> +  * The touchscreen regulator seems to be controlled somehow by a gpio.
> +  */
> + ts_1p8_supply: ts_1v8_regulator {

Please don't use _ in the node name.

> + compatible = "regulator-fixed";
> + regulator-name = "ts_1p8_supply";
> +
> + regulator-min-microvolt = <180>;
> + regulator-max-microvolt = <180>;
> +
> + gpio = < 88 0>;
> + enable-active-high;
> + regulator-boot-on;
> + };
> +
> + gpio_tristate_key: gpio-keys {
> + compatible = "gpio-keys";
> + label = "Tri-state keys";

What kind of button is this?

> +
> + pinctrl-names = "default";
> + pinctrl-0 = <_state_key_default>;
> +
> + state-top {
> + label = "Tri-state key top";
> + linux,code = ;
> + interrupt-parent = <>;
> + interrupts = <24 IRQ_TYPE_EDGE_FALLING>;
> + debounce-interval = <500>;
> + linux,can-disable;
> + };
> +
> + state-middle {
> + label = "Tri-state key middle";
> + linux,code = ;
> + interrupt-parent = <>;
> + interrupts = <52 IRQ_TYPE_EDGE_FALLING>;
> + debounce-interval = <500>;
> + linux,can-disable;
> + };
> +
> + state-bottom {
> + label = "Tri-state key bottom";
> + linux,code = ;
> + interrupt-parent = <>;
> + interrupts = <126 IRQ_TYPE_EDGE_FALLING>;
> + debounce-interval = <500>;
> + linux,can-disable;
> + };
> + };
[..]
> +/* Reserved memory changes */
> +/delete-node/ _mem;
> +
> +/ {

You already have one top-level section higher up, please group this in
there as well.

> + reserved-memory {
[..]
> + {

To avoid trouble finding your way around this file in the future I would
prefer if you sorted the nodes alphabetically.

> + status = "okay";
> +};
> +
[..]
> + {
> + status = "okay";
> +
> + touchscreen: synaptics-rmi4-i2c@20 {

You don't reference , so please omit this..

Regards,
Bjorn

Re: [PATCH v12 00/15] Add RCEC handling to PCI/AER

2020-11-20 Thread Bjorn Helgaas

On Fri, Nov 20, 2020 at 04:10:21PM -0800, Sean V Kelley wrote:
> Changes since v11 [1] and based on pci/master tree [2]:
> 
> - No functional changes. Tested with aer injection.
> 
> - Merge RCEC class code and extended capability patch with usage.
> - Apply same optimization for pci_pcie_type(dev) calls in
> drivers/pci/pcie/portdrv_pci.c and drivers/pci/pcie/aer.c.
> (Kuppuswamy Sathyanarayanan)
> 
> [1] 
> https://lore.kernel.org/linux-pci/20201117191954.1322844-1-sean.v.kel...@intel.com/
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/
> 
> 
> Root Complex Event Collectors (RCEC) provide support for terminating error
> and PME messages from Root Complex Integrated Endpoints (RCiEPs).  An RCEC
> resides on a Bus in the Root Complex. Multiple RCECs can in fact reside on
> a single bus. An RCEC will explicitly declare supported RCiEPs through the
> Root Complex Endpoint Association Extended Capability.
> 
> (See PCIe 5.0-1, sections 1.3.2.3 (RCiEP), and 7.9.10 (RCEC Ext. Cap.))
> 
> The kernel lacks handling for these RCECs and the error messages received
> from their respective associated RCiEPs. More recently, a new CPU
> interconnect, Compute eXpress Link (CXL) depends on RCEC capabilities for
> purposes of error messaging from CXL 1.1 supported RCiEP devices.
> 
> DocLink: https://www.computeexpresslink.org/
> 
> This use case is not limited to CXL. Existing hardware today includes
> support for RCECs, such as the Denverton microserver product
> family. Future hardware will be forthcoming.
> 
> (See Intel Document, Order number: 33061-003US)
> 
> So services such as AER or PME could be associated with an RCEC driver.
> In the case of CXL, if an RCiEP (i.e., CXL 1.1 device) is associated with a
> platform's RCEC it shall signal PME and AER error conditions through that
> RCEC.
> 
> Towards the above use cases, add the missing RCEC class and extend the
> PCIe Root Port and service drivers to allow association of RCiEPs to their
> respective parent RCEC and facilitate handling of terminating error and PME
> messages.
> 
> Tested-by: Jonathan Cameron  #non-native/no RCEC
> 
> 
> Qiuxu Zhuo (3):
>   PCI/RCEC: Bind RCEC devices to the Root Port driver
>   PCI/RCEC: Add RCiEP's linked RCEC to AER/ERR
>   PCI/AER: Add RCEC AER error injection support
> 
> Sean V Kelley (12):
>   AER: aer_root_reset() non-native handling
>   PCI/RCEC: Cache RCEC capabilities in pci_init_capabilities()
>   PCI/ERR: Rename reset_link() to reset_subordinates()
>   PCI/ERR: Simplify by using pci_upstream_bridge()
>   PCI/ERR: Simplify by computing pci_pcie_type() once
>   PCI/ERR: Use "bridge" for clarity in pcie_do_recovery()
>   PCI/ERR: Avoid negated conditional for clarity
>   PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery()
>   PCI/ERR: Limit AER resets in pcie_do_recovery()
>   PCI/RCEC: Add pcie_link_rcec() to associate RCiEPs
>   PCI/AER: Add pcie_walk_rcec() to RCEC AER handling
>   PCI/PME: Add pcie_walk_rcec() to RCEC PME handling
> 
>  drivers/pci/pci.h   |  29 -
>  drivers/pci/pcie/Makefile   |   2 +-
>  drivers/pci/pcie/aer.c  |  89 +++
>  drivers/pci/pcie/aer_inject.c   |   5 +-
>  drivers/pci/pcie/err.c  |  93 +++-
>  drivers/pci/pcie/pme.c  |  16 ++-
>  drivers/pci/pcie/portdrv_core.c |   9 +-
>  drivers/pci/pcie/portdrv_pci.c  |  13 ++-
>  drivers/pci/pcie/rcec.c | 190 
>  drivers/pci/probe.c |   2 +
>  include/linux/pci.h |   5 +
>  include/linux/pci_ids.h |   1 +
>  include/uapi/linux/pci_regs.h   |   7 ++
>  13 files changed, 393 insertions(+), 68 deletions(-)
>  create mode 100644 drivers/pci/pcie/rcec.c

Good timing, I was just tidying up v11 :)

Anyway, I applied this to pci/err for v5.11, thanks!

Now I see a Tested-by from Jonathan above; this cover letter doesn't
become part of the git history, so probably I should add that to each
individual patch, or maybe just the relevant ones if there are some
that it wouldn't apply to.  I'll tidy that up next week.

Minor procedural things I fixed up already because I think they're a
consequence of you building on a previous branch I published: patches
you post shouldn't include Signed-off-by; you should add your own when
you write the patch or are part of the delivery path, but you
shouldn't add *mine*.  I add that when I apply them.

I also removed the first Link: tags since they also look like they're
from an older version.  You don't need to add those; I add those
automatically so they point to the mailing list message where the
patch was posted.

[PATCH v2 5/5] locking/rwsem: Remove reader optimistic spinning

2020-11-20 Thread Waiman Long

Reader optimistic spinning is helpful when the reader critical section
is short and there aren't that many readers around. It also improves
the chance that a reader can get the lock as writer optimistic spinning
disproportionally favors writers much more than readers.

Since commit d3681e269fff ("locking/rwsem: Wake up almost all readers
in wait queue"), all the waiting readers are woken up so that they can
all get the read lock and run in parallel. When the number of contending
readers is large, allowing reader optimistic spinning will likely cause
reader fragmentation where multiple smaller groups of readers can get
the read lock in a sequential manner separated by writers. That reduces
reader parallelism.

One possible way to address that drawback is to limit the number of
readers (preferably one) that can do optimistic spinning. These readers
act as representatives of all the waiting readers in the wait queue as
they will wake up all those waiting readers once they get the lock.

Alternatively, as reader optimistic lock stealing has already enhanced
fairness to readers, it may be easier to just remove reader optimistic
spinning and simplify the optimistic spinning code as a result.

Performance measurements (locking throughput in kops/s) using a locking
microbenchmark with a 50/50 reader/writer distribution and turbo-boost
disabled were done on a 2-socket Cascade Lake system (48-core 96-thread)
to see the impact of these changes:

  1) Vanilla     - 5.10-rc3 kernel
  2) Before      - 5.10-rc3 kernel with previous patches in this series
  3) limit-rspin - 5.10-rc3 kernel with limited reader spinning patch
  4) no-rspin    - 5.10-rc3 kernel with reader spinning disabled

  # of threads  CS Load   Vanilla   Before   limit-rspin   no-rspin
  ------------  -------   -------   ------   -----------   --------
             2        1     5,185    5,662         5,214      5,077
             4        1     5,107    4,983         5,188      4,760
             8        1     4,782    4,564         4,720      4,628
            16        1     4,680    4,053         4,567      3,402
            32        1     4,299    1,115         1,118      1,098
            64        1     3,218      983         1,001        957
            96        1     1,938      944           957        930

             2       20     2,008    2,128         2,264      1,665
             4       20     1,390    1,033         1,046      1,101
             8       20     1,472    1,155         1,098      1,213
            16       20     1,332    1,077         1,089      1,122
            32       20       967      914           917        980
            64       20       787      874           891        858
            96       20       730      836           847        844

             2      100       372      356           360        355
             4      100       492      425           434        392
             8      100       533      537           529        538
            16      100       548      572           568        598
            32      100       499      520           527        537
            64      100       466      517           526        512
            96      100       406      497           506        509

The column "CS Load" represents the number of pause instructions issued
in the locking critical section. A CS load of 1 is extremely short and
is not likely in real situations. Loads of 20 (moderate) and 100 (long)
are more realistic.

It can be seen that the previous patches in this series have reduced
performance in general, except in highly contended cases with moderate
or long critical sections, where performance improves a bit. This change
is mostly caused by the "Prevent potential lock starvation" patch, which
reduces reader optimistic spinning and hence reader fragmentation.

The patch that further limits reader optimistic spinning doesn't seem to
have much impact on overall performance, as shown in the benchmark
data.

The patch that disables reader optimistic spinning shows reduced
performance in lightly loaded cases, but comparable or slightly better
performance with heavier contention.

This patch just removes reader optimistic spinning for now. As readers
are not going to do optimistic spinning anymore, we don't need to
consider if the OSQ is empty or not when doing lock stealing.

Signed-off-by: Waiman Long 
---
 kernel/locking/lock_events_list.h |   5 +-
 kernel/locking/rwsem.c| 278 +-
 2 files changed, 48 insertions(+), 235 deletions(-)

diff --git a/kernel/locking/lock_events_list.h 
b/kernel/locking/lock_events_list.h
index 270a0d351932..97fb6f3f840a 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -56,12 +56,9 @@ LOCK_EVENT(rwsem_sleep_reader)   /* # of reader sleeps   
*/
 LOCK_EVENT(rwsem_sleep_writer) /* # of writer sleeps   */
 LOCK_EVENT(rwsem_wake_reader)  /* # of reader wakeups  */

[PATCH v2 4/5] locking/rwsem: Wake up all waiting readers if RWSEM_WAKE_READ_OWNED

2020-11-20 Thread Waiman Long

The rwsem wakeup logic has been modified by commit d3681e269fff
("locking/rwsem: Wake up almost all readers in wait queue") to wake up
all readers in the wait queue if the first waiter is a reader. This
change was made to implement a phase-fair reader/writer lock. Once a
reader gets the lock, all the current waiting readers will be allowed
to join. Other readers that come after that will not be allowed to
join, so as to prevent writer starvation.

In the case of RWSEM_WAKE_READ_OWNED, not all currently waiting readers
can be woken up if the first waiter happens to be a writer. Complete
the phase-fair logic by waking up all readers even for this case.
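
For readers following along, the wakeup rule described above can be
sketched in plain userspace C. This is only an illustrative model, not
the kernel's rwsem_mark_wake(): the toy_* names are hypothetical, and
the real function also caps the number of readers woken and adjusts the
count and HANDOFF bit, all of which is left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's waiter and wake types. */
enum toy_waiter_type { TOY_WAITER_WRITER, TOY_WAITER_READER };
enum toy_wake_type { TOY_WAKE_ANY, TOY_WAKE_READERS, TOY_WAKE_READ_OWNED };

/*
 * Mark which waiters in the queue would be woken and return how many.
 * Simplified model: no wakeup cap, no count or HANDOFF bookkeeping.
 */
static size_t toy_mark_wake(const enum toy_waiter_type *queue, size_t n,
                            enum toy_wake_type wake_type, bool *woken)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        woken[i] = false;
    if (n == 0)
        return 0;

    if (queue[0] == TOY_WAITER_WRITER) {
        /* A writer at the head is only woken for TOY_WAKE_ANY. */
        if (wake_type == TOY_WAKE_ANY) {
            woken[0] = true;
            return 1;
        }
        /*
         * With the change described above, a reader-owned lock keeps
         * going and wakes the readers queued behind the writer.
         */
        if (wake_type != TOY_WAKE_READ_OWNED)
            return 0;
    }

    /* Wake every waiting reader, skipping over writers. */
    for (i = 0; i < n; i++) {
        if (queue[i] == TOY_WAITER_READER) {
            woken[i] = true;
            count++;
        }
    }
    return count;
}
```

With a queue of writer, reader, reader, writer, reader, the
TOY_WAKE_READ_OWNED case marks the three readers, while TOY_WAKE_READERS
marks nobody because the head of the queue is a writer.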

Signed-off-by: Waiman Long 
---
 kernel/locking/rwsem.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index b373990fcab8..e0ad2019c518 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -404,6 +404,7 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
struct rwsem_waiter *waiter, *tmp;
long oldcount, woken = 0, adjustment = 0;
struct list_head wlist;
+   bool first_is_reader = true;
 
lockdep_assert_held(&sem->wait_lock);
 
@@ -426,7 +427,13 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
lockevent_inc(rwsem_wake_writer);
}
 
-   return;
+   /*
+* If rwsem has already been owned by reader, wake up other
+* readers in the wait queue even if first one is a writer.
+*/
+   if (wake_type != RWSEM_WAKE_READ_OWNED)
+   return;
+   first_is_reader = false;
}
 
/*
@@ -520,10 +527,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
}
 
/*
-* When we've woken a reader, we no longer need to force writers
-* to give up the lock and we can clear HANDOFF.
+* When readers are woken, we no longer need to force writers to
+* give up the lock and we can clear HANDOFF unless the first
+* waiter is a writer.
 */
-   if (woken && (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF))
+   if (woken && first_is_reader &&
+  (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF))
adjustment -= RWSEM_FLAG_HANDOFF;
 
if (adjustment)
@@ -1053,8 +1062,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, int 
state, long count)
if (rwsem_optimistic_spin(sem, false)) {
/* rwsem_optimistic_spin() implies ACQUIRE on success */
/*
-* Wake up other readers in the wait list if the front
-* waiter is a reader.
+* Wake up other readers in the wait queue.
 */
 wake_readers:
if ((atomic_long_read(&sem->count) & RWSEM_FLAG_WAITERS)) {
-- 
2.18.1

[PATCH v2 2/5] locking/rwsem: Prevent potential lock starvation

2020-11-20 Thread Waiman Long

The lock handoff bit was added in commit 4f23dbc1e657 ("locking/rwsem:
Implement lock handoff to prevent lock starvation") to avoid lock
starvation. However, allowing readers to do optimistic spinning does
introduce an unlikely scenario where lock starvation can happen.

The lock handoff bit may only be set when a waiter is being woken up.
In the case of reader unlock, wakeup happens only when the reader count
reaches 0. If there is a continuous stream of incoming readers acquiring
read lock via optimistic spinning, it is possible that the reader count
may never reach 0 and so the handoff bit will never be asserted.

One way to prevent this scenario from happening is to disallow optimistic
spinning if the rwsem is currently owned by readers. If the previous
or current owner is a writer, optimistic spinning will be allowed.

If the previous owner is a reader but the reader count has reached 0
before, a wakeup should have been issued. So the handoff mechanism
will kick in to prevent lock starvation. As a result, it should
be OK to do optimistic spinning in this case.

This patch may have some impact on reader performance as it reduces
reader optimistic spinning, especially if the lock critical sections
are short and the number of contending readers is small.
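
The rule for skipping optimistic spinning can be written as a small
standalone predicate. This is a sketch under assumptions: the bit
values below are modelled on kernel/locking/rwsem.c as of this series
and should be checked against the tree, and toy_reader_must_queue() is
a hypothetical name, not a kernel function.

```c
#include <stdbool.h>

/* Assumed layout of the count word (low bits are flags). */
#define RWSEM_WRITER_LOCKED  (1L << 0)
#define RWSEM_READER_SHIFT   8
#define RWSEM_READER_BIAS    (1L << RWSEM_READER_SHIFT)

/* Assumed flag in the owner word: the lock is held by readers. */
#define RWSEM_READER_OWNED   (1L << 0)

/*
 * Return true when an incoming reader should go straight to the wait
 * queue instead of spinning. "count" already includes this reader's
 * own bias, so rcnt > 1 means other readers hold the lock.
 */
static bool toy_reader_must_queue(long owner, long count)
{
    long rcnt = count >> RWSEM_READER_SHIFT;

    return (owner & RWSEM_READER_OWNED) && rcnt > 1 &&
           !(count & RWSEM_WRITER_LOCKED);
}
```

A count of 3 * RWSEM_READER_BIAS with a reader-owned lock sends the
newcomer to the queue; a count of exactly RWSEM_READER_BIAS (only the
newcomer itself) or a set writer bit leaves spinning allowed.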

Signed-off-by: Waiman Long 
---
 kernel/locking/rwsem.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 12761e02ab9b..a961c5c53b70 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -991,16 +991,27 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned 
long nonspinnable)
 static struct rw_semaphore __sched *
 rwsem_down_read_slowpath(struct rw_semaphore *sem, int state, long count)
 {
-   long adjustment = -RWSEM_READER_BIAS;
+   long owner, adjustment = -RWSEM_READER_BIAS;
+   long rcnt = (count >> RWSEM_READER_SHIFT);
struct rwsem_waiter waiter;
DEFINE_WAKE_Q(wake_q);
bool wake = false;
 
+   /*
+* To prevent a constant stream of readers from starving a sleeping
+* waiter, don't attempt optimistic spinning if the lock is currently
+* owned by readers.
+*/
+   owner = atomic_long_read(&sem->owner);
+   if ((owner & RWSEM_READER_OWNED) && (rcnt > 1) &&
+  !(count & RWSEM_WRITER_LOCKED))
+   goto queue;
+
/*
 * Save the current read-owner of rwsem, if available, and the
 * reader nonspinnable bit.
 */
-   waiter.last_rowner = atomic_long_read(&sem->owner);
+   waiter.last_rowner = owner;
if (!(waiter.last_rowner & RWSEM_READER_OWNED))
waiter.last_rowner &= RWSEM_RD_NONSPINNABLE;
 
-- 
2.18.1

[PATCH v2 1/5] locking/rwsem: Pass the current atomic count to rwsem_down_read_slowpath()

2020-11-20 Thread Waiman Long

The atomic count value right after reader count increment can be useful
to determine the rwsem state at trylock time. So the count value is
passed down to rwsem_down_read_slowpath() to be used when appropriate.
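
As a rough illustration of why returning the count is useful, here is a
userspace sketch built on C11 atomics. The struct toy_rwsem type and
the toy_* helpers are hypothetical, and the mask values are assumptions
modelled on kernel/locking/rwsem.c; the point is only that the caller
can test the failure mask and also decode the reader count from the
same snapshot.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed bit layout of the count word. */
#define RWSEM_WRITER_LOCKED     (1L << 0)
#define RWSEM_FLAG_WAITERS      (1L << 1)
#define RWSEM_FLAG_HANDOFF      (1L << 2)
#define RWSEM_FLAG_READFAIL     LONG_MIN   /* sign bit */
#define RWSEM_READER_SHIFT      8
#define RWSEM_READER_BIAS       (1L << RWSEM_READER_SHIFT)
#define RWSEM_READ_FAILED_MASK  (RWSEM_WRITER_LOCKED | RWSEM_FLAG_WAITERS | \
                                 RWSEM_FLAG_HANDOFF | RWSEM_FLAG_READFAIL)

struct toy_rwsem {
    atomic_long count;
};

/* Add the reader bias and return the resulting count, not a bool. */
static long toy_read_trylock(struct toy_rwsem *sem)
{
    /* fetch_add returns the old value, so add the bias to get the new one. */
    return atomic_fetch_add_explicit(&sem->count, RWSEM_READER_BIAS,
                                     memory_order_acquire) + RWSEM_READER_BIAS;
}

/* The fast path fails if any bit in the failure mask is set. */
static bool toy_needs_slowpath(long count)
{
    return (count & RWSEM_READ_FAILED_MASK) != 0;
}

/* Number of readers (including the caller) encoded in the snapshot. */
static long toy_reader_count(long count)
{
    return count >> RWSEM_READER_SHIFT;
}
```

A caller would take count = toy_read_trylock(sem), branch on
toy_needs_slowpath(count), and hand the same count to its slow path so
that later decisions are based on the state seen at trylock time.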

Signed-off-by: Waiman Long 
---
 kernel/locking/rwsem.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index f11b9bd3431d..12761e02ab9b 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -270,12 +270,12 @@ static inline void rwsem_set_nonspinnable(struct 
rw_semaphore *sem)
  owner | RWSEM_NONSPINNABLE));
 }
 
-static inline bool rwsem_read_trylock(struct rw_semaphore *sem)
+static inline long rwsem_read_trylock(struct rw_semaphore *sem)
 {
long cnt = atomic_long_add_return_acquire(RWSEM_READER_BIAS, &sem->count);
if (WARN_ON_ONCE(cnt < 0))
rwsem_set_nonspinnable(sem);
-   return !(cnt & RWSEM_READ_FAILED_MASK);
+   return cnt;
 }
 
 /*
@@ -989,9 +989,9 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long 
nonspinnable)
  * Wait for the read lock to be granted
  */
 static struct rw_semaphore __sched *
-rwsem_down_read_slowpath(struct rw_semaphore *sem, int state)
+rwsem_down_read_slowpath(struct rw_semaphore *sem, int state, long count)
 {
-   long count, adjustment = -RWSEM_READER_BIAS;
+   long adjustment = -RWSEM_READER_BIAS;
struct rwsem_waiter waiter;
DEFINE_WAKE_Q(wake_q);
bool wake = false;
@@ -1337,8 +1337,10 @@ static struct rw_semaphore *rwsem_downgrade_wake(struct 
rw_semaphore *sem)
  */
 static inline void __down_read(struct rw_semaphore *sem)
 {
-   if (!rwsem_read_trylock(sem)) {
-   rwsem_down_read_slowpath(sem, TASK_UNINTERRUPTIBLE);
+   long count = rwsem_read_trylock(sem);
+
+   if (count & RWSEM_READ_FAILED_MASK) {
+   rwsem_down_read_slowpath(sem, TASK_UNINTERRUPTIBLE, count);
DEBUG_RWSEMS_WARN_ON(!is_rwsem_reader_owned(sem), sem);
} else {
rwsem_set_reader_owned(sem);
@@ -1347,8 +1349,10 @@ static inline void __down_read(struct rw_semaphore *sem)
 
 static inline int __down_read_killable(struct rw_semaphore *sem)
 {
-   if (!rwsem_read_trylock(sem)) {
-   if (IS_ERR(rwsem_down_read_slowpath(sem, TASK_KILLABLE)))
+   long count = rwsem_read_trylock(sem);
+
+   if (count & RWSEM_READ_FAILED_MASK) {
+   if (IS_ERR(rwsem_down_read_slowpath(sem, TASK_KILLABLE, count)))
return -EINTR;
DEBUG_RWSEMS_WARN_ON(!is_rwsem_reader_owned(sem), sem);
} else {
-- 
2.18.1

[PATCH v2 3/5] locking/rwsem: Enable reader optimistic lock stealing

2020-11-20 Thread Waiman Long

If the optimistic spinning queue is empty and the rwsem does not have
the handoff or write-lock bits set, it is actually not necessary to
call rwsem_optimistic_spin() to spin on it. Instead, it can steal the
lock directly as its reader bias is in the count already.  If it is
the first reader in this state, it will try to wake up other readers
in the wait queue.
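
The stealing decision can be modelled as a pure function of the count
snapshot and the state of the spinning queue. This is a hedged sketch:
the bit values are assumptions modelled on kernel/locking/rwsem.c, and
the enum and toy_try_steal() are hypothetical names used only for
illustration.

```c
#include <stdbool.h>

/* Assumed bit layout of the count word. */
#define RWSEM_WRITER_LOCKED  (1L << 0)
#define RWSEM_FLAG_HANDOFF   (1L << 2)
#define RWSEM_READER_SHIFT   8
#define RWSEM_READER_BIAS    (1L << RWSEM_READER_SHIFT)

enum toy_action {
    TOY_SPIN_OR_QUEUE,   /* conditions not met, fall through to the usual path */
    TOY_STEAL,           /* take the read lock directly */
    TOY_STEAL_AND_WAKE,  /* take it and, as first reader, wake queued readers */
};

/*
 * "count" already contains this reader's bias. Stealing is allowed only
 * if no writer holds the lock, no handoff is pending, and nobody is in
 * the optimistic spinning queue.
 */
static enum toy_action toy_try_steal(long count, bool osq_locked)
{
    long rcnt = count >> RWSEM_READER_SHIFT;

    if ((count & (RWSEM_WRITER_LOCKED | RWSEM_FLAG_HANDOFF)) || osq_locked)
        return TOY_SPIN_OR_QUEUE;

    return rcnt == 1 ? TOY_STEAL_AND_WAKE : TOY_STEAL;
}
```

The reader bias is already in the count when this check runs, so a
successful steal needs no further atomic update of the count word.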

With this patch applied, the following were the lock event counts
after rebooting a 2-socket system and a "make -j96" kernel rebuild.

  rwsem_opt_rlock=4437
  rwsem_rlock=29
  rwsem_rlock_steal=19

So lock stealing represents about 0.4% of all the read locks acquired
in the slow path.
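
The 0.4% figure can be checked with a few lines of C. The sketch
assumes the denominator is the sum of the three counters listed above
(the message does not spell this out), and steal_share_percent() is a
hypothetical helper name.

```c
/*
 * Share of slow-path read locks obtained by lock stealing, in percent.
 * Assumption: the total is opt_rlock + rlock + steal.
 */
static double steal_share_percent(unsigned long opt_rlock,
                                  unsigned long rlock,
                                  unsigned long steal)
{
    unsigned long total = opt_rlock + rlock + steal;

    return total ? 100.0 * (double)steal / (double)total : 0.0;
}
```

With the counts quoted above (4437, 29 and 19) the result is a little
over 0.42, which rounds to the stated 0.4%.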

Signed-off-by: Waiman Long 
---
 kernel/locking/lock_events_list.h |  1 +
 kernel/locking/rwsem.c| 28 
 2 files changed, 29 insertions(+)

diff --git a/kernel/locking/lock_events_list.h 
b/kernel/locking/lock_events_list.h
index 239039d0ce21..270a0d351932 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -63,6 +63,7 @@ LOCK_EVENT(rwsem_opt_nospin)  /* # of disabled optspins   
*/
 LOCK_EVENT(rwsem_opt_norspin)  /* # of disabled reader-only optspins   */
 LOCK_EVENT(rwsem_opt_rlock2)   /* # of opt-acquired 2ndary read locks  */
 LOCK_EVENT(rwsem_rlock)/* # of read locks acquired 
*/
+LOCK_EVENT(rwsem_rlock_steal)  /* # of read locks by lock stealing */
 LOCK_EVENT(rwsem_rlock_fast)   /* # of fast read locks acquired*/
 LOCK_EVENT(rwsem_rlock_fail)   /* # of failed read lock acquisitions   */
 LOCK_EVENT(rwsem_rlock_handoff)/* # of read lock handoffs  
*/
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index a961c5c53b70..b373990fcab8 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -957,6 +957,12 @@ static inline bool rwsem_reader_phase_trylock(struct 
rw_semaphore *sem,
}
return false;
 }
+
+static inline bool rwsem_no_spinners(struct rw_semaphore *sem)
+{
+   return !osq_is_locked(&sem->osq);
+}
+
 #else
 static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem,
   unsigned long nonspinnable)
@@ -977,6 +983,11 @@ static inline bool rwsem_reader_phase_trylock(struct 
rw_semaphore *sem,
return false;
 }
 
+static inline bool rwsem_no_spinners(struct rw_semaphore *sem)
+{
+   return false;
+}
+
 static inline int
 rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
 {
@@ -1007,6 +1018,22 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, int 
state, long count)
   !(count & RWSEM_WRITER_LOCKED))
goto queue;
 
+   /*
+* Reader optimistic lock stealing
+*
+* We can take the read lock directly without doing
+* rwsem_optimistic_spin() if the conditions are right.
+* Also wake up other readers if it is the first reader.
+*/
+   if (!(count & (RWSEM_WRITER_LOCKED | RWSEM_FLAG_HANDOFF)) &&
+   rwsem_no_spinners(sem)) {
+   rwsem_set_reader_owned(sem);
+   lockevent_inc(rwsem_rlock_steal);
+   if (rcnt == 1)
+   goto wake_readers;
+   return sem;
+   }
+
/*
 * Save the current read-owner of rwsem, if available, and the
 * reader nonspinnable bit.
@@ -1029,6 +1056,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, int 
state, long count)
 * Wake up other readers in the wait list if the front
 * waiter is a reader.
 */
+wake_readers:
if ((atomic_long_read(&sem->count) & RWSEM_FLAG_WAITERS)) {
raw_spin_lock_irq(&sem->wait_lock);
if (!list_empty(&sem->wait_list))
-- 
2.18.1

[PATCH v2 0/5] locking/rwsem: Rework reader optimistic spinning

2020-11-20 Thread Waiman Long

 v2:
  - Update some commit logs to incorporate review comments.
  - Patch 2: remove unnecessary comment.
  - Patch 3: rename osq_is_empty() to rwsem_no_spinners() as suggested.
  - Patch 4: correctly handle HANDOFF clearing.
  - Patch 5: fix !CONFIG_RWSEM_SPIN_ON_OWNER compilation errors.

A recent report of an SAP certification failure caused by increased system
time due to rwsem reader optimistic spinning led me to reexamine the
code to see the pros and cons of doing it. This led me to discover a
potential lock starvation scenario as explained in patch 2. That patch
does reduce reader spinning to avoid this potential problem. Patches
3 and 4 are further optimizations of the current code.

Then there is the issue of reader fragmentation that can potentially
reduce performance in some heavily contended cases. Two different
approaches are attempted:
 1) further reduce reader optimistic spinning
 2) disable reader spinning

See the performance shown in patch 5.

This patch series adopts the second approach by dropping reader spinning
for now as it simplifies the code. However, writers are still allowed
to spin on a reader-owned rwsem for a limited time.

Waiman Long (5):
  locking/rwsem: Pass the current atomic count to
rwsem_down_read_slowpath()
  locking/rwsem: Prevent potential lock starvation
  locking/rwsem: Enable reader optimistic lock stealing
  locking/rwsem: Wake up all waiting readers if RWSEM_WAKE_READ_OWNED
  locking/rwsem: Remove reader optimistic spinning

 kernel/locking/lock_events_list.h |   6 +-
 kernel/locking/rwsem.c| 293 --
 2 files changed, 82 insertions(+), 217 deletions(-)

-- 
2.18.1

Re: [PATCH 1/1] ktest.pl: Fix incorrect reboot for grub2bls

2020-11-20 Thread Steven Rostedt

On Fri, 20 Nov 2020 18:12:43 -0800
Libo Chen  wrote:

> This issue was first noticed when I was testing different kernels on
> Oracle Linux 8 which, like Fedora 30+, adopts BLS by default. Even though a
> kernel entry was added successfully and the index of that kernel entry was
> retrieved correctly, ktest still wouldn't reboot the system into
> user-specified kernel.
> 
> The bug was spotted in subroutine reboot_to where the if-statement never
> checks for REBOOT_TYPE "grub2bls", therefore the desired entry will not be
> set for the next boot.
> 
> Add a check for "grub2bls" so that $grub_reboot $grub_number is run
> before a reboot when REBOOT_TYPE is "grub2bls", which lets us boot into
> the correct kernel.
> 
> Fixes: ac2466456eaa ("ktest: introduce grub2bls REBOOT_TYPE option")

I was just thinking a couple of hours ago if anyone uses ktest.pl, and
if so, how come I haven't received any patches for it ;-)

Anyway, I'll take a look at this next week, and it may be a while
before it gets into the kernel, as I like to run updates for a few
weeks on my systems (as I use it to build all my kernels), before I
push it upstream.

Thanks!

-- Steve


> 
> Signed-off-by: Libo Chen 
> ---
>  tools/testing/ktest/ktest.pl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/ktest/ktest.pl b/tools/testing/ktest/ktest.pl
> index cb16d2aac51c..54188ee16c48 100755
> --- a/tools/testing/ktest/ktest.pl
> +++ b/tools/testing/ktest/ktest.pl
> @@ -2040,7 +2040,7 @@ sub reboot_to {
>  
>  if ($reboot_type eq "grub") {
>   run_ssh "'(echo \"savedefault --default=$grub_number --once\" | grub --batch)'";
> -} elsif ($reboot_type eq "grub2") {
> +} elsif (($reboot_type eq "grub2") or ($reboot_type eq "grub2bls")) {
>   run_ssh "$grub_reboot $grub_number";
>  } elsif ($reboot_type eq "syslinux") {
>   run_ssh "$syslinux --once \\\"$syslinux_label\\\" $syslinux_path";

Re: [PATCH v3 4/4] Documentation/admin-guide: Change doc for split_lock_detect parameter

2020-11-20 Thread Randy Dunlap

On 11/20/20 8:09 PM, Yu, Fenghua wrote:
> Hi, Randy,
> 
>>> +   ratelimit:N -
>>> + Set rate limit to N bus locks per second
>>> + for bus lock detection. 0 < N <= HZ/2 and
>>> + N is approximate. Only applied to non-root
>>> + users.
>>
>> Sorry, but I don't know what this means. I think it's the "and N is
>> appropriate"
>> that is confusing me.
>>
>>  0 < N <= HZ/2 and N is appropriate.
> 
> You are right. I will remove "and N is appropriate" in the next version.
> 
> Could you please ack this patch? Can I add Acked-by from you in the updated 
> patch?
> 
> Thank you very much for your review!

Sure, no problem.

Acked-by: Randy Dunlap 

thanks.
-- 
~Randy

RE: [PATCH v3 4/4] Documentation/admin-guide: Change doc for split_lock_detect parameter

2020-11-20 Thread Yu, Fenghua

Hi, Randy,

> > +   ratelimit:N -
> > + Set rate limit to N bus locks per second
> > + for bus lock detection. 0 < N <= HZ/2 and
> > + N is approximate. Only applied to non-root
> > + users.
> 
> Sorry, but I don't know what this means. I think it's the "and N is
> appropriate"
> that is confusing me.
> 
>   0 < N <= HZ/2 and N is appropriate.

You are right. I will remove "and N is appropriate" in the next version.

Could you please ack this patch? Can I add Acked-by from you in the updated 
patch?

Thank you very much for your review!

-Fenghua
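
For reference, the documented range 0 < N <= HZ/2 can be written as a one-line check. This is only a sketch of the constraint as stated in the doc text above; the function name is hypothetical and hz is passed in as a parameter to stand in for the kernel's HZ.

```c
#include <stdbool.h>

/* True when n lies in the documented range 0 < N <= HZ/2. */
static bool ratelimit_in_range(unsigned int n, unsigned int hz)
{
	return n > 0 && n <= hz / 2;
}
```

With hz = 250, for example, values 1 through 125 pass and 0 or 126 do not.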

Re: [PATCH v8 4/4] remoteproc: qcom: Add minidump id for sm8150 modem

2020-11-20 Thread Bjorn Andersson

On Thu 19 Nov 15:05 CST 2020, Siddharth Gupta wrote:

> Add minidump id for modem in sm8150 chipset so that the regions to be
> included in the coredump generated upon a crash are based on the minidump
> tables in SMEM instead of those in the ELF header.
> 
> Signed-off-by: Siddharth Gupta 

When reposting patches without modifications, please add any
Acked-by, Reviewed-by or Tested-by that you received previously.


Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/remoteproc/qcom_q6v5_pas.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
> b/drivers/remoteproc/qcom_q6v5_pas.c
> index ca05c2ef..e61ef88 100644
> --- a/drivers/remoteproc/qcom_q6v5_pas.c
> +++ b/drivers/remoteproc/qcom_q6v5_pas.c
> @@ -630,6 +630,7 @@ static const struct adsp_data mpss_resource_init = {
>   .crash_reason_smem = 421,
>   .firmware_name = "modem.mdt",
>   .pas_id = 4,
> + .minidump_id = 3,
>   .has_aggre2_clk = false,
>   .auto_boot = false,
>   .active_pd_names = (char*[]){
> -- 
> Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

Re: [PATCH v8 3/4] remoteproc: qcom: Add capability to collect minidumps

2020-11-20 Thread Bjorn Andersson

On Thu 19 Nov 15:05 CST 2020, Siddharth Gupta wrote:

> This patch adds support for collecting minidump in the event of remoteproc
> crash. Parse the minidump table based on remoteproc's unique minidump-id,
> read all memory regions from the remoteproc's minidump table entry and
> expose the memory to userspace. The remoteproc platform driver can choose
> to collect a full/mini dump by specifying the coredump op.
> 
> Co-developed-by: Rishabh Bhatnagar 
> Signed-off-by: Rishabh Bhatnagar 
> Co-developed-by: Gurbir Arora 
> Signed-off-by: Gurbir Arora 
> Signed-off-by: Siddharth Gupta 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/remoteproc/qcom_common.c   | 147 
> +
>  drivers/remoteproc/qcom_common.h   |   2 +
>  drivers/remoteproc/qcom_q6v5_pas.c |  27 ++-
>  3 files changed, 174 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/remoteproc/qcom_common.c 
> b/drivers/remoteproc/qcom_common.c
> index 085fd73..c41c3a5 100644
> --- a/drivers/remoteproc/qcom_common.c
> +++ b/drivers/remoteproc/qcom_common.c
> @@ -17,6 +17,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "remoteproc_internal.h"
>  #include "qcom_common.h"
> @@ -25,6 +26,61 @@
>  #define to_smd_subdev(d) container_of(d, struct qcom_rproc_subdev, subdev)
>  #define to_ssr_subdev(d) container_of(d, struct qcom_rproc_ssr, subdev)
>  
> +#define MAX_NUM_OF_SS   10
> +#define MAX_REGION_NAME_LENGTH  16
> +#define SBL_MINIDUMP_SMEM_ID 602
> +#define MD_REGION_VALID  ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
> +#define MD_SS_ENCR_DONE  ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
> +#define MD_SS_ENABLED    ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
> +
> +/**
> + * struct minidump_region - Minidump region
> + * @name : Name of the region to be dumped
> + * @seq_num: : Use to differentiate regions with same name.
> + * @valid: This entry to be dumped (if set to 1)
> + * @address  : Physical address of region to be dumped
> + * @size : Size of the region
> + */
> +struct minidump_region {
> + charname[MAX_REGION_NAME_LENGTH];
> + __le32  seq_num;
> + __le32  valid;
> + __le64  address;
> + __le64  size;
> +};
> +
> +/**
> + * struct minidump_subsystem_toc: Subsystem's SMEM Table of content
> + * @status : Subsystem toc init status
> + * @enabled : if set to 1, this region would be copied during coredump
> + * @encryption_status: Encryption status for this subsystem
> + * @encryption_required : Decides to encrypt the subsystem regions or not
> + * @region_count : Number of regions added in this subsystem toc
> + * @regions_baseptr : regions base pointer of the subsystem
> + */
> +struct minidump_subsystem {
> + __le32  status;
> + __le32  enabled;
> + __le32  encryption_status;
> + __le32  encryption_required;
> + __le32  region_count;
> + __le64  regions_baseptr;
> +};
> +
> +/**
> + * struct minidump_global_toc: Global Table of Content
> + * @status : Global Minidump init status
> + * @md_revision : Minidump revision
> + * @enabled : Minidump enable status
> + * @subsystems : Array of subsystems toc
> + */
> +struct minidump_global_toc {
> + __le32  status;
> + __le32  md_revision;
> + __le32  enabled;
> + struct minidump_subsystem   subsystems[MAX_NUM_OF_SS];
> +};
> +
>  struct qcom_ssr_subsystem {
>   const char *name;
>   struct srcu_notifier_head notifier_list;
> @@ -34,6 +90,97 @@ struct qcom_ssr_subsystem {
>  static LIST_HEAD(qcom_ssr_subsystem_list);
>  static DEFINE_MUTEX(qcom_ssr_subsys_lock);
>  
> +
> +static void qcom_minidump_cleanup(struct rproc *rproc)
> +{
> + struct rproc_dump_segment *entry, *tmp;
> +
> + list_for_each_entry_safe(entry, tmp, &rproc->dump_segments, node) {
> + list_del(&entry->node);
> + kfree(entry->priv);
> + kfree(entry);
> + }
> +}
> +
> +static int qcom_add_minidump_segments(struct rproc *rproc, struct minidump_subsystem *subsystem)
> +{
> + struct minidump_region __iomem *ptr;
> + struct minidump_region region;
> + int seg_cnt, i;
> + dma_addr_t da;
> + size_t size;
> + char *name;
> +
> + if (WARN_ON(!list_empty(&rproc->dump_segments))) {
> + dev_err(&rproc->dev, "dump segment list already populated\n");
> + return -EUCLEAN;
> + }
> +
> + seg_cnt = le32_to_cpu(subsystem->region_count);
> + ptr = ioremap((unsigned long)le64_to_cpu(subsystem->regions_baseptr),
> +   seg_cnt * sizeof(struct minidump_region));
> + if (!ptr)
> + return -EFAULT;
> +
> + for (i = 0; i < seg_cnt; i++) {
> + memcpy_fromio(&region, ptr + i, sizeof(region));
> + if (region.valid == MD_REGION_VALID) {
> + name = kstrdup(region.name,
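
The magic values in the quoted patch are four ASCII characters packed into a 32-bit word, so MD_REGION_VALID works out to 0x56414C49 ("VALI"). The following user-space sketch shows that packing and the per-region validity test. It assumes a little-endian host so the __le32 fields can be read as plain uint32_t, and the struct and function names are hypothetical stand-ins for the ones in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same packing as the patch: 'V' 'A' 'L' 'I' from the high byte down. */
#define MD_REGION_VALID \
	((uint32_t)'V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)

/* Simplified copy of struct minidump_region, host-endian for this sketch. */
struct md_region_sketch {
	char     name[16];
	uint32_t seq_num;
	uint32_t valid;
	uint64_t address;
	uint64_t size;
};

/* Count the table entries that carry the "VALI" marker. */
static size_t md_count_valid(const struct md_region_sketch *tbl, size_t n)
{
	size_t valid = 0;

	for (size_t i = 0; i < n; i++)
		if (tbl[i].valid == MD_REGION_VALID)
			valid++;

	return valid;
}
```

Only entries whose valid field matches the marker would be added as dump segments; everything else in the table is skipped.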

Re: [PATCH v8 1/4] remoteproc: core: Add ops to enable custom coredump functionality

2020-11-20 Thread Bjorn Andersson

On Thu 19 Nov 15:05 CST 2020, Siddharth Gupta wrote:

> Each remoteproc might have different requirements for coredumps and might
> want to choose the type of dumps it wants to collect. This change allows
> remoteproc drivers to specify their own custom dump function to be executed
> in place of rproc_coredump. If the coredump op is not specified by the
> remoteproc driver it will be set to rproc_coredump by default.
> 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> Signed-off-by: Siddharth Gupta 
> ---
>  drivers/remoteproc/remoteproc_core.c | 6 +-
>  include/linux/remoteproc.h   | 2 ++
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c 
> b/drivers/remoteproc/remoteproc_core.c
> index dab2c0f..eba7543 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1704,7 +1704,7 @@ int rproc_trigger_recovery(struct rproc *rproc)
>   goto unlock_mutex;
>  
>   /* generate coredump */
> - rproc_coredump(rproc);
> + rproc->ops->coredump(rproc);
>  
>   /* load firmware */
>   ret = request_firmware(&firmware_p, rproc->firmware, dev);
> @@ -2126,6 +2126,10 @@ static int rproc_alloc_ops(struct rproc *rproc, const 
> struct rproc_ops *ops)
>   if (!rproc->ops)
>   return -ENOMEM;
>  
> + /* Default to rproc_coredump if no coredump function is specified */
> + if (!rproc->ops->coredump)
> + rproc->ops->coredump = rproc_coredump;
> +
>   if (rproc->ops->load)
>   return 0;
>  
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index 3fa3ba6..a419878 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -375,6 +375,7 @@ enum rsc_handling_status {
>   * @get_boot_addr:   get boot address to entry point specified in firmware
>   * @panic:   optional callback to react to system panic, core will delay
>   *   panic at least the returned number of milliseconds
> + * @coredump:  collect firmware dump after the subsystem is shutdown
>   */
>  struct rproc_ops {
>   int (*prepare)(struct rproc *rproc);
> @@ -393,6 +394,7 @@ struct rproc_ops {
>   int (*sanity_check)(struct rproc *rproc, const struct firmware *fw);
>   u64 (*get_boot_addr)(struct rproc *rproc, const struct firmware *fw);
>   unsigned long (*panic)(struct rproc *rproc);
> + void (*coredump)(struct rproc *rproc);
>  };
>  
>  /**
> -- 
> Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
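
The defaulting behaviour described in the commit message can be modelled outside the kernel as follows. This is a sketch under stated assumptions: every type and function name here is a hypothetical stand-in (the real ones are struct rproc, struct rproc_ops and rproc_coredump), and the counters exist only so the effect is observable.

```c
#include <stddef.h>

struct rproc_sketch;

/* Stand-in for struct rproc_ops, reduced to the one op of interest. */
struct rproc_ops_sketch {
	void (*coredump)(struct rproc_sketch *rproc);
};

struct rproc_sketch {
	struct rproc_ops_sketch ops;
	int default_dumps;  /* bumped by the default handler */
	int custom_dumps;   /* bumped by a driver-supplied handler */
};

/* Plays the role of rproc_coredump(). */
static void default_coredump(struct rproc_sketch *rproc)
{
	rproc->default_dumps++;
}

/* Plays the role of a driver-specific dump such as a minidump. */
static void custom_coredump(struct rproc_sketch *rproc)
{
	rproc->custom_dumps++;
}

/* Copy the driver's ops and fall back to the default coredump if unset. */
static void rproc_init_ops(struct rproc_sketch *rproc,
			   const struct rproc_ops_sketch *ops)
{
	rproc->ops = *ops;
	if (!rproc->ops.coredump)
		rproc->ops.coredump = default_coredump;
}

/* Recovery always goes through the op, never the default directly. */
static void trigger_recovery(struct rproc_sketch *rproc)
{
	rproc->ops.coredump(rproc);
}
```

Filling in the default at allocation time keeps the recovery path free of NULL checks, which is the same trade-off the patch makes.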

[PATCHv4 net-next 2/3] octeontx2-af: Add devlink health reporters for NPA

2020-11-20 Thread George Cherian

Add health reporters for RVU NPA block.
NPA health reporters handle the following HW event groups:
 - GENERAL events
 - ERROR events
 - RAS events
 - RVU event
An event counter per event is maintained in SW.

Output:
 # devlink health
 pci/0002:01:00.0:
   reporter npa
 state healthy error 0 recover 0
 # devlink  health dump show pci/0002:01:00.0 reporter npa
 NPA_AF_GENERAL:
Unmap PF Error: 0
Free Disabled for NIX0 RX: 0
Free Disabled for NIX0 TX: 0
Free Disabled for NIX1 RX: 0
Free Disabled for NIX1 TX: 0
Free Disabled for SSO: 0
Free Disabled for TIM: 0
Free Disabled for DPI: 0
Free Disabled for AURA: 0
Alloc Disabled for Resvd: 0
  NPA_AF_ERR:
Memory Fault on NPA_AQ_INST_S read: 0
Memory Fault on NPA_AQ_RES_S write: 0
AQ Doorbell Error: 0
Poisoned data on NPA_AQ_INST_S read: 0
Poisoned data on NPA_AQ_RES_S write: 0
Poisoned data on HW context read: 0
  NPA_AF_RVU:
Unmap Slot Error: 0

Signed-off-by: Sunil Kovvuri Goutham 
Signed-off-by: Jerin Jacob 
Signed-off-by: George Cherian 
---
 .../marvell/octeontx2/af/rvu_devlink.c| 492 +-
 .../marvell/octeontx2/af/rvu_devlink.h|  31 ++
 .../marvell/octeontx2/af/rvu_struct.h |  23 +
 3 files changed, 545 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
index 04ef945e7e75..b7f0691d86b0 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
@@ -5,10 +5,498 @@
  *
  */
 
+#include
+
 #include "rvu.h"
+#include "rvu_reg.h"
+#include "rvu_struct.h"
 
 #define DRV_NAME "octeontx2-af"
 
+static int rvu_report_pair_start(struct devlink_fmsg *fmsg, const char *name)
+{
+   int err;
+
+   err = devlink_fmsg_pair_nest_start(fmsg, name);
+   if (err)
+   return err;
+
+   return  devlink_fmsg_obj_nest_start(fmsg);
+}
+
+static int rvu_report_pair_end(struct devlink_fmsg *fmsg)
+{
+   int err;
+
+   err = devlink_fmsg_obj_nest_end(fmsg);
+   if (err)
+   return err;
+
+   return devlink_fmsg_pair_nest_end(fmsg);
+}
+
+static bool rvu_common_request_irq(struct rvu *rvu, int offset,
+  const char *name, irq_handler_t fn)
+{
+   struct rvu_devlink *rvu_dl = rvu->rvu_dl;
+   int rc;
+
+   sprintf(&rvu->irq_name[offset * NAME_SIZE], name);
+   rc = request_irq(pci_irq_vector(rvu->pdev, offset), fn, 0,
+&rvu->irq_name[offset * NAME_SIZE], rvu_dl);
+   if (rc)
+   dev_warn(rvu->dev, "Failed to register %s irq\n", name);
+   else
+   rvu->irq_allocated[offset] = true;
+
+   return rvu->irq_allocated[offset];
+}
+
+static irqreturn_t rvu_npa_af_rvu_intr_handler(int irq, void *rvu_irq)
+{
+   struct rvu_npa_event_ctx *npa_event_context;
+   struct rvu_npa_event_cnt *npa_event_count;
+   struct rvu_devlink *rvu_dl = rvu_irq;
+   struct rvu *rvu;
+   int blkaddr;
+   u64 intr;
+
+   rvu = rvu_dl->rvu;
+   blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPA, 0);
+   if (blkaddr < 0)
+   return IRQ_NONE;
+
+   npa_event_context = rvu_dl->npa_event_ctx;
+   npa_event_count = &npa_event_context->npa_event_cnt;
+   intr = rvu_read64(rvu, blkaddr, NPA_AF_RVU_INT);
+   npa_event_context->npa_af_rvu_int = intr;
+
+   if (intr & BIT_ULL(0))
+   npa_event_count->unmap_slot_count++;
+
+   /* Clear interrupts */
+   rvu_write64(rvu, blkaddr, NPA_AF_RVU_INT, intr);
+   rvu_write64(rvu, blkaddr, NPA_AF_RVU_INT_ENA_W1C, ~0ULL);
+   devlink_health_report(rvu_dl->rvu_npa_health_reporter, "NPA_AF_RVU Error",
+ npa_event_context);
+
+   return IRQ_HANDLED;
+}
+
+static int rvu_npa_inpq_to_cnt(u16 in,
+  struct rvu_npa_event_cnt *npa_event_count)
+{
+   switch (in) {
+   case 0:
+   return 0;
+   case BIT(NPA_INPQ_NIX0_RX):
+   return npa_event_count->free_dis_nix0_rx_count++;
+   case BIT(NPA_INPQ_NIX0_TX):
+   return npa_event_count->free_dis_nix0_tx_count++;
+   case BIT(NPA_INPQ_NIX1_RX):
+   return npa_event_count->free_dis_nix1_rx_count++;
+   case BIT(NPA_INPQ_NIX1_TX):
+   return npa_event_count->free_dis_nix1_tx_count++;
+   case BIT(NPA_INPQ_SSO):
+   return npa_event_count->free_dis_sso_count++;
+   case BIT(NPA_INPQ_TIM):
+   return npa_event_count->free_dis_tim_count++;
+   case BIT(NPA_INPQ_DPI):
+   return npa_event_count->free_dis_dpi_count++;
+   case BIT(NPA_INPQ_AURA_OP):
+   return npa_event_count->free_dis_aura_count++;
+   case BIT(NPA_INPQ_INTERNAL_RSV):
+   return

Re: [PATCH v8 2/4] remoteproc: coredump: Add minidump functionality

2020-11-20 Thread Bjorn Andersson

On Thu 19 Nov 15:05 CST 2020, Siddharth Gupta wrote:

> This change adds a new kind of core dump mechanism which instead of dumping
> entire program segments of the firmware, dumps sections of the remoteproc
> memory which are sufficient to allow debugging the firmware. This function
> thus uses section headers instead of program headers during creation of the
> core dump elf.
> 
> Co-developed-by: Rishabh Bhatnagar 
> Signed-off-by: Rishabh Bhatnagar 
> Signed-off-by: Siddharth Gupta 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/remoteproc/remoteproc_coredump.c| 140 
> 
>  drivers/remoteproc/remoteproc_elf_helpers.h |  26 ++
>  include/linux/remoteproc.h  |   1 +
>  3 files changed, 167 insertions(+)
> 
> diff --git a/drivers/remoteproc/remoteproc_coredump.c 
> b/drivers/remoteproc/remoteproc_coredump.c
> index 34530dc..81ec154 100644
> --- a/drivers/remoteproc/remoteproc_coredump.c
> +++ b/drivers/remoteproc/remoteproc_coredump.c
> @@ -323,3 +323,143 @@ void rproc_coredump(struct rproc *rproc)
>*/
>   wait_for_completion(&dump_state.dump_done);
>  }
> +
> +/**
> + * rproc_coredump_using_sections() - perform coredump using section headers
> + * @rproc:   rproc handle
> + *
> + * This function will generate an ELF header for the registered sections of
> + * segments and create a devcoredump device associated with rproc. Based on
> + * the coredump configuration this function will directly copy the segments
> + * from device memory to userspace or copy segments from device memory to
> + * a separate buffer, which can then be read by userspace.
> + * The first approach avoids using extra vmalloc memory. But it will stall
> + * recovery flow until dump is read by userspace.
> + */
> +void rproc_coredump_using_sections(struct rproc *rproc)
> +{
> + struct rproc_dump_segment *segment;
> + void *shdr;
> + void *ehdr;
> + size_t data_size;
> + size_t strtbl_size = 0;
> + size_t strtbl_index = 1;
> + size_t offset;
> + void *data;
> + u8 class = rproc->elf_class;
> + int shnum;
> + struct rproc_coredump_state dump_state;
> + unsigned int dump_conf = rproc->dump_conf;
> + char *str_tbl = "STR_TBL";
> +
> + if (list_empty(&rproc->dump_segments) ||
> + dump_conf == RPROC_COREDUMP_DISABLED)
> + return;
> +
> + if (class == ELFCLASSNONE) {
> + dev_err(&rproc->dev, "Elf class is not set\n");
> + return;
> + }
> +
> + /*
> +  * We allocate two extra section headers. The first one is null.
> +  * Second section header is for the string table. Also space is
> +  * allocated for string table.
> +  */
> + data_size = elf_size_of_hdr(class) + 2 * elf_size_of_shdr(class);
> + shnum = 2;
> +
> + /* the extra byte is for the null character at index 0 */
> + strtbl_size += strlen(str_tbl) + 2;
> +
> + list_for_each_entry(segment, &rproc->dump_segments, node) {
> + data_size += elf_size_of_shdr(class);
> + strtbl_size += strlen(segment->priv) + 1;
> + if (dump_conf == RPROC_COREDUMP_ENABLED)
> + data_size += segment->size;
> + shnum++;
> + }
> +
> + data_size += strtbl_size;
> +
> + data = vmalloc(data_size);
> + if (!data)
> + return;
> +
> + ehdr = data;
> + memset(ehdr, 0, elf_size_of_hdr(class));
> + /* e_ident field is common for both elf32 and elf64 */
> + elf_hdr_init_ident(ehdr, class);
> +
> + elf_hdr_set_e_type(class, ehdr, ET_CORE);
> + elf_hdr_set_e_machine(class, ehdr, rproc->elf_machine);
> + elf_hdr_set_e_version(class, ehdr, EV_CURRENT);
> + elf_hdr_set_e_entry(class, ehdr, rproc->bootaddr);
> + elf_hdr_set_e_shoff(class, ehdr, elf_size_of_hdr(class));
> + elf_hdr_set_e_ehsize(class, ehdr, elf_size_of_hdr(class));
> + elf_hdr_set_e_shentsize(class, ehdr, elf_size_of_shdr(class));
> + elf_hdr_set_e_shnum(class, ehdr, shnum);
> + elf_hdr_set_e_shstrndx(class, ehdr, 1);
> +
> + /*
> +  * The zeroth index of the section header is reserved and is rarely used.
> +  * Set the section header as null (SHN_UNDEF) and move to the next one.
> +  */
> + shdr = data + elf_hdr_get_e_shoff(class, ehdr);
> + memset(shdr, 0, elf_size_of_shdr(class));
> + shdr += elf_size_of_shdr(class);
> +
> + /* Initialize the string table. */
> + offset = elf_hdr_get_e_shoff(class, ehdr) +
> +  elf_size_of_shdr(class) * elf_hdr_get_e_shnum(class, ehdr);
> + memset(data + offset, 0, strtbl_size);
> +
> + /* Fill in the string table section header. */
> + memset(shdr, 0, elf_size_of_shdr(class));
> + elf_shdr_set_sh_type(class, shdr, SHT_STRTAB);
> + elf_shdr_set_sh_offset(class, shdr, offset);
> + elf_shdr_set_sh_size(class, shdr, strtbl_size);
> + elf_shdr_set_sh_entsize(class, shdr, 0);
> +
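
The buffer sizing in the quoted function can be checked with a small stand-alone calculation. The sketch below assumes the ELF64 class, where both the ELF header and each section header are 64 bytes; the struct, the function name and the segment names in the example are hypothetical. copy_data corresponds to dump_conf == RPROC_COREDUMP_ENABLED in the patch.

```c
#include <stddef.h>
#include <string.h>

#define EHDR64_SIZE 64u  /* sizeof(Elf64_Ehdr) */
#define SHDR64_SIZE 64u  /* sizeof(Elf64_Shdr) */

/* Hypothetical stand-in for struct rproc_dump_segment. */
struct seg_sketch {
	const char *name;  /* section name, kept in segment->priv in the patch */
	size_t size;       /* bytes of device memory in the segment */
};

/*
 * Total allocation: ELF header, a null section header, a string table
 * section header, one section header per segment, optionally the
 * segment data, and finally the string table itself.
 */
static size_t section_dump_size(const struct seg_sketch *segs, size_t n,
				int copy_data)
{
	size_t data_size = EHDR64_SIZE + 2 * SHDR64_SIZE;
	/* leading null byte at index 0 plus "STR_TBL" and its terminator */
	size_t strtbl_size = strlen("STR_TBL") + 2;

	for (size_t i = 0; i < n; i++) {
		data_size += SHDR64_SIZE;
		strtbl_size += strlen(segs[i].name) + 1;
		if (copy_data)
			data_size += segs[i].size;
	}

	return data_size + strtbl_size;
}
```

For two segments named "ipa" (4096 bytes) and "smem" (8192 bytes), the headers take 320 bytes and the string table 18, so the total is 12626 bytes with the data copied and 338 bytes without.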

[PATCHv4 net-next 3/3] octeontx2-af: Add devlink health reporters for NIX

2020-11-20 Thread George Cherian

Add health reporters for RVU NIX block.
NIX health reporters handle the following HW event groups:
 - GENERAL events
 - RAS events
 - RVU event
An event counter per event is maintained in SW.

Output:
 # ./devlink health
 pci/0002:01:00.0:
   reporter npa
 state healthy error 0 recover 0
   reporter nix
 state healthy error 0 recover 0
 # ./devlink  health dump show pci/0002:01:00.0 reporter nix
  NIX_AF_GENERAL:
 Memory Fault on NIX_AQ_INST_S read: 0
 Memory Fault on NIX_AQ_RES_S write: 0
 AQ Doorbell error: 0
 Rx on unmapped PF_FUNC: 0
 Rx multicast replication error: 0
 Memory fault on NIX_RX_MCE_S read: 0
 Memory fault on multicast WQE read: 0
 Memory fault on mirror WQE read: 0
 Memory fault on mirror pkt write: 0
 Memory fault on multicast pkt write: 0
   NIX_AF_RAS:
 Poisoned data on NIX_AQ_INST_S read: 0
 Poisoned data on NIX_AQ_RES_S write: 0
 Poisoned data on HW context read: 0
 Poisoned data on packet read from mirror buffer: 0
 Poisoned data on packet read from mcast buffer: 0
 Poisoned data on WQE read from mirror buffer: 0
 Poisoned data on WQE read from multicast buffer: 0
 Poisoned data on NIX_RX_MCE_S read: 0
   NIX_AF_RVU:
 Unmap Slot Error: 0

Signed-off-by: Sunil Kovvuri Goutham 
Signed-off-by: Jerin Jacob 
Signed-off-by: George Cherian 
---
 .../marvell/octeontx2/af/rvu_devlink.c| 414 +-
 .../marvell/octeontx2/af/rvu_devlink.h|  31 ++
 .../marvell/octeontx2/af/rvu_struct.h |  10 +
 3 files changed, 453 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
index b7f0691d86b0..c02d0f56ae7a 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
@@ -35,6 +35,131 @@ static int rvu_report_pair_end(struct devlink_fmsg *fmsg)
return devlink_fmsg_pair_nest_end(fmsg);
 }
 
+static irqreturn_t rvu_nix_af_rvu_intr_handler(int irq, void *rvu_irq)
+{
+   struct rvu_nix_event_ctx *nix_event_context;
+   struct rvu_nix_event_cnt *nix_event_count;
+   struct rvu_devlink *rvu_dl = rvu_irq;
+   struct rvu *rvu;
+   int blkaddr;
+   u64 intr;
+
+   rvu = rvu_dl->rvu;
+   blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NIX, 0);
+   if (blkaddr < 0)
+   return IRQ_NONE;
+
+   nix_event_context = rvu_dl->nix_event_ctx;
+   nix_event_count = &nix_event_context->nix_event_cnt;
+   intr = rvu_read64(rvu, blkaddr, NIX_AF_RVU_INT);
+   nix_event_context->nix_af_rvu_int = intr;
+
+   if (intr & BIT_ULL(0))
+   nix_event_count->unmap_slot_count++;
+
+   /* Clear interrupts */
+   rvu_write64(rvu, blkaddr, NIX_AF_RVU_INT, intr);
+   rvu_write64(rvu, blkaddr, NIX_AF_RVU_INT_ENA_W1C, ~0ULL);
+   devlink_health_report(rvu_dl->rvu_nix_health_reporter, "NIX_AF_RVU Error",
+ nix_event_context);
+
+   return IRQ_HANDLED;
+}
+
+static irqreturn_t rvu_nix_af_err_intr_handler(int irq, void *rvu_irq)
+{
+   struct rvu_nix_event_ctx *nix_event_context;
+   struct rvu_nix_event_cnt *nix_event_count;
+   struct rvu_devlink *rvu_dl = rvu_irq;
+   struct rvu *rvu;
+   int blkaddr;
+   u64 intr;
+
+   rvu = rvu_dl->rvu;
+   blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NIX, 0);
+   if (blkaddr < 0)
+   return IRQ_NONE;
+
+   nix_event_context = rvu_dl->nix_event_ctx;
+   nix_event_count = &nix_event_context->nix_event_cnt;
+   intr = rvu_read64(rvu, blkaddr, NIX_AF_ERR_INT);
+   nix_event_context->nix_af_rvu_err = intr;
+
+   if (intr & BIT_ULL(14))
+   nix_event_count->aq_inst_count++;
+   if (intr & BIT_ULL(13))
+   nix_event_count->aq_res_count++;
+   if (intr & BIT_ULL(12))
+   nix_event_count->aq_db_count++;
+   if (intr & BIT_ULL(6))
+   nix_event_count->rx_on_unmap_pf_count++;
+   if (intr & BIT_ULL(5))
+   nix_event_count->rx_mcast_repl_count++;
+   if (intr & BIT_ULL(4))
+   nix_event_count->rx_mcast_memfault_count++;
+   if (intr & BIT_ULL(3))
+   nix_event_count->rx_mcast_wqe_memfault_count++;
+   if (intr & BIT_ULL(2))
+   nix_event_count->rx_mirror_wqe_memfault_count++;
+   if (intr & BIT_ULL(1))
+   nix_event_count->rx_mirror_pktw_memfault_count++;
+   if (intr & BIT_ULL(0))
+   nix_event_count->rx_mcast_pktw_memfault_count++;
+
+   /* Clear interrupts */
+   rvu_write64(rvu, blkaddr, NIX_AF_ERR_INT, intr);
+   rvu_write64(rvu, blkaddr, NIX_AF_ERR_INT_ENA_W1C, ~0ULL);
+   devlink_health_report(rvu_dl->rvu_nix_health_reporter, "NIX_AF_ERR Error",
+ nix_event_context);
+
+
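
The per-event counting done by rvu_nix_af_err_intr_handler() above amounts to mapping interrupt status bits to software counters. The sketch below expresses the same mapping as a lookup table rather than the chain of if statements in the patch. The bit numbers are taken from the handler; the enum names are shortened, hypothetical versions of the patch's counter fields, and BIT_ULL is redefined locally so the example builds in user space.

```c
#include <stddef.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

/* One index per counter, in the order the handler tests the bits. */
enum nix_err_event {
	NIX_AQ_INST,
	NIX_AQ_RES,
	NIX_AQ_DB,
	NIX_RX_ON_UNMAP_PF,
	NIX_RX_MCAST_REPL,
	NIX_RX_MCAST_MEMFAULT,
	NIX_RX_MCAST_WQE_MEMFAULT,
	NIX_RX_MIRROR_WQE_MEMFAULT,
	NIX_RX_MIRROR_PKTW_MEMFAULT,
	NIX_RX_MCAST_PKTW_MEMFAULT,
	NIX_ERR_NUM
};

/* NIX_AF_ERR_INT bit position for each counter above. */
static const unsigned int nix_err_bit[NIX_ERR_NUM] = {
	14, 13, 12, 6, 5, 4, 3, 2, 1, 0
};

/* Bump the counter of every event whose bit is set in intr. */
static void nix_err_count(uint64_t intr, unsigned long cnt[NIX_ERR_NUM])
{
	for (size_t i = 0; i < NIX_ERR_NUM; i++)
		if (intr & BIT_ULL(nix_err_bit[i]))
			cnt[i]++;
}
```

Bits that have no entry in the table (7 through 11, for instance) are ignored, just as they are by the if chain in the handler.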

[PATCHv4 net-next 1/3] octeontx2-af: Add devlink suppoort to af driver

2020-11-20 Thread George Cherian

Add devlink support to AF driver. Basic devlink support is added.
Currently info_get is the only supported devlink ops.

devlink output looks like this:
 # devlink dev
 pci/0002:01:00.0
 # devlink dev info
 pci/0002:01:00.0:
  driver octeontx2-af
  versions:
  fixed:
mbox version: 9

Signed-off-by: Sunil Kovvuri Goutham 
Signed-off-by: Jerin Jacob 
Signed-off-by: George Cherian 
---
 .../net/ethernet/marvell/octeontx2/Kconfig|  1 +
 .../ethernet/marvell/octeontx2/af/Makefile|  2 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.c   |  9 ++-
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |  4 ++
 .../marvell/octeontx2/af/rvu_devlink.c| 72 +++
 .../marvell/octeontx2/af/rvu_devlink.h| 20 ++
 6 files changed, 106 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h

diff --git a/drivers/net/ethernet/marvell/octeontx2/Kconfig 
b/drivers/net/ethernet/marvell/octeontx2/Kconfig
index 543a1d047567..16caa02095fe 100644
--- a/drivers/net/ethernet/marvell/octeontx2/Kconfig
+++ b/drivers/net/ethernet/marvell/octeontx2/Kconfig
@@ -9,6 +9,7 @@ config OCTEONTX2_MBOX
 config OCTEONTX2_AF
tristate "Marvell OcteonTX2 RVU Admin Function driver"
select OCTEONTX2_MBOX
+   select NET_DEVLINK
depends on (64BIT && COMPILE_TEST) || ARM64
depends on PCI
help
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/Makefile 
b/drivers/net/ethernet/marvell/octeontx2/af/Makefile
index 7100d1dd856e..eb535c98ca38 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/Makefile
+++ b/drivers/net/ethernet/marvell/octeontx2/af/Makefile
@@ -10,4 +10,4 @@ obj-$(CONFIG_OCTEONTX2_AF) += octeontx2_af.o
 octeontx2_mbox-y := mbox.o rvu_trace.o
 octeontx2_af-y := cgx.o rvu.o rvu_cgx.o rvu_npa.o rvu_nix.o \
  rvu_reg.o rvu_npc.o rvu_debugfs.o ptp.o rvu_npc_fs.o \
- rvu_cpt.o
+ rvu_cpt.o rvu_devlink.o
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
index 9f901c0edcbb..e8fd712860a1 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
@@ -2826,17 +2826,23 @@ static int rvu_probe(struct pci_dev *pdev, const struct 
pci_device_id *id)
if (err)
goto err_flr;
 
+   err = rvu_register_dl(rvu);
+   if (err)
+   goto err_irq;
+
rvu_setup_rvum_blk_revid(rvu);
 
/* Enable AF's VFs (if any) */
err = rvu_enable_sriov(rvu);
if (err)
-   goto err_irq;
+   goto err_dl;
 
/* Initialize debugfs */
rvu_dbg_init(rvu);
 
return 0;
+err_dl:
+   rvu_unregister_dl(rvu);
 err_irq:
rvu_unregister_interrupts(rvu);
 err_flr:
@@ -2868,6 +2874,7 @@ static void rvu_remove(struct pci_dev *pdev)
 
rvu_dbg_exit(rvu);
rvu_unregister_interrupts(rvu);
+   rvu_unregister_dl(rvu);
rvu_flr_wq_destroy(rvu);
rvu_cgx_exit(rvu);
rvu_fwdata_exit(rvu);
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
index b6c0977499ab..b1a6ecfd563e 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
@@ -12,7 +12,10 @@
 #define RVU_H
 
 #include 
+#include 
+
 #include "rvu_struct.h"
+#include "rvu_devlink.h"
 #include "common.h"
 #include "mbox.h"
 #include "npc.h"
@@ -422,6 +425,7 @@ struct rvu {
 #ifdef CONFIG_DEBUG_FS
struct rvu_debugfs  rvu_dbg;
 #endif
+   struct rvu_devlink  *rvu_dl;
 };
 
 static inline void rvu_write64(struct rvu *rvu, u64 block, u64 offset, u64 val)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
new file mode 100644
index ..04ef945e7e75
--- /dev/null
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Marvell OcteonTx2 RVU Devlink
+ *
+ * Copyright (C) 2020 Marvell.
+ *
+ */
+
+#include "rvu.h"
+
+#define DRV_NAME "octeontx2-af"
+
+static int rvu_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
+   struct netlink_ext_ack *extack)
+{
+   char buf[10];
+   int err;
+
+   err = devlink_info_driver_name_put(req, DRV_NAME);
+   if (err)
+   return err;
+
+   sprintf(buf, "%X", OTX2_MBOX_VERSION);
+   return devlink_info_version_fixed_put(req, "mbox version:", buf);
+}
+
+static const struct devlink_ops rvu_devlink_ops = {
+   .info_get = rvu_devlink_info_get,
+};
+
+int rvu_register_dl(struct rvu *rvu)
+{
+   struct rvu_devlink *rvu_dl;
+   struct devlink *dl;
+   int err;
+
+   rvu_dl = kzalloc(sizeof(*rvu_dl),

[PATCHv3 net-next 0/3] Add devlink and devlink health reporters to

2020-11-20 Thread George Cherian

Add basic devlink and devlink health reporters.
Devlink health reporters are added for NPA and NIX blocks.
These reporters report the error count in respective blocks.

Address Jakub's comment to add devlink support for error reporting.
https://www.spinics.net/lists/netdev/msg670712.html

Change-log:
v4 
 - Rebase to net-next (no logic changes).
 
v3
 - Address Saeed's comments on v2.
 - Renamed the reporter name as hw_*.
 - Call devlink_health_report() when an event is raised.
 - Added recover op too.

v2
 - Address Willem's comments on v1.
 - Fixed the sparse issues, reported by Jakub.


George Cherian (3):
  octeontx2-af: Add devlink suppoort to af driver
  octeontx2-af: Add devlink health reporters for NPA
  octeontx2-af: Add devlink health reporters for NIX

 .../net/ethernet/marvell/octeontx2/Kconfig|   1 +
 .../ethernet/marvell/octeontx2/af/Makefile|   2 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.c   |   9 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |   4 +
 .../marvell/octeontx2/af/rvu_devlink.c| 972 ++
 .../marvell/octeontx2/af/rvu_devlink.h|  82 ++
 .../marvell/octeontx2/af/rvu_struct.h |  33 +
 7 files changed, 1101 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.h

-- 
2.25.1

Re: [PATCH 2/2] hwspinlock: add SUNXI implementation

2020-11-20 Thread Bjorn Andersson

On Thu 19 Nov 00:44 CST 2020, fu...@allwinnertech.com wrote:

> From: fuyao 
> 
> Add hwspinlock support for the SUNXI Hardware Spinlock device.
> 
> The Hardware Spinlock device on SUNXI provides hardware assistance
> for synchronization between the multiple processors in the system
> (Cortex-A7, or1k, Xtensa DSP, Cortex-A53)
> 
> Signed-off-by: fuyao 
> ---
>  MAINTAINERS   |   6 +
>  drivers/hwspinlock/Kconfig|  10 ++
>  drivers/hwspinlock/Makefile   |   1 +
>  drivers/hwspinlock/sunxi_hwspinlock.c | 205 ++
>  4 files changed, 222 insertions(+)
>  create mode 100644 drivers/hwspinlock/sunxi_hwspinlock.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e451dcce054f0..68d25574432d0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -737,6 +737,12 @@ L:   linux-me...@vger.kernel.org
>  S:   Maintained
>  F:   drivers/staging/media/sunxi/cedrus/
>  
> +ALLWINNER HWSPINLOCK DRIVER
> +M:   fuyao 
> +S:   Maintained
> +F:   drivers/hwspinlock/sunxi_hwspinlock.c
> +F:  Documentation/devicetree/bindings/hwlock/sunxi,hwspinlock.yaml
> +
>  ALPHA PORT
>  M:   Richard Henderson 
>  M:   Ivan Kokshaysky 
> diff --git a/drivers/hwspinlock/Kconfig b/drivers/hwspinlock/Kconfig
> index 32cd26352f381..4d0d516dcb544 100644
> --- a/drivers/hwspinlock/Kconfig
> +++ b/drivers/hwspinlock/Kconfig
> @@ -55,6 +55,16 @@ config HWSPINLOCK_STM32
>  
> If unsure, say N.
>  
> +config HWSPINLOCK_SUNXI
> + tristate "SUNXI Hardware Spinlock device"
> + depends on ARCH_SUNXI || COMPILE_TEST
> + help
> +   Say y here to support the SUNXI Hardware Semaphore functionality, which
> +   provides a synchronisation mechanism for the various processor on the
> +   SoC.
> +
> +   If unsure, say N.
> +
>  config HSEM_U8500
>   tristate "STE Hardware Semaphore functionality"
>   depends on ARCH_U8500 || COMPILE_TEST
> diff --git a/drivers/hwspinlock/Makefile b/drivers/hwspinlock/Makefile
> index ed053e3f02be4..839a053205f73 100644
> --- a/drivers/hwspinlock/Makefile
> +++ b/drivers/hwspinlock/Makefile
> @@ -10,3 +10,4 @@ obj-$(CONFIG_HWSPINLOCK_SIRF)   += sirf_hwspinlock.o
>  obj-$(CONFIG_HWSPINLOCK_SPRD)+= sprd_hwspinlock.o
>  obj-$(CONFIG_HWSPINLOCK_STM32)   += stm32_hwspinlock.o
>  obj-$(CONFIG_HSEM_U8500) += u8500_hsem.o
> +obj-$(CONFIG_HWSPINLOCK_SUNXI)   += sunxi_hwspinlock.o
> diff --git a/drivers/hwspinlock/sunxi_hwspinlock.c b/drivers/hwspinlock/sunxi_hwspinlock.c
> new file mode 100644
> index 0..2c3dc148c9b72
> --- /dev/null
> +++ b/drivers/hwspinlock/sunxi_hwspinlock.c
> @@ -0,0 +1,205 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SUNXI hardware spinlock driver
> + *
> + * Copyright (C) 2020 Allwinnertech - http://www.allwinnertech.com
> + *

Please remove the remainder of this comment, it's already covered by the
SPDX header above.

> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

You don't need all of these.

> +
> +#include "hwspinlock_internal.h"
> +
> +/* hardware spinlock register list */
> +#define  LOCK_SYS_STATUS_REG (0x)
> +#define  LOCK_STATUS_REG (0x0010)
> +#define LOCK_BASE_OFFSET(0x0100)
> +#define LOCK_BASE_ID(0)

No need for the parenthesis on these, please drop them.

> +
> +/* Possible values of SPINLOCK_LOCK_REG */
> +#define SPINLOCK_NOTTAKEN   (0) /* free */
> +#define SPINLOCK_TAKEN  (1) /* locked */
> +
> +struct sunxi_spinlock_config {
> + int nspin;
> +};
> +
> +static int sunxi_hwspinlock_trylock(struct hwspinlock *lock)
> +{
> + void __iomem *lock_addr = lock->priv;
> +
> + /* attempt to acquire the lock by reading its value */
> + return (readl(lock_addr) == SPINLOCK_NOTTAKEN);

Please drop the outer ().

> +}
> +
> +static void sunxi_hwspinlock_unlock(struct hwspinlock *lock)
> +{
> + void __iomem *lock_addr = lock->priv;
> +
> + /* release the lock by writing 0 to it */
> + writel(SPINLOCK_NOTTAKEN, lock_addr);
> +}
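
For readers following the trylock/unlock pair quoted above, here is a minimal userspace sketch of the read-to-acquire idea. It is not the driver's code: the struct and all model_* names are hypothetical, and the side effect of a read (returning the old state and leaving the lock taken) is an assumption about the hardware, inferred from the "acquire the lock by reading its value" comment.

```c
#include <stdint.h>

#define MODEL_SPINLOCK_NOTTAKEN 0u	/* free */
#define MODEL_SPINLOCK_TAKEN    1u	/* locked */

/* Hypothetical stand-in for one memory-mapped lock register. */
struct model_lock_reg {
	uint32_t value;
};

/*
 * Assumed hardware behaviour: a read returns the previous state and
 * leaves the lock taken, so reading 0 means the reader now owns it.
 */
static uint32_t model_readl(struct model_lock_reg *reg)
{
	uint32_t old = reg->value;

	reg->value = MODEL_SPINLOCK_TAKEN;
	return old;
}

/* Plain register write, used to release the lock. */
static void model_writel(uint32_t val, struct model_lock_reg *reg)
{
	reg->value = val;
}

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static int model_trylock(struct model_lock_reg *reg)
{
	return model_readl(reg) == MODEL_SPINLOCK_NOTTAKEN;
}

/* Release the lock by writing the "not taken" value back. */
static void model_unlock(struct model_lock_reg *reg)
{
	model_writel(MODEL_SPINLOCK_NOTTAKEN, reg);
}
```

In this model a second model_trylock() on a held lock fails until model_unlock() is called, which is the behaviour the hwspinlock core expects from the trylock/unlock ops.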
> +
> +/*
> + * relax the SUNXI interconnect while spinning on it.
> + *
> + * The specs recommended that the retry delay time will be
> + * just over half of the time that a requester would be
> + * expected

Re: [PATCH net-next 4/6] net: ipa: support retries on generic GSI commands

2020-11-20 Thread Jakub Kicinski

On Fri, 20 Nov 2020 21:31:09 -0600 Alex Elder wrote:
> On 11/20/20 8:49 PM, Jakub Kicinski wrote:
> > On Thu, 19 Nov 2020 16:49:27 -0600 Alex Elder wrote:  
> >> +  do
> >> +  ret = gsi_generic_command(gsi, channel_id,
> >> +GSI_GENERIC_HALT_CHANNEL);
> >> +  while (ret == -EAGAIN && retries--);  
> > 
> > This may well be the first time I've seen someone write a do while loop
> > without the curly brackets!  
> 
> I had them at one time, then saw I could get away
> without them.  I don't have a preference but I see
> you accepted it as-is.

It was just an offhand comment, I don't have anything against it :)
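
For reference, a minimal userspace sketch of the retry idiom discussed in this thread, written with braces. fake_command() is a hypothetical stand-in for gsi_generic_command(), and the call counter exists only to make the behaviour observable; neither is part of the patch.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for gsi_generic_command(): fails with -EAGAIN
 * until *failures_left reaches zero, then succeeds.
 */
static int fake_command(int *failures_left)
{
	if (*failures_left > 0) {
		(*failures_left)--;
		return -EAGAIN;
	}
	return 0;
}

/*
 * Issue the command once, then retry up to 'retries' more times while
 * it keeps returning -EAGAIN. 'calls' reports how many attempts ran.
 */
static int command_with_retries(int *failures_left, unsigned int retries,
				unsigned int *calls)
{
	int ret;

	*calls = 0;
	do {
		ret = fake_command(failures_left);
		(*calls)++;
	} while (ret == -EAGAIN && retries--);

	return ret;
}
```

Because the post-decrement is only evaluated after a failed attempt, a budget of N retries allows at most N + 1 calls in total.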

Re: [PATCH v2 2/3] remoteproc: Introduce deny_sysfs_ops flag

2020-11-20 Thread Suman Anna

On 11/20/20 9:38 PM, Bjorn Andersson wrote:
> On Fri 20 Nov 21:01 CST 2020, Suman Anna wrote:
> 
>> The remoteproc framework provides sysfs interfaces for changing
>> the firmware name and for starting/stopping a remote processor
>> through the sysfs files 'state' and 'firmware'. The 'recovery'
>> sysfs file can also be used similarly to control the error recovery
>> state machine of a remoteproc. These interfaces are currently
>> allowed irrespective of how the remoteprocs were booted (like
>> remoteproc self auto-boot, remoteproc client-driven boot etc).
>> These interfaces can adversely affect a remoteproc and its clients
>> especially when a remoteproc is being controlled by a remoteproc
>> client driver(s). Also, not all remoteproc drivers may want to
>> support the sysfs interfaces by default.
>>
>> Add support to deny the sysfs state/firmware/recovery change by
>> introducing a state flag 'deny_sysfs_ops' that the individual
>> remoteproc drivers can set based on their usage needs. The default
>> behavior is to allow the sysfs operations as before.
>>
> 
> This makes sense, but can't we implement attribute_group->is_visible to
> simply hide these entries from userspace instead of leaving them
> "broken"?

I would have to look into that, but can that be changed dynamically?
Also, note that the enforcement is only on the writes/stores which impact
the state-machine, but not the reads/shows.

For PRU usecases, we will be setting this dynamically.

regards
Suman

> 
> Regards,
> Bjorn
> 
>> Signed-off-by: Suman Anna 
>> ---
>> v2: revised to account for the 'recovery' sysfs file as well, patch
>> description updated accordingly
>> v1: https://patchwork.kernel.org/project/linux-remoteproc/patch/20180915003725.17549-5-s-a...@ti.com/
>>
>>  drivers/remoteproc/remoteproc_sysfs.c | 12 
>>  include/linux/remoteproc.h|  2 ++
>>  2 files changed, 14 insertions(+)
>>
>> diff --git a/drivers/remoteproc/remoteproc_sysfs.c b/drivers/remoteproc/remoteproc_sysfs.c
>> index bd2950a246c9..3fd18a71c188 100644
>> --- a/drivers/remoteproc/remoteproc_sysfs.c
>> +++ b/drivers/remoteproc/remoteproc_sysfs.c
>> @@ -49,6 +49,10 @@ static ssize_t recovery_store(struct device *dev,
>>  {
>>  struct rproc *rproc = to_rproc(dev);
>>  
>> +/* restrict sysfs operations if not allowed by remoteproc drivers */
>> +if (rproc->deny_sysfs_ops)
>> +return -EPERM;
>> +
>>  if (sysfs_streq(buf, "enabled")) {
>>  /* change the flag and begin the recovery process if needed */
>>  rproc->recovery_disabled = false;
>> @@ -158,6 +162,10 @@ static ssize_t firmware_store(struct device *dev,
>>  char *p;
>>  int err, len = count;
>>  
>> +/* restrict sysfs operations if not allowed by remoteproc drivers */
>> +if (rproc->deny_sysfs_ops)
>> +return -EPERM;
>> +
>>  err = mutex_lock_interruptible(&rproc->lock);
>>  if (err) {
>>  dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, err);
>> @@ -225,6 +233,10 @@ static ssize_t state_store(struct device *dev,
>>  struct rproc *rproc = to_rproc(dev);
>>  int ret = 0;
>>  
>> +/* restrict sysfs operations if not allowed by remoteproc drivers */
>> +if (rproc->deny_sysfs_ops)
>> +return -EPERM;
>> +
>>  if (sysfs_streq(buf, "start")) {
>>  if (rproc->state == RPROC_RUNNING)
>>  return -EBUSY;
>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
>> index 3fa3ba6498e8..dbc3767f7d0e 100644
>> --- a/include/linux/remoteproc.h
>> +++ b/include/linux/remoteproc.h
>> @@ -508,6 +508,7 @@ struct rproc_dump_segment {
>>   * @has_iommu: flag to indicate if remote processor is behind an MMU
>>   * @auto_boot: flag to indicate if remote processor should be auto-started
>>   * @autonomous: true if an external entity has booted the remote processor
>> + * @deny_sysfs_ops: flag to not permit sysfs operations on state, firmware and recovery
>>   * @dump_segments: list of segments in the firmware
>>   * @nb_vdev: number of vdev currently handled by rproc
>>   * @char_dev: character device of the rproc
>> @@ -545,6 +546,7 @@ struct rproc {
>>  bool has_iommu;
>>  bool auto_boot;
>>  bool autonomous;
>> +bool deny_sysfs_ops;
>>  struct list_head dump_segments;
>>  int nb_vdev;
>>  u8 elf_class;
>> -- 
>> 2.28.0
>>
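
To illustrate the point above that only the stores are gated and not the shows, here is a minimal userspace model. It is a sketch under stated assumptions: struct gate_rproc and the gate_* helpers are hypothetical, and the store handler is reduced to the "enabled"/"disabled" cases (the real recovery_store() does more).

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct rproc. */
struct gate_rproc {
	bool deny_sysfs_ops;
	bool recovery_disabled;
};

/* Write path: refused outright when the driver has set deny_sysfs_ops. */
static long gate_recovery_store(struct gate_rproc *rproc, const char *buf,
				size_t count)
{
	if (rproc->deny_sysfs_ops)
		return -EPERM;

	if (strcmp(buf, "enabled") == 0)
		rproc->recovery_disabled = false;
	else if (strcmp(buf, "disabled") == 0)
		rproc->recovery_disabled = true;
	else
		return -EINVAL;

	return (long)count;
}

/* Read path: never gated, so userspace can still observe the state. */
static const char *gate_recovery_show(const struct gate_rproc *rproc)
{
	return rproc->recovery_disabled ? "disabled" : "enabled";
}
```

With the flag set, a write fails with -EPERM and leaves the state untouched, while a read keeps returning the current value.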

Re: [PATCH v12 5/5] selftest: mhi: Add support to test MHI LOOPBACK channel

2020-11-20 Thread Hemant Kumar


Hi Shuah,

On 11/20/20 7:26 AM, Shuah Khan wrote:

On 11/16/20 3:46 PM, Hemant Kumar wrote:

Loopback test opens the MHI device file node and writes
a data buffer to it. MHI UCI kernel space driver copies
the data and sends it to MHI uplink (Tx) LOOPBACK channel.
MHI device loops back the same data to MHI downlink (Rx)
LOOPBACK channel. This data is read by test application
and compared against the data sent. Test passes if data
buffer matches between Tx and Rx. Test application performs
open(), poll(), write(), read() and close() file operations.

Signed-off-by: Hemant Kumar 
---
  Documentation/mhi/uci.rst  |  32 +
  tools/testing/selftests/Makefile   |   1 +
  tools/testing/selftests/drivers/.gitignore |   1 +
  tools/testing/selftests/drivers/mhi/Makefile   |   8 +
  tools/testing/selftests/drivers/mhi/config |   2 +
  .../testing/selftests/drivers/mhi/loopback_test.c  | 802 +
  6 files changed, 846 insertions(+)
  create mode 100644 tools/testing/selftests/drivers/mhi/Makefile
  create mode 100644 tools/testing/selftests/drivers/mhi/config
  create mode 100644 tools/testing/selftests/drivers/mhi/loopback_test.c

diff --git a/Documentation/mhi/uci.rst b/Documentation/mhi/uci.rst
index ce8740e..0a04afe 100644
--- a/Documentation/mhi/uci.rst
+++ b/Documentation/mhi/uci.rst
@@ -79,6 +79,38 @@ MHI client driver performs read operation, same data gets looped back to MHI
  host using LOOPBACK channel 1. LOOPBACK channel is used to verify data path
  and data integrity between MHI Host and MHI device.


Nice.

[..]

+
+enum debug_level {
+    DBG_LVL_VERBOSE,
+    DBG_LVL_INFO,
+    DBG_LVL_ERROR,
+};
+
+enum test_status {
+    TEST_STATUS_SUCCESS,
+    TEST_STATUS_ERROR,
+    TEST_STATUS_NO_DEV,
+};
+


Since you are running this test as part of kselftest, please use
ksft errors and returns.

Are you suggesting that I use the following macros instead of the test_status enum?
#define KSFT_PASS  0
#define KSFT_FAIL  1




+struct lb_test_ctx {
+    char dev_node[256];
+    unsigned char *tx_buff;
+    unsigned char *rx_buff;
+    unsigned int rx_pkt_count;
+    unsigned int tx_pkt_count;
+    int iterations;
+    bool iter_complete;
+    bool comp_complete;
+    bool test_complete;
+    bool all_complete;
+    unsigned long buff_size;
+    long byte_recvd;
+    long byte_sent;
+};
+
+bool force_exit;
+char write_data = 'a';
+int completed_iterations;
+
+struct lb_test_ctx test_ctxt;
+enum debug_level msg_lvl;
+struct pollfd read_fd;
+struct pollfd write_fd;
+enum test_status mhi_test_return_value;
+enum test_status tx_status;
+enum test_status rx_status;
+enum test_status cmp_rxtx_status;
+
+#define test_log(test_msg_lvl, format, ...) do { \
+    if (test_msg_lvl >= msg_lvl) \
+    fprintf(stderr, format, ##__VA_ARGS__); \
+} while (0)
+
+static void loopback_test_sleep_ms(int ms)
+{
+    usleep(1000 * ms);
+}
+


Have you run this as part of a "make kselftest" run? How does this
sleep work in that env.?

It looks like kselftest runs this test application by directly executing
the binary, but the test application requires a valid MHI device file
node as a parameter. Given that requirement, does this test application
qualify to run under kselftest? Without a valid device file node the test
would fail. Is there an option to run this as a standalone test that is
only run when an MHI device file node is present? Having said that, I
tested this driver by directly executing the test binary, which is
compiled using make loopback_test under the mhi dir.


Are there any cases where this test can't run? Those cases need to be
skips.

Yes. This test application cannot run by itself: it needs a valid MHI
device file node to write to, and it reads the same device node to get
the data back. So the test cannot be run without an MHI device connected
over a transport (in my testing the MHI device is connected over PCIe).
Could you please suggest an option to use this test application as a
standalone test instead of being part of kselftest?


thanks,
-- Shuah


Thanks,
Hemant
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
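
One way the "use ksft returns and skip when the test can't run" feedback could be applied is sketched below. This is an illustration, not the posted test: the MODEL_KSFT_* constants mirror the values of KSFT_PASS, KSFT_FAIL and KSFT_SKIP in tools/testing/selftests/kselftest.h and are redefined locally only to keep the snippet self-contained, and probe_dev_node() and status_to_ksft() are hypothetical helpers.

```c
#include <unistd.h>

/* Local mirrors of the kselftest exit codes (see lead-in). */
#define MODEL_KSFT_PASS 0
#define MODEL_KSFT_FAIL 1
#define MODEL_KSFT_SKIP 4

enum test_status {
	TEST_STATUS_SUCCESS,
	TEST_STATUS_ERROR,
	TEST_STATUS_NO_DEV,
};

/* Map the test's internal status onto a kselftest exit code. */
static int status_to_ksft(enum test_status status)
{
	switch (status) {
	case TEST_STATUS_SUCCESS:
		return MODEL_KSFT_PASS;
	case TEST_STATUS_NO_DEV:
		/* No MHI device node present: skip rather than fail. */
		return MODEL_KSFT_SKIP;
	case TEST_STATUS_ERROR:
	default:
		return MODEL_KSFT_FAIL;
	}
}

/* Report NO_DEV when the device node is missing or not accessible. */
static enum test_status probe_dev_node(const char *path)
{
	if (access(path, R_OK | W_OK) != 0)
		return TEST_STATUS_NO_DEV;
	return TEST_STATUS_SUCCESS;
}
```

With a mapping like this, a machine without an MHI device would report a skip in a "make kselftest" run instead of a failure, while a standalone run with a real device node would behave as before.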

[PATCH] spi: amd: Use devm_platform_ioremap_resource() in amd_spi_probe

2020-11-20 Thread Qing Zhang

Simplify this function implementation by using a known wrapper function.

Signed-off-by: Qing Zhang 
---
 drivers/spi/spi-amd.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/spi/spi-amd.c b/drivers/spi/spi-amd.c
index 7f62954..3cf7609 100644
--- a/drivers/spi/spi-amd.c
+++ b/drivers/spi/spi-amd.c
@@ -250,7 +250,6 @@ static int amd_spi_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct spi_master *master;
struct amd_spi *amd_spi;
-   struct resource *res;
int err = 0;
 
/* Allocate storage for spi_master and driver private data */
@@ -261,9 +260,7 @@ static int amd_spi_probe(struct platform_device *pdev)
}
 
amd_spi = spi_master_get_devdata(master);
-
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   amd_spi->io_remap_addr = devm_ioremap_resource(&pdev->dev, res);
+   amd_spi->io_remap_addr = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(amd_spi->io_remap_addr)) {
err = PTR_ERR(amd_spi->io_remap_addr);
dev_err(dev, "error %d ioremap of SPI registers failed\n", err);
-- 
2.1.0

Re: [PATCH v2 3/3] remoteproc: wkup_m3: Set deny_sysfs_ops flag

2020-11-20 Thread Bjorn Andersson

On Fri 20 Nov 21:01 CST 2020, Suman Anna wrote:

> The Wakeup M3 remote processor is controlled by the wkup_m3_ipc
> client driver, so set the newly introduced 'deny_sysfs_ops' flag
> to not allow any overriding of the remoteproc firmware or state
> from userspace.
> 

Reviewed-by: Bjorn Andersson 

> Signed-off-by: Suman Anna 
> ---
> v2: rebased version, no code changes, patch title adjusted slightly
> v1: https://patchwork.kernel.org/project/linux-remoteproc/patch/20180915003725.17549-6-s-a...@ti.com/
> 
>  drivers/remoteproc/wkup_m3_rproc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/remoteproc/wkup_m3_rproc.c b/drivers/remoteproc/wkup_m3_rproc.c
> index b9349d684258..d92d7f32ba8d 100644
> --- a/drivers/remoteproc/wkup_m3_rproc.c
> +++ b/drivers/remoteproc/wkup_m3_rproc.c
> @@ -160,6 +160,7 @@ static int wkup_m3_rproc_probe(struct platform_device *pdev)
>   }
>  
>   rproc->auto_boot = false;
> + rproc->deny_sysfs_ops = true;
>  
>   wkupm3 = rproc->priv;
>   wkupm3->rproc = rproc;
> -- 
> 2.28.0
>

Re: [PATCH v2 2/3] remoteproc: Introduce deny_sysfs_ops flag

2020-11-20 Thread Bjorn Andersson

On Fri 20 Nov 21:01 CST 2020, Suman Anna wrote:

> The remoteproc framework provides sysfs interfaces for changing
> the firmware name and for starting/stopping a remote processor
> through the sysfs files 'state' and 'firmware'. The 'recovery'
> sysfs file can also be used similarly to control the error recovery
> state machine of a remoteproc. These interfaces are currently
> allowed irrespective of how the remoteprocs were booted (like
> remoteproc self auto-boot, remoteproc client-driven boot etc).
> These interfaces can adversely affect a remoteproc and its clients
> especially when a remoteproc is being controlled by a remoteproc
> client driver(s). Also, not all remoteproc drivers may want to
> support the sysfs interfaces by default.
> 
> Add support to deny the sysfs state/firmware/recovery change by
> introducing a state flag 'deny_sysfs_ops' that the individual
> remoteproc drivers can set based on their usage needs. The default
> behavior is to allow the sysfs operations as before.
> 

This makes sense, but can't we implement attribute_group->is_visible to
simply hide these entries from userspace instead of leaving them
"broken"?

Regards,
Bjorn

> Signed-off-by: Suman Anna 
> ---
> v2: revised to account for the 'recovery' sysfs file as well, patch
> description updated accordingly
> v1: https://patchwork.kernel.org/project/linux-remoteproc/patch/20180915003725.17549-5-s-a...@ti.com/
> 
>  drivers/remoteproc/remoteproc_sysfs.c | 12 
>  include/linux/remoteproc.h|  2 ++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/drivers/remoteproc/remoteproc_sysfs.c b/drivers/remoteproc/remoteproc_sysfs.c
> index bd2950a246c9..3fd18a71c188 100644
> --- a/drivers/remoteproc/remoteproc_sysfs.c
> +++ b/drivers/remoteproc/remoteproc_sysfs.c
> @@ -49,6 +49,10 @@ static ssize_t recovery_store(struct device *dev,
>  {
>   struct rproc *rproc = to_rproc(dev);
>  
> + /* restrict sysfs operations if not allowed by remoteproc drivers */
> + if (rproc->deny_sysfs_ops)
> + return -EPERM;
> +
>   if (sysfs_streq(buf, "enabled")) {
>   /* change the flag and begin the recovery process if needed */
>   rproc->recovery_disabled = false;
> @@ -158,6 +162,10 @@ static ssize_t firmware_store(struct device *dev,
>   char *p;
>   int err, len = count;
>  
> + /* restrict sysfs operations if not allowed by remoteproc drivers */
> + if (rproc->deny_sysfs_ops)
> + return -EPERM;
> +
>   err = mutex_lock_interruptible(&rproc->lock);
>   if (err) {
>   dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, err);
> @@ -225,6 +233,10 @@ static ssize_t state_store(struct device *dev,
>   struct rproc *rproc = to_rproc(dev);
>   int ret = 0;
>  
> + /* restrict sysfs operations if not allowed by remoteproc drivers */
> + if (rproc->deny_sysfs_ops)
> + return -EPERM;
> +
>   if (sysfs_streq(buf, "start")) {
>   if (rproc->state == RPROC_RUNNING)
>   return -EBUSY;
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index 3fa3ba6498e8..dbc3767f7d0e 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -508,6 +508,7 @@ struct rproc_dump_segment {
>   * @has_iommu: flag to indicate if remote processor is behind an MMU
>   * @auto_boot: flag to indicate if remote processor should be auto-started
>   * @autonomous: true if an external entity has booted the remote processor
> + * @deny_sysfs_ops: flag to not permit sysfs operations on state, firmware and recovery
>   * @dump_segments: list of segments in the firmware
>   * @nb_vdev: number of vdev currently handled by rproc
>   * @char_dev: character device of the rproc
> @@ -545,6 +546,7 @@ struct rproc {
>   bool has_iommu;
>   bool auto_boot;
>   bool autonomous;
> + bool deny_sysfs_ops;
>   struct list_head dump_segments;
>   int nb_vdev;
>   u8 elf_class;
> -- 
> 2.28.0
>
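
The is_visible alternative raised above can be modelled in userspace as follows. In the kernel, an attribute_group's is_visible callback returns the mode to expose for an attribute, or 0 to hide it; the sketch keeps only that contract. struct vis_attr, struct vis_rproc and vis_is_visible() are hypothetical names, and applying the check uniformly to every attribute is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a sysfs attribute: a name and a file mode. */
struct vis_attr {
	const char *name;
	unsigned short mode;
};

/* Hypothetical, cut-down stand-in for struct rproc. */
struct vis_rproc {
	bool deny_sysfs_ops;
};

/*
 * Model of an is_visible callback: return the mode the attribute
 * should be created with, or 0 to hide it from userspace entirely.
 */
static unsigned short vis_is_visible(const struct vis_rproc *rproc,
				     const struct vis_attr *attr)
{
	if (rproc->deny_sysfs_ops)
		return 0;

	return attr->mode;
}
```

A variant that clears only the write bits of attr->mode, rather than returning 0, would keep the files readable while still refusing writes; which of the two fits better is a design choice this sketch does not settle.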

Re: [PATCH] remoteproc: Add a rproc_set_firmware() API

2020-11-20 Thread Bjorn Andersson

On Fri 20 Nov 21:20 CST 2020, Suman Anna wrote:

> A new API, rproc_set_firmware() is added to allow the remoteproc platform
> drivers and remoteproc client drivers to be able to configure a custom
> firmware name that is different from the default name used during
> remoteproc registration. This function is being introduced to provide
> a kernel-level equivalent of the current sysfs interface to remoteproc
> client drivers, and can only change firmwares when the remoteproc is
> offline. This allows some remoteproc drivers to choose different firmwares
> at runtime based on the functionality the remote processor is providing.
> The TI PRU Ethernet driver will be an example of such usage as it
> requires to use different firmwares for different supported protocols.
> 
> Also, update the firmware_store() function used by the sysfs interface
> to reuse this function to avoid code duplication.
> 
> Signed-off-by: Suman Anna 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn
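
For readers who want to try the behaviour described in the commit message without a kernel tree, here is a minimal userspace model of the same checks: reject while not offline, strip a trailing newline, refuse an empty name, then swap in a private copy. All fw_* names are hypothetical, locking is omitted, malloc()/free() stand in for kstrndup()/kfree(), and "pru-eth.elf" in the usage note is a made-up firmware name.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum fw_state { FW_OFFLINE, FW_RUNNING };

/* Hypothetical, cut-down stand-in for struct rproc. */
struct fw_rproc {
	enum fw_state state;
	char *firmware;
};

/* Model of the name-replacement logic; the mutex is left out. */
static int fw_set_firmware(struct fw_rproc *rproc, const char *fw_name)
{
	size_t len;
	char *p;

	if (!rproc || !fw_name)
		return -EINVAL;

	/* The name may only change while the processor is offline. */
	if (rproc->state != FW_OFFLINE)
		return -EBUSY;

	/* Ignore everything from the first newline on, as sysfs input has one. */
	len = strcspn(fw_name, "\n");
	if (!len)
		return -EINVAL;

	p = malloc(len + 1);
	if (!p)
		return -ENOMEM;
	memcpy(p, fw_name, len);
	p[len] = '\0';

	/* Replace the old name only after the copy has succeeded. */
	free(rproc->firmware);
	rproc->firmware = p;
	return 0;
}
```

Calling fw_set_firmware(&r, "pru-eth.elf\n") on an offline instance stores "pru-eth.elf"; the same call on a running instance returns -EBUSY and leaves the stored name unchanged.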

> ---
>  drivers/remoteproc/remoteproc_core.c  | 63 +++
>  drivers/remoteproc/remoteproc_sysfs.c | 33 +-
>  include/linux/remoteproc.h|  1 +
>  3 files changed, 66 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index dab2c0f5caf0..46c2937ebea9 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1934,6 +1934,69 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
>  #endif
>  EXPORT_SYMBOL(rproc_get_by_phandle);
>  
> +/**
> + * rproc_set_firmware() - assign a new firmware
> + * @rproc: rproc handle to which the new firmware is being assigned
> + * @fw_name: new firmware name to be assigned
> + *
> + * This function allows remoteproc drivers or clients to configure a custom
> + * firmware name that is different from the default name used during remoteproc
> + * registration. The function does not trigger a remote processor boot,
> + * only sets the firmware name used for a subsequent boot. This function
> + * should also be called only when the remote processor is offline.
> + *
> + * This allows either the userspace to configure a different name through
> + * sysfs or a kernel-level remoteproc or a remoteproc client driver to set
> + * a specific firmware when it is controlling the boot and shutdown of the
> + * remote processor.
> + *
> + * Return: 0 on success or a negative value upon failure
> + */
> +int rproc_set_firmware(struct rproc *rproc, const char *fw_name)
> +{
> + struct device *dev;
> + int ret, len;
> + char *p;
> +
> + if (!rproc || !fw_name)
> + return -EINVAL;
> +
> + dev = rproc->dev.parent;
> +
> + ret = mutex_lock_interruptible(&rproc->lock);
> + if (ret) {
> + dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> + return -EINVAL;
> + }
> +
> + if (rproc->state != RPROC_OFFLINE) {
> + dev_err(dev, "can't change firmware while running\n");
> + ret = -EBUSY;
> + goto out;
> + }
> +
> + len = strcspn(fw_name, "\n");
> + if (!len) {
> + dev_err(dev, "can't provide empty string for firmware name\n");
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + p = kstrndup(fw_name, len, GFP_KERNEL);
> + if (!p) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + kfree(rproc->firmware);
> + rproc->firmware = p;
> +
> +out:
> + mutex_unlock(&rproc->lock);
> + return ret;
> +}
> +EXPORT_SYMBOL(rproc_set_firmware);
> +
>  static int rproc_validate(struct rproc *rproc)
>  {
>   switch (rproc->state) {
> diff --git a/drivers/remoteproc/remoteproc_sysfs.c b/drivers/remoteproc/remoteproc_sysfs.c
> index 3fd18a71c188..cf846caf2e1a 100644
> --- a/drivers/remoteproc/remoteproc_sysfs.c
> +++ b/drivers/remoteproc/remoteproc_sysfs.c
> @@ -159,42 +159,13 @@ static ssize_t firmware_store(struct device *dev,
> const char *buf, size_t count)
>  {
>   struct rproc *rproc = to_rproc(dev);
> - char *p;
> - int err, len = count;
> + int err;
>  
>   /* restrict sysfs operations if not allowed by remoteproc drivers */
>   if (rproc->deny_sysfs_ops)
>   return -EPERM;
>  
> - err = mutex_lock_interruptible(&rproc->lock);
> - if (err) {
> - dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, err);
> - return -EINVAL;
> - }
> -
> - if (rproc->state != RPROC_OFFLINE) {
> - dev_err(dev, "can't change firmware while running\n");
> - err = -EBUSY;
> - goto out;
> - }
> -
> - len = strcspn(buf, "\n");
> - if (!len) {
> - dev_err(dev, "can't provide a NULL firmware\n");
> - err = -EINVAL;
> - goto out;
> - }
> -
> - p = kstrndup(buf, len, GFP_KERNEL);
> - if (!p) {
> - err =

Re: [PATCH v3 4/4] Documentation/admin-guide: Change doc for split_lock_detect parameter

2020-11-20 Thread Randy Dunlap

Hi--

On 11/20/20 6:36 PM, Fenghua Yu wrote:
> + ratelimit:N -
> +   Set rate limit to N bus locks per second
> +   for bus lock detection. 0 < N <= HZ/2 and
> +   N is approximate. Only applied to non-root
> +   users.

Sorry, but I don't know what this means. I think it's the "and N is appropriate"
that is confusing me.

0 < N <= HZ/2 and N is appropriate.

-- 
~Randy
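
As a concrete reading of the "0 < N <= HZ/2" bound quoted above, the following userspace sketch parses a "ratelimit:N" option and applies that range check. It is not the kernel's parser: parse_ratelimit() is a hypothetical helper, and MODEL_HZ is fixed at 250 purely as an assumption, since the real HZ depends on the kernel configuration.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed tick rate for illustration only; real HZ is config-dependent. */
#define MODEL_HZ 250

/*
 * Parse "ratelimit:N" and accept it only if 0 < N <= MODEL_HZ / 2.
 * Returns 0 and stores N on success, -1 on any malformed or
 * out-of-range input.
 */
static int parse_ratelimit(const char *arg, long *n)
{
	static const char prefix[] = "ratelimit:";
	char *end;
	long val;

	if (strncmp(arg, prefix, sizeof(prefix) - 1) != 0)
		return -1;
	arg += sizeof(prefix) - 1;
	if (*arg == '\0')
		return -1;

	val = strtol(arg, &end, 10);
	if (*end != '\0')
		return -1;
	if (val <= 0 || val > MODEL_HZ / 2)
		return -1;

	*n = val;
	return 0;
}
```

With MODEL_HZ at 250 the largest accepted value is 125, so "ratelimit:125" parses and "ratelimit:126" is rejected.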

Re: [PATCH net-next 4/6] net: ipa: support retries on generic GSI commands

2020-11-20 Thread Alex Elder

On 11/20/20 8:49 PM, Jakub Kicinski wrote:
> On Thu, 19 Nov 2020 16:49:27 -0600 Alex Elder wrote:
>> +do
>> +ret = gsi_generic_command(gsi, channel_id,
>> +  GSI_GENERIC_HALT_CHANNEL);
>> +while (ret == -EAGAIN && retries--);
> 
> This may well be the first time I've seen someone write a do while loop
> without the curly brackets!

I had them at one time, then saw I could get away
without them.  I don't have a preference but I see
you accepted it as-is.

I really appreciate your timely responses.

-Alex

Re: [PATCH] RISC-V: fix barrier() use in

2020-11-20 Thread Palmer Dabbelt


On Mon, 16 Nov 2020 17:39:51 PST (-0800), rdun...@infradead.org wrote:

riscv's  uses barrier() so it should
#include  to prevent build errors.

Fixes this build error:
  CC [M]  drivers/net/ethernet/emulex/benet/be_main.o
In file included from ./include/vdso/processor.h:10,
 from ./arch/riscv/include/asm/processor.h:11,
 from ./include/linux/prefetch.h:15,
 from drivers/net/ethernet/emulex/benet/be_main.c:14:
./arch/riscv/include/asm/vdso/processor.h: In function 'cpu_relax':
./arch/riscv/include/asm/vdso/processor.h:14:2: error: implicit declaration of function 'barrier' [-Werror=implicit-function-declaration]
   14 |  barrier();

This happens with a total of 5 networking drivers -- they all use
<linux/prefetch.h>.

rv64 allmodconfig now builds cleanly after this patch.

Fixes fallout from:
815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")

Fixes: ad5d1122b82f ("riscv: use vDSO common flow to reduce the latency of the time-related functions")
Reported-by: Andreas Schwab 
Signed-off-by: Randy Dunlap 
Cc: Andrew Morton 
Cc: Stephen Rothwell 
Cc: Arvind Sankar 
Cc: linux-ri...@lists.infradead.org
Cc: clang-built-li...@googlegroups.com
Cc: Nick Desaulniers 
Cc: Nathan Chancellor 
Cc: Paul Walmsley 
Cc: Palmer Dabbelt 
Cc: Albert Ou 
---
 arch/riscv/include/asm/vdso/processor.h |2 ++
 1 file changed, 2 insertions(+)

--- lnx-510-rc4.orig/arch/riscv/include/asm/vdso/processor.h
+++ lnx-510-rc4/arch/riscv/include/asm/vdso/processor.h
@@ -4,6 +4,8 @@

 #ifndef __ASSEMBLY__

+#include 
+
 static inline void cpu_relax(void)
 {
 #ifdef __riscv_muldiv


Thanks, this is on fixes.

[PATCH] remoteproc: Add a rproc_set_firmware() API

2020-11-20 Thread Suman Anna

A new API, rproc_set_firmware() is added to allow the remoteproc platform
drivers and remoteproc client drivers to be able to configure a custom
firmware name that is different from the default name used during
remoteproc registration. This function is being introduced to provide
a kernel-level equivalent of the current sysfs interface to remoteproc
client drivers, and can only change firmwares when the remoteproc is
offline. This allows some remoteproc drivers to choose different firmwares
at runtime based on the functionality the remote processor is providing.
The TI PRU Ethernet driver will be an example of such usage, as it
requires different firmwares for different supported protocols.

Also, update the firmware_store() function used by the sysfs interface
to reuse this function to avoid code duplication.

Signed-off-by: Suman Anna 
---
 drivers/remoteproc/remoteproc_core.c  | 63 +++
 drivers/remoteproc/remoteproc_sysfs.c | 33 +-
 include/linux/remoteproc.h|  1 +
 3 files changed, 66 insertions(+), 31 deletions(-)

diff --git a/drivers/remoteproc/remoteproc_core.c 
b/drivers/remoteproc/remoteproc_core.c
index dab2c0f5caf0..46c2937ebea9 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1934,6 +1934,69 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
 #endif
 EXPORT_SYMBOL(rproc_get_by_phandle);
 
+/**
+ * rproc_set_firmware() - assign a new firmware
+ * @rproc: rproc handle to which the new firmware is being assigned
+ * @fw_name: new firmware name to be assigned
+ *
+ * This function allows remoteproc drivers or clients to configure a custom
+ * firmware name that is different from the default name used during remoteproc
+ * registration. The function does not trigger a remote processor boot,
+ * only sets the firmware name used for a subsequent boot. This function
+ * should also be called only when the remote processor is offline.
+ *
+ * This allows either the userspace to configure a different name through
+ * sysfs or a kernel-level remoteproc or a remoteproc client driver to set
+ * a specific firmware when it is controlling the boot and shutdown of the
+ * remote processor.
+ *
+ * Return: 0 on success or a negative value upon failure
+ */
+int rproc_set_firmware(struct rproc *rproc, const char *fw_name)
+{
+   struct device *dev;
+   int ret, len;
+   char *p;
+
+   if (!rproc || !fw_name)
+   return -EINVAL;
+
+   dev = rproc->dev.parent;
+
+   ret = mutex_lock_interruptible(&rproc->lock);
+   if (ret) {
+   dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
+   return -EINVAL;
+   }
+
+   if (rproc->state != RPROC_OFFLINE) {
+   dev_err(dev, "can't change firmware while running\n");
+   ret = -EBUSY;
+   goto out;
+   }
+
+   len = strcspn(fw_name, "\n");
+   if (!len) {
+   dev_err(dev, "can't provide empty string for firmware name\n");
+   ret = -EINVAL;
+   goto out;
+   }
+
+   p = kstrndup(fw_name, len, GFP_KERNEL);
+   if (!p) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   kfree(rproc->firmware);
+   rproc->firmware = p;
+
+out:
+   mutex_unlock(&rproc->lock);
+   return ret;
+}
+EXPORT_SYMBOL(rproc_set_firmware);
+
 static int rproc_validate(struct rproc *rproc)
 {
switch (rproc->state) {
diff --git a/drivers/remoteproc/remoteproc_sysfs.c 
b/drivers/remoteproc/remoteproc_sysfs.c
index 3fd18a71c188..cf846caf2e1a 100644
--- a/drivers/remoteproc/remoteproc_sysfs.c
+++ b/drivers/remoteproc/remoteproc_sysfs.c
@@ -159,42 +159,13 @@ static ssize_t firmware_store(struct device *dev,
  const char *buf, size_t count)
 {
struct rproc *rproc = to_rproc(dev);
-   char *p;
-   int err, len = count;
+   int err;
 
/* restrict sysfs operations if not allowed by remoteproc drivers */
if (rproc->deny_sysfs_ops)
return -EPERM;
 
-   err = mutex_lock_interruptible(&rproc->lock);
-   if (err) {
-   dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, err);
-   return -EINVAL;
-   }
-
-   if (rproc->state != RPROC_OFFLINE) {
-   dev_err(dev, "can't change firmware while running\n");
-   err = -EBUSY;
-   goto out;
-   }
-
-   len = strcspn(buf, "\n");
-   if (!len) {
-   dev_err(dev, "can't provide a NULL firmware\n");
-   err = -EINVAL;
-   goto out;
-   }
-
-   p = kstrndup(buf, len, GFP_KERNEL);
-   if (!p) {
-   err = -ENOMEM;
-   goto out;
-   }
-
-   kfree(rproc->firmware);
-   rproc->firmware = p;
-out:
-   mutex_unlock(&rproc->lock);
+   err = rproc_set_firmware(rproc, buf);
 
return err ? err : count;
 }
diff --git
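
The name handling in rproc_set_firmware() (reject a NULL or empty name, stop at the first newline, refuse the change unless the processor is offline, swap in a fresh copy) can be exercised outside the kernel. The following is a minimal user-space sketch under stated assumptions: fake_rproc and fake_set_firmware() are hypothetical, malloc()/memcpy() stand in for kstrndup(), and locking is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for the relevant fields of struct rproc. */
struct fake_rproc {
	int offline;		/* non-zero when the processor is offline */
	char *firmware;		/* heap-allocated current firmware name */
};

static int fake_set_firmware(struct fake_rproc *rproc, const char *fw_name)
{
	size_t len;
	char *p;

	if (!rproc || !fw_name)
		return -EINVAL;
	if (!rproc->offline)
		return -EBUSY;	/* firmware may only change while offline */

	/* Keep everything up to, but not including, the first newline. */
	len = strcspn(fw_name, "\n");
	if (!len)
		return -EINVAL;

	p = malloc(len + 1);
	if (!p)
		return -ENOMEM;
	memcpy(p, fw_name, len);
	p[len] = '\0';

	/* Replace the old name only after the new copy exists. */
	free(rproc->firmware);
	rproc->firmware = p;
	return 0;
}
```

Allocating the copy before freeing the old string means a failed allocation leaves the previous firmware name untouched, which matches the ordering in the patch.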

Re: [PATCH v12 5/5] selftest: mhi: Add support to test MHI LOOPBACK channel

2020-11-20 Thread Hemant Kumar




Hi Mani,
On 11/19/20 10:10 PM, Manivannan Sadhasivam wrote:

On Mon, Nov 16, 2020 at 02:46:22PM -0800, Hemant Kumar wrote:

Loopback test opens the MHI device file node and writes
a data buffer to it. MHI UCI kernel space driver copies
the data and sends it to MHI uplink (Tx) LOOPBACK channel.
MHI device loops back the same data to MHI downlink (Rx)
LOOPBACK channel. This data is read by test application
and compared against the data sent. Test passes if data
buffer matches between Tx and Rx. Test application performs
open(), poll(), write(), read() and close() file operations.

Signed-off-by: Hemant Kumar 


One nitpick below, with that addressed:

Reviewed-by: Manivannan Sadhasivam 

[..]


Effectively this function does both parse and run, so it should be called
"loopback_test_parse_run", or the pthread creation should be moved here.

Done.

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH v7 00/17] Add support for Clang LTO

2020-11-20 Thread Nathan Chancellor

On Fri, Nov 20, 2020 at 11:29:51AM +0100, Ard Biesheuvel wrote:
> On Thu, 19 Nov 2020 at 00:42, Nick Desaulniers  
> wrote:
> >
> > On Wed, Nov 18, 2020 at 2:07 PM Sami Tolvanen  
> > wrote:
> > >
> > > This patch series adds support for building the kernel with Clang's
> > > Link Time Optimization (LTO). In addition to performance, the primary
> > > motivation for LTO is to allow Clang's Control-Flow Integrity (CFI) to
> > > be used in the kernel. Google has shipped millions of Pixel devices
> > > running three major kernel versions with LTO+CFI since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM bitcode,
> > > which Clang produces with LTO instead of ELF object files, postponing
> > > ELF processing until a later stage, and ensuring initcall ordering.
> > >
> > > Note that v7 brings back arm64 support as Will has now staged the
> > > prerequisite memory ordering patches [1], and drops x86_64 while we work
> > > on fixing the remaining objtool warnings [2].
> > >
> > > [1] 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/lto
> > > [2] https://lore.kernel.org/lkml/20201114004911.aip52eimk6c2uxd4@treble/
> > >
> > > You can also pull this series from
> > >
> > >   https://github.com/samitolvanen/linux.git lto-v7
> >
> > Thanks for continuing to drive this series Sami.  For the series,
> >
> > Tested-by: Nick Desaulniers 
> >
> > I did virtualized boot tests with the series applied to aarch64
> > defconfig without CONFIG_LTO, with CONFIG_LTO_CLANG, and a third time
> > with CONFIG_THINLTO.  If you make changes to the series in follow ups,
> > please drop my tested by tag from the modified patches and I'll help
> > re-test.  Some minor feedback on the Kconfig change, but I'll post it
> > off of that patch.
> >
> 
> When you say 'virtualized" do you mean QEMU on x86? Or actual
> virtualization on an AArch64 KVM host?
> 
> The distinction is important here, given the potential impact of LTO
> on things that QEMU simply does not model when it runs in TCG mode on
> a foreign host architecture.

I have booted this series on my Raspberry Pi 4 (ARCH=arm64 defconfig).

$ uname -r
5.10.0-rc4-00108-g830200082c74

$ zgrep LTO /proc/config.gz
CONFIG_LTO=y
CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
CONFIG_ARCH_SUPPORTS_THINLTO=y
CONFIG_THINLTO=y
# CONFIG_LTO_NONE is not set
CONFIG_LTO_CLANG=y
# CONFIG_HID_WALTOP is not set

and I have taken that same kernel and booted it under QEMU with
'-enable-kvm' without any visible issues.

I have tested four combinations:

clang 12 @ f9f0a4046e11c2b4c130640f343e3b2b5db08c1:
* CONFIG_THINLTO=y
* CONFIG_THINLTO=n

clang 11.0.0
* CONFIG_THINLTO=y
* CONFIG_THINLTO=n

Tested-by: Nathan Chancellor 

Cheers,
Nathan

Re: [GIT PULL 1/1] bcm2835-dt-next-2020-11-20

2020-11-20 Thread Florian Fainelli




On 11/20/2020 8:52 AM, Nicolas Saenz Julienne wrote:
> Hi Florian,
> 
> The following changes since commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec:
> 
>   Linux 5.10-rc1 (2020-10-25 15:14:11 -0700)
> 
> are available in the Git repository at:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/nsaenz/linux-rpi.git 
> tags/bcm2835-dt-next-2020-11-20
> 
> for you to fetch changes up to 278407a53c3b33fb820332c4d39eb39316c3879a:
> 
>   ARM: dts: bcm283x: increase dwc2's RX FIFO size (2020-11-20 17:43:10 +0100)
> 
> 
> - Maxime introduces a quirk to avoid EMI between WiFi and HDMI@1440p on
> RPi4b.
> 
> - Pavel fixes dwc2's fifo size to properly support isochronous
> transfers.
> 
> 

Merged into devicetree/next, thanks Nicolas!
-- 
Florian

Re: [PATCH v2] mdio_bus: suppress err message for reset gpio EPROBE_DEFER

2020-11-20 Thread Jakub Kicinski

On Thu, 19 Nov 2020 22:34:46 +0200 Grygorii Strashko wrote:
> The mdio_bus may depend on a GPIO controller and so get deferred.
> Currently it prints an error message every time -EPROBE_DEFER is
> returned from:
> __mdiobus_register()
>  |-devm_gpiod_get_optional()
> without actually identifying the error code.
> 
> "mdio_bus 4a101000.mdio: mii_bus 4a101000.mdio couldn't get reset GPIO"
> 
> Hence, suppress error message for devm_gpiod_get_optional() returning
> -EPROBE_DEFER case by using dev_err_probe().
> 
> Signed-off-by: Grygorii Strashko 

Applied (with the line wrapped), thanks!
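
The idea behind the change (stay quiet when the failure is only a probe deferral, but still hand the error code back to the caller) can be sketched in user space. This is an illustration, not the kernel helper: sketch_err_probe() and messages_logged are hypothetical names, and the real dev_err_probe() does more than this sketch, so treat it only as a model of the "suppress on defer" behaviour.

```c
#include <assert.h>
#include <stdio.h>

/* The kernel defines EPROBE_DEFER as 517; user space has no such errno. */
#define SKETCH_EPROBE_DEFER 517

/* Counts how many error messages were actually printed. */
static int messages_logged;

/*
 * Hypothetical analogue of the dev_err_probe() idea: report real errors,
 * stay quiet for a deferral, and return the error code so the caller can
 * write "return sketch_err_probe(...)".
 */
static int sketch_err_probe(const char *dev_name, int err, const char *msg)
{
	if (err != -SKETCH_EPROBE_DEFER) {
		fprintf(stderr, "%s: error %d: %s\n", dev_name, err, msg);
		messages_logged++;
	}
	return err;
}
```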

[PATCH v3 4/5] perf metric: Add utilities to work on ids map.

2020-11-20 Thread Ian Rogers

Add utilities to new/free an ids hashmap, as well as to union. Add
testing of the union. Unioning hashmaps will be used when parsing the
metric, if a value is known then the hashmap is unnecessary, otherwise
we need to union together all the event ids to compute their values for
reporting.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/expr.c | 47 ++
 tools/perf/util/expr.c  | 87 +++--
 tools/perf/util/expr.h  |  9 +
 3 files changed, 139 insertions(+), 4 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 7ccb97c73347..1c881bea7fca 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -6,6 +6,51 @@
 #include 
 #include 
 
+static int test_ids_union(void)
+{
+   struct hashmap *ids1, *ids2;
+
+   /* Empty union. */
+   ids1 = ids__new();
+   TEST_ASSERT_VAL("ids__new", ids1);
+   ids2 = ids__new();
+   TEST_ASSERT_VAL("ids__new", ids2);
+
+   ids1 = ids__union(ids1, ids2);
+   TEST_ASSERT_EQUAL("union", (int)hashmap__size(ids1), 0);
+
+   /* Union {foo, bar} against {}. */
+   ids2 = ids__new();
+   TEST_ASSERT_VAL("ids__new", ids2);
+
+   TEST_ASSERT_EQUAL("ids__insert", ids__insert(ids1, strdup("foo"), 
NULL), 0);
+   TEST_ASSERT_EQUAL("ids__insert", ids__insert(ids1, strdup("bar"), 
NULL), 0);
+
+   ids1 = ids__union(ids1, ids2);
+   TEST_ASSERT_EQUAL("union", (int)hashmap__size(ids1), 2);
+
+   /* Union {foo, bar} against {foo}. */
+   ids2 = ids__new();
+   TEST_ASSERT_VAL("ids__new", ids2);
+   TEST_ASSERT_EQUAL("ids__insert", ids__insert(ids2, strdup("foo"), 
NULL), 0);
+
+   ids1 = ids__union(ids1, ids2);
+   TEST_ASSERT_EQUAL("union", (int)hashmap__size(ids1), 2);
+
+   /* Union {foo, bar} against {bar,baz}. */
+   ids2 = ids__new();
+   TEST_ASSERT_VAL("ids__new", ids2);
+   TEST_ASSERT_EQUAL("ids__insert", ids__insert(ids2, strdup("bar"), 
NULL), 0);
+   TEST_ASSERT_EQUAL("ids__insert", ids__insert(ids2, strdup("baz"), 
NULL), 0);
+
+   ids1 = ids__union(ids1, ids2);
+   TEST_ASSERT_EQUAL("union", (int)hashmap__size(ids1), 3);
+
+   ids__free(ids1);
+
+   return 0;
+}
+
 static int test(struct expr_parse_ctx *ctx, const char *e, double val2)
 {
double val;
@@ -24,6 +69,8 @@ int test__expr(struct test *t __maybe_unused, int subtest 
__maybe_unused)
int ret;
struct expr_parse_ctx *ctx;
 
+   TEST_ASSERT_EQUAL("ids_union", test_ids_union(), 0);
+
ctx = expr__ctx_new();
TEST_ASSERT_VAL("expr__ctx_new", ctx);
expr__add_id_val(ctx, strdup("FOO"), 1);
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index a248d14882cc..1adb6cd202e0 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -59,8 +59,48 @@ static bool key_equal(const void *key1, const void *key2,
return !strcmp((const char *)key1, (const char *)key2);
 }
 
-/* Caller must make sure id is allocated */
-int expr__add_id(struct expr_parse_ctx *ctx, const char *id)
+struct hashmap *ids__new(void)
+{
+   return hashmap__new(key_hash, key_equal, NULL);
+}
+
+void ids__free(struct hashmap *ids)
+{
+   struct hashmap_entry *cur;
+   size_t bkt;
+
+   if (ids == NULL)
+   return;
+
+#ifdef PARSER_DEBUG
+   fprintf(stderr, "freeing ids: ");
+   ids__print(ids);
+   fprintf(stderr, "\n");
+#endif
+
+   hashmap__for_each_entry(ids, cur, bkt) {
+   free((char *)cur->key);
+   free(cur->value);
+   }
+
+   hashmap__free(ids);
+}
+
+void ids__print(struct hashmap *ids)
+{
+   size_t bkt;
+   struct hashmap_entry *cur;
+
+   if (!ids)
+   return;
+
+   hashmap__for_each_entry(ids, cur, bkt) {
+   fprintf(stderr, "key:%s, ", (const char *)cur->key);
+   }
+}
+
+int ids__insert(struct hashmap *ids, const char *id,
+   struct expr_id *parent)
 {
struct expr_id_data *data_ptr = NULL, *old_data = NULL;
char *old_key = NULL;
@@ -70,10 +110,10 @@ int expr__add_id(struct expr_parse_ctx *ctx, const char 
*id)
if (!data_ptr)
return -ENOMEM;
 
-   data_ptr->parent = ctx->parent;
+   data_ptr->parent = parent;
data_ptr->kind = EXPR_ID_DATA__PARENT;
 
-   ret = hashmap__set(ctx->ids, id, data_ptr,
+   ret = hashmap__set(ids, id, data_ptr,
   (const void **)&old_key, (void **)&old_data);
if (ret)
free(data_ptr);
@@ -82,6 +122,45 @@ int expr__add_id(struct expr_parse_ctx *ctx, const char *id)
return ret;
 }
 
+struct hashmap *ids__union(struct hashmap *ids1, struct hashmap *ids2)
+{
+   size_t bkt;
+   struct hashmap_entry *cur;
+   int ret;
+   struct expr_id_data *old_data = NULL;
+   char *old_key = NULL;
+
+   if (!ids1)
+   return ids2;
+
+   if (!ids2)
+   return
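
The union semantics exercised by test_ids_union() ({foo, bar} united with {bar, baz} has three members, and the second set is consumed) can be modelled without the perf hashmap. The sketch below is an assumption-laden stand-in: struct id_set and its helpers are hypothetical, use a linear-search array instead of a hashmap, and do not carry the per-id data that ids__insert() attaches.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal set of owned strings (linear search, not a hashmap). */
struct id_set {
	char **ids;
	size_t n;
};

/* Portable replacement for strdup(), which is not part of ISO C99. */
static char *dup_str(const char *s)
{
	size_t len = strlen(s) + 1;
	char *p = malloc(len);

	if (p)
		memcpy(p, s, len);
	return p;
}

static int id_set_contains(const struct id_set *s, const char *id)
{
	for (size_t i = 0; i < s->n; i++)
		if (!strcmp(s->ids[i], id))
			return 1;
	return 0;
}

/* Takes ownership of 'id'; a duplicate is freed rather than stored twice. */
static int id_set_insert(struct id_set *s, char *id)
{
	char **grown;

	if (!id)
		return -1;
	if (id_set_contains(s, id)) {
		free(id);
		return 0;
	}
	grown = realloc(s->ids, (s->n + 1) * sizeof(*grown));
	if (!grown) {
		free(id);
		return -1;
	}
	s->ids = grown;
	s->ids[s->n++] = id;
	return 0;
}

/* Moves every id of 'src' into 'dst' and leaves 'src' empty. */
static int id_set_union(struct id_set *dst, struct id_set *src)
{
	int ret = 0;

	for (size_t i = 0; i < src->n; i++)
		if (id_set_insert(dst, src->ids[i]))
			ret = -1;
	free(src->ids);
	src->ids = NULL;
	src->n = 0;
	return ret;
}

static void id_set_free(struct id_set *s)
{
	for (size_t i = 0; i < s->n; i++)
		free(s->ids[i]);
	free(s->ids);
	s->ids = NULL;
	s->n = 0;
}
```

As in the test above, the destination keeps ownership of every surviving string, so only the result needs to be freed at the end.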

[PATCH v3 5/5] perf metric: Don't compute unused events.

2020-11-20 Thread Ian Rogers

For a metric like:
  EVENT1 if #smt_on else EVENT2

currently EVENT1 and EVENT2 will be measured and then when the metric is
reported EVENT1 or EVENT2 will be printed depending on the value from
smt_on() during the expr parsing. Computing both events is unnecessary and
can lead to multiplexing as discussed in this thread:
https://lore.kernel.org/lkml/20201110100346.2527031-1-irog...@google.com/

This change modifies the expression parsing code by:
 - getting rid of the "other" parsing and introducing a boolean argument
   to say whether ids should be computed or not.
 - expressions are changed so that a pair of value and ids are returned.
 - when computing the metric value the ids are unused.
 - when computing the ids, constant values and smt_on are assigned to
   the value.
 - If the value is from an event ID then the event is added to the ids
   hashmap and the value set to NAN.
 - Typically operators union IDs for their inputs and set the value to
   NAN, however, if the inputs are constant then these are computed and
   propagated as the value.
 - If the input is constant to certain operators like:
 IDS1 if CONST else IDS2
   then the result will be either IDS1 or IDS2 depending on CONST (which
   may be evaluated from an entire expression), and so IDS1 or IDS2 may
   be discarded avoiding events from being programmed.
 - The ids at the end of parsing are added to the context.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/expr.c |  17 ++
 tools/perf/util/expr.c  |   9 +-
 tools/perf/util/expr.h  |   1 -
 tools/perf/util/expr.l  |   9 -
 tools/perf/util/expr.y  | 373 
 5 files changed, 319 insertions(+), 90 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 1c881bea7fca..5cab5960b257 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "util/debug.h"
 #include "util/expr.h"
+#include "util/smt.h"
 #include "tests.h"
 #include 
 #include 
@@ -132,6 +133,22 @@ int test__expr(struct test *t __maybe_unused, int subtest 
__maybe_unused)
TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT2,param=3/",
(void **)_ptr));
 
+   /* Only EVENT1 or EVENT2 need be measured depending on the value of 
smt_on. */
+   expr__ctx_clear(ctx);
+   TEST_ASSERT_VAL("find ids",
+   expr__find_ids("EVENT1 if #smt_on else EVENT2",
+   NULL, ctx, 0) == 0);
+   TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 1);
+   TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids,
+ smt_on() ? "EVENT1" : 
"EVENT2",
+ (void **)_ptr));
+
+   /* The expression is a constant 1.0 without needing to evaluate EVENT1. 
*/
+   expr__ctx_clear(ctx);
+   TEST_ASSERT_VAL("find ids",
+   expr__find_ids("1.0 if EVENT1 > 100.0 else 1.0",
+   NULL, ctx, 0) == 0);
+   TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 0);
expr__ctx_free(ctx);
 
return 0;
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index 1adb6cd202e0..28aaa50c6c68 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -329,10 +329,9 @@ void expr__ctx_free(struct expr_parse_ctx *ctx)
 
 static int
 __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
- int start, int runtime)
+ bool compute_ids, int runtime)
 {
struct expr_scanner_ctx scanner_ctx = {
-   .start_token = start,
.runtime = runtime,
};
YY_BUFFER_STATE buffer;
@@ -352,7 +351,7 @@ __expr__parse(double *val, struct expr_parse_ctx *ctx, 
const char *expr,
expr_set_debug(1, scanner);
 #endif
 
-   ret = expr_parse(val, ctx, scanner);
+   ret = expr_parse(val, ctx, compute_ids, scanner);
 
expr__flush_buffer(buffer, scanner);
expr__delete_buffer(buffer, scanner);
@@ -363,13 +362,13 @@ __expr__parse(double *val, struct expr_parse_ctx *ctx, 
const char *expr,
 int expr__parse(double *final_val, struct expr_parse_ctx *ctx,
const char *expr, int runtime)
 {
-   return __expr__parse(final_val, ctx, expr, EXPR_PARSE, runtime) ? -1 : 
0;
+   return __expr__parse(final_val, ctx, expr, /*compute_ids=*/false, 
runtime) ? -1 : 0;
 }
 
 int expr__find_ids(const char *expr, const char *one,
   struct expr_parse_ctx *ctx, int runtime)
 {
-   int ret = __expr__parse(NULL, ctx, expr, EXPR_OTHER, runtime);
+   int ret = __expr__parse(NULL, ctx, expr, /*compute_ids=*/true, runtime);
 
if (one)
expr__del_id(ctx, one);
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 62d3ae5ddfba..cefeb2c8d85e 100644
--- a/tools/perf/util/expr.h
+++
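
The rules listed in the commit message (constants carry a value, event ids carry NAN plus a set of ids, and "A if C else B" keeps only the taken branch when C is constant) can be sketched as a small tree walk. Everything named here is hypothetical: struct node, struct ids and collect() are not the perf implementation, which works inside a yacc grammar, and comparison operators are left out, so the condition is just a constant or an event.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <string.h>

enum node_kind { NODE_CONST, NODE_EVENT, NODE_IF };

struct node {
	enum node_kind kind;
	double value;		/* used by NODE_CONST */
	const char *event;	/* used by NODE_EVENT */
	const struct node *cond, *then_branch, *else_branch;	/* NODE_IF */
};

#define MAX_IDS 8
struct ids {
	const char *names[MAX_IDS];
	size_t n;
};

static void ids_add(struct ids *s, const char *name)
{
	for (size_t i = 0; i < s->n; i++)
		if (!strcmp(s->names[i], name))
			return;
	assert(s->n < MAX_IDS);
	s->names[s->n++] = name;
}

static void ids_union(struct ids *dst, const struct ids *src)
{
	for (size_t i = 0; i < src->n; i++)
		ids_add(dst, src->names[i]);
}

/*
 * Returns the constant value of the expression, or NAN when it depends on
 * at least one event. 'out' receives the events that must be measured.
 */
static double collect(const struct node *n, struct ids *out)
{
	struct ids c = { .n = 0 }, t = { .n = 0 }, e = { .n = 0 };
	double cv, tv, ev;

	out->n = 0;
	switch (n->kind) {
	case NODE_CONST:
		return n->value;
	case NODE_EVENT:
		ids_add(out, n->event);
		return NAN;
	case NODE_IF:
		cv = collect(n->cond, &c);
		tv = collect(n->then_branch, &t);
		ev = collect(n->else_branch, &e);
		if (!isnan(cv)) {
			/* Constant condition: only the taken branch matters. */
			*out = cv != 0.0 ? t : e;
			return cv != 0.0 ? tv : ev;
		}
		if (!isnan(tv) && !isnan(ev) && tv == ev)
			return tv;	/* branches agree, condition is irrelevant */
		ids_union(out, &c);
		ids_union(out, &t);
		ids_union(out, &e);
		return NAN;
	}
	return NAN;
}
```

With a constant condition the walk reports a single event, and with identical constant branches it reports none, which mirrors the two new test cases in the patch.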

[PATCH v3 3/5] perf metric: Rename expr__find_other.

2020-11-20 Thread Ian Rogers

A later change will remove the notion of other, rename the function to
expr__find_ids as this is what it populates.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/expr.c   | 26 +-
 tools/perf/tests/pmu-events.c |  9 -
 tools/perf/util/expr.c|  4 ++--
 tools/perf/util/expr.h|  2 +-
 tools/perf/util/metricgroup.c |  2 +-
 tools/perf/util/stat-shadow.c |  6 +++---
 6 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index b0a3b5fd0c00..7ccb97c73347 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -64,25 +64,25 @@ int test__expr(struct test *t __maybe_unused, int subtest 
__maybe_unused)
TEST_ASSERT_VAL("missing operand", ret == -1);
 
expr__ctx_clear(ctx);
-   TEST_ASSERT_VAL("find other",
-   expr__find_other("FOO + BAR + BAZ + BOZO", "FOO",
-ctx, 1) == 0);
-   TEST_ASSERT_VAL("find other", hashmap__size(ctx->ids) == 3);
-   TEST_ASSERT_VAL("find other", hashmap__find(ctx->ids, "BAR",
+   TEST_ASSERT_VAL("find ids",
+   expr__find_ids("FOO + BAR + BAZ + BOZO", "FOO",
+   ctx, 1) == 0);
+   TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 3);
+   TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAR",
(void **)_ptr));
-   TEST_ASSERT_VAL("find other", hashmap__find(ctx->ids, "BAZ",
+   TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BAZ",
(void **)_ptr));
-   TEST_ASSERT_VAL("find other", hashmap__find(ctx->ids, "BOZO",
+   TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "BOZO",
(void **)_ptr));
 
expr__ctx_clear(ctx);
-   TEST_ASSERT_VAL("find other",
-   expr__find_other("EVENT1\\,param\\=?@ + 
EVENT2\\,param\\=?@",
-NULL, ctx, 3) == 0);
-   TEST_ASSERT_VAL("find other", hashmap__size(ctx->ids) == 2);
-   TEST_ASSERT_VAL("find other", hashmap__find(ctx->ids, "EVENT1,param=3/",
+   TEST_ASSERT_VAL("find ids",
+   expr__find_ids("EVENT1\\,param\\=?@ + 
EVENT2\\,param\\=?@",
+   NULL, ctx, 3) == 0);
+   TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 2);
+   TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT1,param=3/",
(void **)_ptr));
-   TEST_ASSERT_VAL("find other", hashmap__find(ctx->ids, "EVENT2,param=3/",
+   TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT2,param=3/",
(void **)_ptr));
 
expr__ctx_free(ctx);
diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
index 294daf568bb6..3ac70fa31379 100644
--- a/tools/perf/tests/pmu-events.c
+++ b/tools/perf/tests/pmu-events.c
@@ -502,9 +502,8 @@ static int test_parsing(void)
if (!pe->metric_expr)
continue;
expr__ctx_clear(ctx);
-   if (expr__find_other(pe->metric_expr, NULL, ctx, 0)
- < 0) {
-   expr_failure("Parse other failed", map, pe);
+   if (expr__find_ids(pe->metric_expr, NULL, ctx, 0) < 0) {
+   expr_failure("Parse find ids failed", map, pe);
ret++;
continue;
}
@@ -559,8 +558,8 @@ static int metric_parse_fake(const char *str)
pr_debug("parsing '%s'\n", str);
 
ctx = expr__ctx_new();
-   if (expr__find_other(str, NULL, ctx, 0) < 0) {
-   pr_err("expr__find_other failed\n");
+   if (expr__find_ids(str, NULL, ctx, 0) < 0) {
+   pr_err("expr__find_ids failed\n");
return -1;
}
 
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index e0623d38e6ee..a248d14882cc 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -287,8 +287,8 @@ int expr__parse(double *final_val, struct expr_parse_ctx 
*ctx,
return __expr__parse(final_val, ctx, expr, EXPR_PARSE, runtime) ? -1 : 
0;
 }
 
-int expr__find_other(const char *expr, const char *one,
-struct expr_parse_ctx *ctx, int runtime)
+int expr__find_ids(const char *expr, const char *one,
+  struct expr_parse_ctx *ctx, int runtime)
 {
int ret = __expr__parse(NULL, ctx, expr, EXPR_OTHER, runtime);
 
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 00b941cfe6a6..955d5adb7ca4 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -43,7 +43,7 @@ int expr__resolve_id(struct

[PATCH v3 2/5] perf metric: Use NAN for missing event IDs.

2020-11-20 Thread Ian Rogers

If during computing a metric an event (id) is missing the parsing
aborts. A later patch will make it so that events that aren't used in
the output are deliberately omitted, in which case we don't want the
abort. Modify the missing ID case to report NAN for these cases.

Signed-off-by: Ian Rogers 
---
 tools/perf/util/expr.y | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
index b2ada8f8309a..41c9cd4efadd 100644
--- a/tools/perf/util/expr.y
+++ b/tools/perf/util/expr.y
@@ -1,6 +1,7 @@
 /* Simple expression parser */
 %{
 #define YYDEBUG 1
+#include 
 #include 
 #include "util.h"
 #include "util/debug.h"
@@ -88,12 +89,10 @@ expr: NUMBER
| ID{
struct expr_id_data *data;
 
-   if (expr__resolve_id(ctx, $1, &data)) {
-   free($1);
-   YYABORT;
-   }
+   $$ = NAN;
+   if (expr__resolve_id(ctx, $1, &data) == 
0)
+   $$ = expr_id_data__value(data);
 
-   $$ = expr_id_data__value(data);
free($1);
}
| expr '|' expr { $$ = (long)$1 | (long)$3; }
-- 
2.29.2.454.gaff20da3a2-goog
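
The behavioural change (a missing id yields NAN instead of aborting the parse) can be shown with a plain lookup table. This is a minimal sketch assuming a hypothetical id_value table and lookup_or_nan() helper; it is not the expr__resolve_id() interface.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* Hypothetical table entry pairing an event id with its measured value. */
struct id_value {
	const char *id;
	double value;
};

/* Returns the value recorded for 'id', or NAN when it was never measured. */
static double lookup_or_nan(const struct id_value *tab, size_t n, const char *id)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(tab[i].id, id))
			return tab[i].value;
	return NAN;
}
```

Because NAN propagates through arithmetic, an expression that really does use a missing id still evaluates to NAN, while an expression that never touches it is unaffected.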

[PATCH v3 1/5] perf metric: Restructure struct expr_parse_ctx.

2020-11-20 Thread Ian Rogers

A later change to parsing the ids out (in expr__find_other) will
potentially drop hashmaps and so it is more convenient to move
expr_parse_ctx to have a hashmap pointer rather than a struct value. As
this pointer must be freed, rather than just going out of scope,
add expr__ctx_new and expr__ctx_free to manage expr_parse_ctx memory.
Adjust use of struct expr_parse_ctx accordingly.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/expr.c   | 81 ++-
 tools/perf/tests/pmu-events.c | 37 +---
 tools/perf/util/expr.c| 38 
 tools/perf/util/expr.h|  5 ++-
 tools/perf/util/metricgroup.c | 44 ++-
 tools/perf/util/stat-shadow.c | 50 +
 6 files changed, 151 insertions(+), 104 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 4d01051951cd..b0a3b5fd0c00 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -22,67 +22,70 @@ int test__expr(struct test *t __maybe_unused, int subtest 
__maybe_unused)
const char *p;
double val;
int ret;
-   struct expr_parse_ctx ctx;
+   struct expr_parse_ctx *ctx;
 
-   expr__ctx_init(&ctx);
-   expr__add_id_val(&ctx, strdup("FOO"), 1);
-   expr__add_id_val(&ctx, strdup("BAR"), 2);
+   ctx = expr__ctx_new();
+   TEST_ASSERT_VAL("expr__ctx_new", ctx);
+   expr__add_id_val(ctx, strdup("FOO"), 1);
+   expr__add_id_val(ctx, strdup("BAR"), 2);
 
-   ret = test(&ctx, "1+1", 2);
-   ret |= test(&ctx, "FOO+BAR", 3);
-   ret |= test(&ctx, "(BAR/2)%2", 1);
-   ret |= test(&ctx, "1 - -4",  5);
-   ret |= test(&ctx, "(FOO-1)*2 + (BAR/2)%2 - -4",  5);
-   ret |= test(&ctx, "1-1 | 1", 1);
-   ret |= test(&ctx, "1-1 & 1", 0);
-   ret |= test(&ctx, "min(1,2) + 1", 2);
-   ret |= test(&ctx, "max(1,2) + 1", 3);
-   ret |= test(&ctx, "1+1 if 3*4 else 0", 2);
-   ret |= test(&ctx, "1.1 + 2.1", 3.2);
-   ret |= test(&ctx, ".1 + 2.", 2.1);
-   ret |= test(&ctx, "d_ratio(1, 2)", 0.5);
-   ret |= test(&ctx, "d_ratio(2.5, 0)", 0);
-   ret |= test(&ctx, "1.1 < 2.2", 1);
-   ret |= test(&ctx, "2.2 > 1.1", 1);
-   ret |= test(&ctx, "1.1 < 1.1", 0);
-   ret |= test(&ctx, "2.2 > 2.2", 0);
-   ret |= test(&ctx, "2.2 < 1.1", 0);
-   ret |= test(&ctx, "1.1 > 2.2", 0);
+   ret = test(ctx, "1+1", 2);
+   ret |= test(ctx, "FOO+BAR", 3);
+   ret |= test(ctx, "(BAR/2)%2", 1);
+   ret |= test(ctx, "1 - -4",  5);
+   ret |= test(ctx, "(FOO-1)*2 + (BAR/2)%2 - -4",  5);
+   ret |= test(ctx, "1-1 | 1", 1);
+   ret |= test(ctx, "1-1 & 1", 0);
+   ret |= test(ctx, "min(1,2) + 1", 2);
+   ret |= test(ctx, "max(1,2) + 1", 3);
+   ret |= test(ctx, "1+1 if 3*4 else 0", 2);
+   ret |= test(ctx, "1.1 + 2.1", 3.2);
+   ret |= test(ctx, ".1 + 2.", 2.1);
+   ret |= test(ctx, "d_ratio(1, 2)", 0.5);
+   ret |= test(ctx, "d_ratio(2.5, 0)", 0);
+   ret |= test(ctx, "1.1 < 2.2", 1);
+   ret |= test(ctx, "2.2 > 1.1", 1);
+   ret |= test(ctx, "1.1 < 1.1", 0);
+   ret |= test(ctx, "2.2 > 2.2", 0);
+   ret |= test(ctx, "2.2 < 1.1", 0);
+   ret |= test(ctx, "1.1 > 2.2", 0);
 
-   if (ret)
+   if (ret) {
+   expr__ctx_free(ctx);
return ret;
+   }
 
p = "FOO/0";
-   ret = expr__parse(&val, &ctx, p, 1);
+   ret = expr__parse(&val, ctx, p, 1);
TEST_ASSERT_VAL("division by zero", ret == -1);
 
p = "BAR/";
-   ret = expr__parse(&val, &ctx, p, 1);
+   ret = expr__parse(&val, ctx, p, 1);
TEST_ASSERT_VAL("missing operand", ret == -1);
 
-   expr__ctx_clear(&ctx);
+   expr__ctx_clear(ctx);
TEST_ASSERT_VAL("find other",
expr__find_other("FOO + BAR + BAZ + BOZO", "FOO",
-, 1) == 0);
-   TEST_ASSERT_VAL("find other", hashmap__size() == 3);
-   TEST_ASSERT_VAL("find other", hashmap__find(, "BAR",
+ctx, 1) == 0);
+   TEST_ASSERT_VAL("find other", hashmap__size(ctx->ids) == 3);
+   TEST_ASSERT_VAL("find other", hashmap__find(ctx->ids, "BAR",
(void **)_ptr));
-   TEST_ASSERT_VAL("find other", hashmap__find(, "BAZ",
+   TEST_ASSERT_VAL("find other", hashmap__find(ctx->ids, "BAZ",
(void **)_ptr));
-   TEST_ASSERT_VAL("find other", hashmap__find(, "BOZO",
+   TEST_ASSERT_VAL("find other", hashmap__find(ctx->ids, "BOZO",
(void **)_ptr));
 
-   expr__ctx_clear(&ctx);
+   expr__ctx_clear(ctx);
TEST_ASSERT_VAL("find other",
expr__find_other("EVENT1\\,param\\=?@ + 
EVENT2\\,param\\=?@",
-NULL, , 3) == 0);
-   TEST_ASSERT_VAL("find other", hashmap__size() == 2);
-   TEST_ASSERT_VAL("find other", hashmap__find(, "EVENT1,param=3/",
+

[PATCH v3 0/5] Don't compute events that won't be used in a metric.

2020-11-20 Thread Ian Rogers


For a metric like:
  EVENT1 if #smt_on else EVENT2

currently EVENT1 and EVENT2 will be measured and then when the metric
is reported EVENT1 or EVENT2 will be printed depending on the value
from smt_on() during the expr parsing. Computing both events is
unnecessary and can lead to multiplexing as discussed in this thread:
https://lore.kernel.org/lkml/20201110100346.2527031-1-irog...@google.com/

This change modifies expression parsing so that constants are
considered when building the set of ids (events) and only events not
contributing to a constant value are measured.

v3. fixes an assignment in patch 2/5. In patch 5/5 additional comments
are added and useless frees are replaced by asserts. A new peephole
optimization is added for the case CONST IF expr ELSE CONST, where the
constants are identical, as we don't need to evaluate the IF
condition.

v2. is a rebase.

Ian Rogers (5):
  perf metric: Restructure struct expr_parse_ctx.
  perf metric: Use NAN for missing event IDs.
  perf metric: Rename expr__find_other.
  perf metric: Add utilities to work on ids map.
  perf metric: Don't compute unused events.

 tools/perf/tests/expr.c   | 159 +-
 tools/perf/tests/pmu-events.c |  42 ++--
 tools/perf/util/expr.c| 136 ++--
 tools/perf/util/expr.h|  17 +-
 tools/perf/util/expr.l|   9 -
 tools/perf/util/expr.y| 376 +++---
 tools/perf/util/metricgroup.c |  44 ++--
 tools/perf/util/stat-shadow.c |  54 +++--
 8 files changed, 623 insertions(+), 214 deletions(-)

-- 
2.29.2.454.gaff20da3a2-goog

[PATCH net-next v2 1/2] lockdep: Introduce in_softirq lockdep assert

2020-11-20 Thread Yunsheng Lin

The current semantic for napi_consume_skb() is that the caller needs
to provide a non-zero budget when calling from NAPI context, and
breaking this semantic will cause hard-to-debug problems, because
_kfree_skb_defer() needs to run in atomic context in order to push
the skb to the particular cpu's napi_alloc_cache atomically.

So add lockdep_assert_in_softirq() to assert when the running
context is not in_softirq; in_softirq means a softirq is being served
or BH is disabled. Because the softirq context can be interrupted by
hard IRQ or NMI context, lockdep_assert_in_softirq() needs to
assert about hard IRQ or NMI context too.

Suggested-by: Jakub Kicinski 
Signed-off-by: Yunsheng Lin 
---
 include/linux/lockdep.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index f559487..f5e3d81 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -594,6 +594,12 @@ do {   
\
  this_cpu_read(hardirqs_enabled)));\
 } while (0)
 
+#define lockdep_assert_in_softirq()\
+do {   \
+   WARN_ON_ONCE(__lockdep_enabled  &&  \
+(!in_softirq() || in_irq() || in_nmi()));  \
+} while (0)
+
 #else
 # define might_lock(lock) do { } while (0)
 # define might_lock_read(lock) do { } while (0)
@@ -605,6 +611,7 @@ do {
\
 
 # define lockdep_assert_preemption_enabled() do { } while (0)
 # define lockdep_assert_preemption_disabled() do { } while (0)
+# define lockdep_assert_in_softirq() do { } while (0)
 #endif
 
 #ifdef CONFIG_PROVE_RAW_LOCK_NESTING
-- 
2.8.1
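
The condition inside the new macro can be read as a truth table: warn when lockdep is enabled and the code is either outside softirq context or inside hard IRQ or NMI context. The sketch below models only that predicate in user space; struct ctx_state and softirq_assert_would_warn() are hypothetical, and the booleans stand in for the kernel's in_softirq(), in_irq() and in_nmi() checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the execution context. */
struct ctx_state {
	bool in_softirq;	/* serving a softirq, or BH disabled */
	bool in_irq;		/* hard interrupt */
	bool in_nmi;		/* non-maskable interrupt */
};

/* True when the condition in the proposed macro would trigger the warning. */
static bool softirq_assert_would_warn(bool lockdep_enabled, struct ctx_state c)
{
	return lockdep_enabled && (!c.in_softirq || c.in_irq || c.in_nmi);
}
```

A hard IRQ that interrupts a softirq still sees in_softirq as true, which is why the extra in_irq and in_nmi terms are needed to catch that case.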

[PATCH net-next v2 0/2] Add an assert in napi_consume_skb()

2020-11-20 Thread Yunsheng Lin

This patch introduces a lockdep_assert_in_softirq() interface and
uses it to assert the case when napi_consume_skb() is not called in
the softirq context.

Changelog:
V2: Use lockdep instead of one-off Kconfig knob

Yunsheng Lin (2):
  lockdep: Introduce in_softirq lockdep assert
  net: Use lockdep_assert_in_softirq() in napi_consume_skb()

 include/linux/lockdep.h | 7 +++
 net/core/skbuff.c   | 2 ++
 2 files changed, 9 insertions(+)

-- 
2.8.1

[PATCH net-next v2 2/2] net: Use lockdep_assert_in_softirq() in napi_consume_skb()

2020-11-20 Thread Yunsheng Lin

Use lockdep_assert_in_softirq() in napi_consume_skb() to assert the
case when it is not called in an atomic softirq context.

Signed-off-by: Yunsheng Lin 
---
 net/core/skbuff.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ffe3dcc..effa19d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -902,6 +902,8 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
return;
}
 
+   lockdep_assert_in_softirq();
+
if (!skb_unref(skb))
return;
 
-- 
2.8.1

Re: [PATCH v5 1/1] lib/vsprintf: Add support for printing V4L2 and DRM fourccs

2020-11-20 Thread Sergey Senozhatsky

On (20/11/20 16:57), Petr Mladek wrote:
> On Fri 2020-11-13 12:54:41, Sakari Ailus wrote:
> > Add a printk modifier %p4cc (for pixel format) for printing V4L2 and DRM
> > pixel formats denoted by fourccs. The fourcc encoding is the same for both
> > so the same implementation can be used.
> > 
> > Suggested-by: Mauro Carvalho Chehab 
> > Signed-off-by: Sakari Ailus 
> 
> The last version looks fine to me.
> 
> Reviewed-by: Petr Mladek 

Reviewed-by: Sergey Senozhatsky 

-ss

[PATCH v2 2/3] remoteproc: Introduce deny_sysfs_ops flag

2020-11-20 Thread Suman Anna

The remoteproc framework provides sysfs interfaces for changing
the firmware name and for starting/stopping a remote processor
through the sysfs files 'state' and 'firmware'. The 'recovery'
sysfs file can also be used similarly to control the error recovery
state machine of a remoteproc. These interfaces are currently
allowed irrespective of how the remoteprocs were booted (like
remoteproc self auto-boot, remoteproc client-driven boot etc).
These interfaces can adversely affect a remoteproc and its clients
especially when a remoteproc is being controlled by a remoteproc
client driver(s). Also, not all remoteproc drivers may want to
support the sysfs interfaces by default.

Add support to deny the sysfs state/firmware/recovery change by
introducing a state flag 'deny_sysfs_ops' that the individual
remoteproc drivers can set based on their usage needs. The default
behavior is to allow the sysfs operations as before.

Signed-off-by: Suman Anna 
---
v2: revised to account for the 'recovery' sysfs file as well, patch
description updated accordingly
v1: 
https://patchwork.kernel.org/project/linux-remoteproc/patch/20180915003725.17549-5-s-a...@ti.com/

 drivers/remoteproc/remoteproc_sysfs.c | 12 
 include/linux/remoteproc.h|  2 ++
 2 files changed, 14 insertions(+)

diff --git a/drivers/remoteproc/remoteproc_sysfs.c b/drivers/remoteproc/remoteproc_sysfs.c
index bd2950a246c9..3fd18a71c188 100644
--- a/drivers/remoteproc/remoteproc_sysfs.c
+++ b/drivers/remoteproc/remoteproc_sysfs.c
@@ -49,6 +49,10 @@ static ssize_t recovery_store(struct device *dev,
 {
struct rproc *rproc = to_rproc(dev);
 
+   /* restrict sysfs operations if not allowed by remoteproc drivers */
+   if (rproc->deny_sysfs_ops)
+   return -EPERM;
+
if (sysfs_streq(buf, "enabled")) {
/* change the flag and begin the recovery process if needed */
rproc->recovery_disabled = false;
@@ -158,6 +162,10 @@ static ssize_t firmware_store(struct device *dev,
char *p;
int err, len = count;
 
+   /* restrict sysfs operations if not allowed by remoteproc drivers */
+   if (rproc->deny_sysfs_ops)
+   return -EPERM;
+
err = mutex_lock_interruptible(&rproc->lock);
if (err) {
dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, err);
@@ -225,6 +233,10 @@ static ssize_t state_store(struct device *dev,
struct rproc *rproc = to_rproc(dev);
int ret = 0;
 
+   /* restrict sysfs operations if not allowed by remoteproc drivers */
+   if (rproc->deny_sysfs_ops)
+   return -EPERM;
+
if (sysfs_streq(buf, "start")) {
if (rproc->state == RPROC_RUNNING)
return -EBUSY;
diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index 3fa3ba6498e8..dbc3767f7d0e 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -508,6 +508,7 @@ struct rproc_dump_segment {
  * @has_iommu: flag to indicate if remote processor is behind an MMU
  * @auto_boot: flag to indicate if remote processor should be auto-started
  * @autonomous: true if an external entity has booted the remote processor
+ * @deny_sysfs_ops: flag to not permit sysfs operations on state, firmware and recovery
  * @dump_segments: list of segments in the firmware
  * @nb_vdev: number of vdev currently handled by rproc
  * @char_dev: character device of the rproc
@@ -545,6 +546,7 @@ struct rproc {
bool has_iommu;
bool auto_boot;
bool autonomous;
+   bool deny_sysfs_ops;
struct list_head dump_segments;
int nb_vdev;
u8 elf_class;
-- 
2.28.0
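
As a rough illustration of the gating described above, the following
userspace sketch models a 'state' store handler that bails out early when
the flag is set. The names deny_model and deny_state_store() are
hypothetical, and plain strcmp() is used where the kernel uses
sysfs_streq(), so a trailing newline is not tolerated here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two struct rproc fields that matter here. */
struct deny_model {
	bool deny_sysfs_ops;	/* set by the driver, as in this patch */
	bool running;		/* simplified view of rproc->state */
};

/* Models the control flow of state_store(); returns 0 or a negative errno. */
static int deny_state_store(struct deny_model *r, const char *buf)
{
	assert(r != NULL && buf != NULL);

	/* The new check: refuse any change when the driver denies sysfs ops. */
	if (r->deny_sysfs_ops)
		return -EPERM;

	if (strcmp(buf, "start") == 0) {
		if (r->running)
			return -EBUSY;
		r->running = true;
		return 0;
	}
	if (strcmp(buf, "stop") == 0) {
		if (!r->running)
			return -EINVAL;
		r->running = false;
		return 0;
	}
	return -EINVAL;
}
```

With the flag set, every write fails with -EPERM and the modelled state is
left untouched. With the flag clear, behaviour is as before.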

[PATCH v2 0/3] remoteproc sysfs fixes/improvements

2020-11-20 Thread Suman Anna

Hi All,

This is a refresh of the unaccepted patches from an old series [1].
Patches 2 and 3 from that series were merged and these are rebased and
revised versions of the same patches. I had forgotten about these patches,
and am resurrecting these again. Patches are on top of latest 5.10-rc4.

The features being introduced here will be needed by the recently posted PRU
remoteproc driver [2] in addition to the existing Wkup M3 remoteproc driver.
Both of these drivers follow a client-driven boot methodology, with the latter
strictly booted by another driver in kernel. The PRU remoteproc driver will be
supporting both in-kernel clients as well as control from userspace orthogonally.
The logic though is applicable and useful to any remoteproc driver not using
'auto-boot' and using an external driver/application to boot the remoteproc.

regards
Suman

[1] 
https://patchwork.kernel.org/project/linux-remoteproc/cover/20180915003725.17549-1-s-a...@ti.com/
[2] 
https://patchwork.kernel.org/project/linux-remoteproc/cover/20201119140850.12268-1-grzegorz.jaszc...@linaro.org/

Suman Anna (3):
  remoteproc: Fix unbalanced boot with sysfs for no auto-boot rprocs
  remoteproc: Introduce deny_sysfs_ops flag
  remoteproc: wkup_m3: Set deny_sysfs_ops flag

 drivers/remoteproc/remoteproc_sysfs.c | 28 ++-
 drivers/remoteproc/wkup_m3_rproc.c|  1 +
 include/linux/remoteproc.h|  2 ++
 3 files changed, 30 insertions(+), 1 deletion(-)

-- 
2.28.0

[PATCH v2 3/3] remoteproc: wkup_m3: Set deny_sysfs_ops flag

2020-11-20 Thread Suman Anna

The Wakeup M3 remote processor is controlled by the wkup_m3_ipc
client driver, so set the newly introduced 'deny_sysfs_ops' flag
to not allow any overriding of the remoteproc firmware or state
from userspace.

Signed-off-by: Suman Anna 
---
v2: rebased version, no code changes, patch title adjusted slightly
v1: 
https://patchwork.kernel.org/project/linux-remoteproc/patch/20180915003725.17549-6-s-a...@ti.com/

 drivers/remoteproc/wkup_m3_rproc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/remoteproc/wkup_m3_rproc.c b/drivers/remoteproc/wkup_m3_rproc.c
index b9349d684258..d92d7f32ba8d 100644
--- a/drivers/remoteproc/wkup_m3_rproc.c
+++ b/drivers/remoteproc/wkup_m3_rproc.c
@@ -160,6 +160,7 @@ static int wkup_m3_rproc_probe(struct platform_device *pdev)
}
 
rproc->auto_boot = false;
+   rproc->deny_sysfs_ops = true;
 
wkupm3 = rproc->priv;
wkupm3->rproc = rproc;
-- 
2.28.0

[PATCH v2 1/3] remoteproc: Fix unbalanced boot with sysfs for no auto-boot rprocs

2020-11-20 Thread Suman Anna

The remoteproc core performs automatic boot and shutdown of a remote
processor during rproc_add() and rproc_del() for remote processors
supporting 'auto-boot'. The remoteproc devices not using 'auto-boot'
require either a remoteproc client driver or a userspace client to
use the sysfs 'state' variable to perform the boot and shutdown. The
in-kernel client drivers hold the corresponding remoteproc driver
module's reference count when they acquire a rproc handle through
the rproc_get_by_phandle() API, but there is no such support for
userspace applications performing the boot through sysfs interface.

The shutdown of a remoteproc upon removing a remoteproc platform
driver is automatic only with 'auto-boot' and this can cause a
remoteproc with no auto-boot to stay powered on and never freed
up if booted using the sysfs interface without a matching stop,
and when the remoteproc driver module is removed or unbound from
the device. This will result in a memory leak as well as the
corresponding remoteproc ida being never deallocated. Fix this
by holding a module reference count for the remoteproc's driver
during a sysfs 'start' and releasing it during the sysfs 'stop'
operation.

Signed-off-by: Suman Anna 
Acked-by: Arnaud Pouliquen 
---
v2: rebased version, no changes
v1: 
https://patchwork.kernel.org/project/linux-remoteproc/patch/20180915003725.17549-2-s-a...@ti.com/

 drivers/remoteproc/remoteproc_sysfs.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/remoteproc/remoteproc_sysfs.c b/drivers/remoteproc/remoteproc_sysfs.c
index d1cf7bf277c4..bd2950a246c9 100644
--- a/drivers/remoteproc/remoteproc_sysfs.c
+++ b/drivers/remoteproc/remoteproc_sysfs.c
@@ -3,6 +3,7 @@
  * Remote Processor Framework
  */
 
+#include <linux/module.h>
 #include <linux/remoteproc.h>
 #include <linux/slab.h>
 
@@ -228,14 +229,27 @@ static ssize_t state_store(struct device *dev,
if (rproc->state == RPROC_RUNNING)
return -EBUSY;
 
+   /*
+* prevent underlying implementation from being removed
+* when remoteproc does not support auto-boot
+*/
+   if (!rproc->auto_boot &&
+   !try_module_get(dev->parent->driver->owner))
+   return -EINVAL;
+
ret = rproc_boot(rproc);
-   if (ret)
+   if (ret) {
dev_err(&rproc->dev, "Boot failed: %d\n", ret);
+   if (!rproc->auto_boot)
+   module_put(dev->parent->driver->owner);
+   }
} else if (sysfs_streq(buf, "stop")) {
if (rproc->state != RPROC_RUNNING)
return -EINVAL;
 
rproc_shutdown(rproc);
+   if (!rproc->auto_boot)
+   module_put(dev->parent->driver->owner);
} else {
dev_err(&rproc->dev, "Unrecognised option: %s\n", buf);
ret = -EINVAL;
-- 
2.28.0
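
The balancing rule in this patch (take a module reference on a sysfs
'start', drop it on 'stop' or on a failed boot, and only for rprocs without
auto-boot) can be sketched in userspace as below. This is a model under
stated assumptions: refcnt_model, refcnt_state_store() and the boot_fails
knob are invented for illustration, a plain counter stands in for
try_module_get()/module_put(), and -EIO is an arbitrary stand-in for
whatever rproc_boot() would return on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the state touched by the patch. */
struct refcnt_model {
	bool auto_boot;		/* mirrors rproc->auto_boot */
	bool running;		/* simplified view of rproc->state */
	bool boot_fails;	/* test knob: make the modelled boot fail */
	int module_refs;	/* stands in for the driver module refcount */
};

/* Models the start/stop branches of state_store() after this patch. */
static int refcnt_state_store(struct refcnt_model *r, const char *buf)
{
	assert(r != NULL && buf != NULL);

	if (strcmp(buf, "start") == 0) {
		if (r->running)
			return -EBUSY;
		if (!r->auto_boot)
			r->module_refs++;	/* models try_module_get() */
		if (r->boot_fails) {
			if (!r->auto_boot)
				r->module_refs--; /* models module_put() */
			return -EIO;		/* assumed boot error */
		}
		r->running = true;
		return 0;
	}
	if (strcmp(buf, "stop") == 0) {
		if (!r->running)
			return -EINVAL;
		r->running = false;
		if (!r->auto_boot)
			r->module_refs--;	/* models module_put() */
		return 0;
	}
	return -EINVAL;
}
```

The point of the model is that the counter returns to zero on every path
that leaves the processor stopped, and that rejected requests (-EBUSY,
-EINVAL) never touch it.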

Re: [GIT PULL] io_uring fixes for 5.10-rc

2020-11-20 Thread Jens Axboe

On 11/20/20 7:41 PM, Jens Axboe wrote:
> On 11/20/20 5:23 PM, Linus Torvalds wrote:
>> On Fri, Nov 20, 2020 at 1:36 PM Jens Axboe  wrote:
>>>
>>> I don't disagree with you on that. I've been a bit gun shy on touching
>>> the VFS side of things, but this one isn't too bad. I hacked up a patch
>>> that allows io_uring to do LOOKUP_RCU and a quick test seems to indicate
>>> it's fine. On top of that, we just propagate the error if we do fail and
>>> get rid of that odd retry loop.
>>
>> Ok, this looks better to me (but is obviously not 5.10 material).
>>
>> That said, I think I'd prefer to keep 'struct nameidata' internal to
>> just fs/namei.c, and maybe we can just export that
>>
>> struct nameidata nd;
>>
>> set_nameidata(&nd, req->open.dfd, req->open.filename);
>> file = path_openat(&nd, &op, op.lookup_flags | LOOKUP_RCU);
>> restore_nameidata();
>> return filp == ERR_PTR(-ECHILD) ? -EAGAIN : filp;
>>
>> as a helper from namei.c instead? Call it "do_filp_open_rcu()" or something?
> 
> Yes, that's probably a better idea. I'll move in that direction.

Actually, I think we can do even better. How about just having
do_filp_open() exit after LOOKUP_RCU fails, if LOOKUP_RCU was already
set in the lookup flags? Then we don't need to change much else, and
most of it falls out naturally.

It seems that should work, except that LOOKUP_RCU does not guarantee
that we're not going to do IO:

[   20.463195]  schedule+0x5f/0xd0
[   20.463444]  io_schedule+0x45/0x70
[   20.463712]  bit_wait_io+0x11/0x50
[   20.463981]  __wait_on_bit+0x2c/0x90
[   20.464264]  out_of_line_wait_on_bit+0x86/0x90
[   20.464611]  ? var_wake_function+0x30/0x30
[   20.464932]  __ext4_find_entry+0x2b5/0x410
[   20.465254]  ? d_alloc_parallel+0x241/0x4e0
[   20.465581]  ext4_lookup+0x51/0x1b0
[   20.465855]  ? __d_lookup+0x77/0x120
[   20.466136]  path_openat+0x4e8/0xe40
[   20.466417]  do_filp_open+0x79/0x100
[   20.466720]  ? __kernel_text_address+0x30/0x70
[   20.467068]  ? __alloc_fd+0xb3/0x150
[   20.467349]  io_openat2+0x65/0x210
[   20.467618]  io_issue_sqe+0x3e/0xf70

I'm actually pretty sure I discovered this before and attempted to add
a LOOKUP_NONBLOCK, which was kind of half assed and which Al
(rightfully) hated because of that.

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 43ba815e4107..9a0a21ac5227 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4069,36 +4069,25 @@ static int io_openat2(struct io_kiocb *req, bool force_nonblock)
struct file *file;
int ret;
 
-   if (force_nonblock && !req->open.ignore_nonblock)
-   return -EAGAIN;
-
ret = build_open_flags(&req->open.how, &op);
if (ret)
goto err;
+   if (force_nonblock)
+   op.lookup_flags |= LOOKUP_RCU;
 
ret = __get_unused_fd_flags(req->open.how.flags, req->open.nofile);
if (ret < 0)
goto err;
 
file = do_filp_open(req->open.dfd, req->open.filename, &op);
+   if (force_nonblock && file == ERR_PTR(-ECHILD)) {
+   put_unused_fd(ret);
+   return -EAGAIN;
+   }
+
if (IS_ERR(file)) {
put_unused_fd(ret);
ret = PTR_ERR(file);
-   /*
-* A work-around to ensure that /proc/self works that way
-* that it should - if we get -EOPNOTSUPP back, then assume
-* that proc_self_get_link() failed us because we're in async
-* context. We should be safe to retry this from the task
-* itself with force_nonblock == false set, as it should not
-* block on lookup. Would be nice to know this upfront and
-* avoid the async dance, but doesn't seem feasible.
-*/
-   if (ret == -EOPNOTSUPP && io_wq_current_is_worker()) {
-   req->open.ignore_nonblock = true;
-   refcount_inc(&req->refs);
-   io_req_task_queue(req);
-   return 0;
-   }
} else {
fsnotify_open(file);
fd_install(ret, file);
diff --git a/fs/namei.c b/fs/namei.c
index 03d0e11e4f36..eb2c917986a5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3394,8 +3394,11 @@ struct file *do_filp_open(int dfd, struct filename *pathname,
 
set_nameidata(&nd, dfd, pathname);
filp = path_openat(&nd, op, flags | LOOKUP_RCU);
-   if (unlikely(filp == ERR_PTR(-ECHILD)))
+   if (unlikely(filp == ERR_PTR(-ECHILD))) {
+   if (flags & LOOKUP_RCU)
+   return filp;
filp = path_openat(&nd, op, flags);
+   }
if (unlikely(filp == ERR_PTR(-ESTALE)))
filp = path_openat(&nd, op, flags | LOOKUP_REVAL);
restore_nameidata();

-- 
Jens Axboe
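
To make the control flow of the proposal easier to follow, here is a
userspace sketch of the retry logic only. Everything in it is hypothetical
(MODEL_LOOKUP_RCU, walk_fn, model_do_filp_open(), model_openat() and the
sample walker). It does not model the -ESTALE/LOOKUP_REVAL retry, and it
says nothing about whether the RCU walk can block, which is the actual
problem shown by the trace above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_LOOKUP_RCU 0x1u	/* made-up flag value for this sketch */

/* Hypothetical path walk: returns a pretend fd (>= 0) or a negative errno. */
typedef int (*walk_fn)(unsigned int flags, void *arg);

/* Models do_filp_open(): try RCU first, fall back unless the caller asked for RCU. */
static int model_do_filp_open(walk_fn walk, void *arg, unsigned int flags)
{
	int ret;

	assert(walk != NULL);
	ret = walk(flags | MODEL_LOOKUP_RCU, arg);
	if (ret == -ECHILD) {
		if (flags & MODEL_LOOKUP_RCU)
			return ret;	/* caller wants RCU only: no fallback */
		ret = walk(flags, arg);	/* ref-walk retry */
	}
	return ret;
}

/* Models the io_openat2() side: map -ECHILD to -EAGAIN when nonblocking. */
static int model_openat(walk_fn walk, void *arg, bool force_nonblock)
{
	unsigned int flags = force_nonblock ? MODEL_LOOKUP_RCU : 0;
	int ret = model_do_filp_open(walk, arg, flags);

	if (force_nonblock && ret == -ECHILD)
		return -EAGAIN;
	return ret;
}

/* Sample walker: fails every RCU walk with -ECHILD and counts its calls. */
static int walk_needs_ref(unsigned int flags, void *arg)
{
	int *calls = arg;

	(*calls)++;
	return (flags & MODEL_LOOKUP_RCU) ? -ECHILD : 3;
}
```

With the sample walker, the nonblocking path makes one attempt and reports
-EAGAIN so the request can be punted, while the blocking path makes two
attempts and succeeds on the ref-walk.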

Re: [PATCH net-next 0/3] net: ipa: platform-specific clock and interconnect rates

2020-11-20 Thread patchwork-bot+netdevbpf

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Thu, 19 Nov 2020 16:40:38 -0600 you wrote:
> This series changes the way the IPA core clock rate and the
> bandwidth parameters for interconnects are specified.  Previously
> these were specified with hard-wired constants, with the same values
> used for the SDM845 and SC7180 platforms.  Now these parameters are
> recorded in platform-specific configuration data.
> 
> For the SC7180 this means we use an all-new core clock rate and
> interconnect parameters.
> 
> [...]

Here is the summary with links:
  - [net-next,1/3] net: ipa: define clock and interconnect data
https://git.kernel.org/netdev/net-next/c/dfccb8b13c0c
  - [net-next,2/3] net: ipa: populate clock and interconnect data
https://git.kernel.org/netdev/net-next/c/f08c99226458
  - [net-next,3/3] net: ipa: use config data for clocking
https://git.kernel.org/netdev/net-next/c/91d02f955150

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Re: [PATCH net-next 0/6] net: ipa: add a driver shutdown callback

2020-11-20 Thread patchwork-bot+netdevbpf

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Thu, 19 Nov 2020 16:49:23 -0600 you wrote:
> The final patch in this series adds a driver shutdown callback for
> the IPA driver.  The patches leading up to that address some issues
> encountered while ensuring that callback worked as expected:
>   - The first just reports a little more information when channels
> or event rings are in unexpected states
>   - The second patch recognizes a condition where an as-yet-unused
> channel does not require a reset during teardown
>   - The third patch explicitly ignores a certain error condition,
> because it can't be avoided, and is harmless if it occurs
>   - The fourth properly handles requests to retry a channel HALT
> request
>   - The fifth makes a second attempt to stop modem activity during
> shutdown if it's busy
> 
> [...]

Here is the summary with links:
  - [net-next,1/6] net: ipa: print channel/event ring number on error
https://git.kernel.org/netdev/net-next/c/f8d3bdd561a7
  - [net-next,2/6] net: ipa: don't reset an ALLOCATED channel
https://git.kernel.org/netdev/net-next/c/5d28913d4ee6
  - [net-next,3/6] net: ipa: ignore CHANNEL_NOT_RUNNING errors
https://git.kernel.org/netdev/net-next/c/f849afcc8c3b
  - [net-next,4/6] net: ipa: support retries on generic GSI commands
https://git.kernel.org/netdev/net-next/c/1136145660f3
  - [net-next,5/6] net: ipa: retry modem stop if busy
https://git.kernel.org/netdev/net-next/c/7c80e83829db
  - [net-next,6/6] net: ipa: add driver shutdown callback
https://git.kernel.org/netdev/net-next/c/ae1d72f9779f

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Re: [PATCH bpf-next v7 32/34] bpf: eliminate rlimit-based memory accounting infra for bpf maps

2020-11-20 Thread Roman Gushchin

On Fri, Nov 20, 2020 at 06:52:27PM -0800, Alexei Starovoitov wrote:
> On Thu, Nov 19, 2020 at 09:37:52AM -0800, Roman Gushchin wrote:
> >  static void bpf_map_put_uref(struct bpf_map *map)
> > @@ -619,7 +562,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
> >"value_size:\t%u\n"
> >"max_entries:\t%u\n"
> >"map_flags:\t%#x\n"
> > -  "memlock:\t%llu\n"
> > +  "memlock:\t%llu\n" /* deprecated */
> >"map_id:\t%u\n"
> >"frozen:\t%u\n",
> >map->map_type,
> > @@ -627,7 +570,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
> >map->value_size,
> >map->max_entries,
> >map->map_flags,
> > -  map->memory.pages * 1ULL << PAGE_SHIFT,
> > +  0LLU,
> 
> The set looks great to me overall, but above change is problematic.
> There are tools out there that read this value.
> Returning zero might cause oncall alarms to trigger.
> I think we can be more accurate here.
> Instead of zero the kernel can return
> round_up(max_entries * round_up(key_size + value_size, 8), PAGE_SIZE)
> It's not the same as before, but at least the numbers won't suddenly
> go to zero and comparison between maps is still relevant.
> Of course we can introduce a page size calculating callback per map type,
> but imo that would be overkill. These monitoring tools don't care about
> precise number, but rather about relative value and growth from one
> version of the application to another.
> 
> If Daniel doesn't find other issues this can be fixed in the follow up.

Makes total sense. I'll prepare a follow-up patch.

Thanks!

Re: [PATCH v2 00/10] Broadcom b53 YAML bindings

2020-11-20 Thread Florian Fainelli




On 11/11/2020 8:50 PM, Florian Fainelli wrote:
> Hi,
> 
> This patch series fixes the various Broadcom SoCs DTS files and the
> existing YAML binding for missing properties before adding a proper b53
> switch YAML binding from Kurt.
> 
> If this all looks good, given that there are quite a few changes to the
> DTS files, it might be best if I take them through the upcoming Broadcom
> ARM SoC pull requests. Let me know if you would like those patches to be
> applied differently.
> 
> Thanks!

Series applied to devicetree/next, thanks everyone.
-- 
Florian

Re: [PATCH bpf-next v7 32/34] bpf: eliminate rlimit-based memory accounting infra for bpf maps

2020-11-20 Thread Alexei Starovoitov

On Thu, Nov 19, 2020 at 09:37:52AM -0800, Roman Gushchin wrote:
>  static void bpf_map_put_uref(struct bpf_map *map)
> @@ -619,7 +562,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
>  "value_size:\t%u\n"
>  "max_entries:\t%u\n"
>  "map_flags:\t%#x\n"
> -"memlock:\t%llu\n"
> +"memlock:\t%llu\n" /* deprecated */
>  "map_id:\t%u\n"
>  "frozen:\t%u\n",
>  map->map_type,
> @@ -627,7 +570,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
>  map->value_size,
>  map->max_entries,
>  map->map_flags,
> -map->memory.pages * 1ULL << PAGE_SHIFT,
> +0LLU,

The set looks great to me overall, but above change is problematic.
There are tools out there that read this value.
Returning zero might cause oncall alarms to trigger.
I think we can be more accurate here.
Instead of zero the kernel can return
round_up(max_entries * round_up(key_size + value_size, 8), PAGE_SIZE)
It's not the same as before, but at least the numbers won't suddenly
go to zero and comparison between maps is still relevant.
Of course we can introduce a page size calculating callback per map type,
but imo that would be overkill. These monitoring tools don't care about
precise number, but rather about relative value and growth from one
version of the application to another.

If Daniel doesn't find other issues this can be fixed in the follow up.
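
The estimate suggested above is simple arithmetic, and a standalone sketch
may help when checking what a monitoring tool would see. The function names
below are made up, the page size is passed in as a parameter instead of
using the kernel's PAGE_SIZE, and this is only the formula from this mail,
not what any follow-up patch necessarily implements.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align must be non-zero). */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	assert(align != 0);
	return (x + align - 1) / align * align;
}

/*
 * round_up(max_entries * round_up(key_size + value_size, 8), PAGE_SIZE),
 * computed in 64 bits so that large maps do not overflow 32-bit math.
 */
static uint64_t estimate_map_memlock(uint32_t key_size, uint32_t value_size,
				     uint32_t max_entries, uint64_t page_size)
{
	uint64_t entry = round_up_u64((uint64_t)key_size + value_size, 8);

	return round_up_u64((uint64_t)max_entries * entry, page_size);
}
```

For example, with 4-byte keys, 8-byte values, 1024 entries and 4096-byte
pages, each entry rounds up to 16 bytes and the estimate is 16384 bytes. A
single-entry map with 4-byte key and value still reports one full page.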

Re: [linux-sunxi] Re: [PATCH 3/3] arm64: allwinner: dts: a64: add DT for PineTab developer sample

2020-11-20 Thread Samuel Holland

Maxime,

On 11/20/20 5:30 PM, Icenowy Zheng wrote:
>>> +/ {
>>> +   model = "PineTab Developer Sample";
>>> +   compatible = "pine64,pinetab-dev", "allwinner,sun50i-a64";
>>> +};
>>
>> Changing the DT and the compatible half-way through it isn't ok. Please
>> add a new DT with the newer revision like we did for the pinephone
>
> We did this on Pine H64.

 What are you referring to? I couldn't find a commit where we did what
 you suggested in that patch to the pine H64.
>>>
>>> Oh the situation is complex. On Pine H64, we didn't specify anything at
>>> start (which is the same here), the DT is originally version-neutral
>>> but then transitioned to model B, then reverted to model A. Here the DT
>>> is always for the sample.
>>>
>>> However, for Pine H64 there are model A/B names, but for PineTab no
>>> samples are sold; thus, except for those who got the samples, all PineTab
>>> owners simply own a "PineTab", not a "PineTab xxx version".
>>
>> It's fairly simple really, we can't really predict the future, so any DT
>> submitted is for the current version of whatever board there is. This is

I don't think that was the intention at all. The DT was submitted for a
future product, whatever that future product ends up being at the time
of its release. Since there are necessarily no users until the product
ships, there is no chance of breaking users by modifying the DT.

>> what we (somewhat messily) did for the PineH64, for the pinephone, or
>> really any other board that has several revisions

Surely a non-public prototype doesn't count as a separate revision! This
sort of policy strongly discourages ever shipping a board with
out-of-the-box mainline Linux support, because if any hardware bugs are
fixed between initial upstreaming and production, the manufacturer
must come up with a new DT name.

This is hostile to the users as well, because the "canonical" DT name no
longer matches the "canonical" (read: the only one ever available)
version of the hardware.

Do you want manufacturers to submit their initial board DT as
"$BOARD-prototype.dts", just in case they have to make a change before
production? And only after the board is shipped (with out-of-tree
patches, of course, to use $BOARD.dts, since the shipped board is *not*
the prototype) submit a "$BOARD.dts" to mainline?

Maxime, can you clarify specifically what the line is where a device
tree is "locked down" and further changes to the hardware require a new
name? First sample leaves the factory? $NUMBER units produced? First
sold to the public for money?

Without some guidance, or a change in policy, this problem is going to
keep coming up again and again.

You'll note that so far it has mostly affected Pine devices, and I don't
think that's because they make more board revisions than other
manufacturers. It's because they're actively involved in getting their
boards supported upstream. For other manufacturers, it's some user
sending in a device tree months after the hardware ships to the public
-- of course the hardware is more stable at that point. I think Pine's
behavior is something we want to encourage, not penalize.

> Okay. But I'm not satisfied with a non-public sample occupying
> the pinetab name. Is renaming it to pinetab-dev and adding a
> pinetab-retail okay?

To me, naming the production version anything but "pinetab" isn't
satisfying either.

Samuel
